RavenDB operations practices – Part 1

At The Network, we use RavenDB, a document database written in C#. You can think of it as being similar to MongoDB, but with an HTTP REST API. I’m not a developer at The Network, but I am responsible for operations of the GRC Suite, and this includes RavenDB. What follows is a collection of my notes and experience running RavenDB.

Running under IIS

We run RavenDB as an IIS application. It simplifies management and provides great logs and performance data.

Placement of RavenDB data

Do not keep RavenDB data under the website directory.

The default location of RavenDB data is ~\Databases, which places the data under the website. The problem is that when IIS detects a large number of changed files under the site (such as when you are updating indexes), it recycles the application pool, and that unloads all the databases. For this reason, we keep all our data, including the system database, outside the application directory.
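A quick way to catch this mistake is to check whether a configured data path falls under the site root. The sketch below is a minimal illustration in Python; the function name and both example paths are hypothetical, and it compares paths textually without resolving `..` or relative segments.

```python
from pathlib import PureWindowsPath


def is_under(site_root, data_dir):
    """Return True if data_dir is the site root or lives anywhere below it.

    PureWindowsPath compares case-insensitively, which matches Windows paths.
    """
    site = PureWindowsPath(site_root)
    data = PureWindowsPath(data_dir)
    return site == data or site in data.parents


# Hypothetical paths, for illustration only.
print(is_under(r"C:\inetpub\RavenDB", r"C:\inetpub\RavenDB\Databases"))  # True
print(is_under(r"C:\inetpub\RavenDB", r"D:\RavenData\Databases"))        # False
```

A check like this could run as part of a deployment script, before the site is started.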

Application pool settings

Do not let anything recycle the Application Pool

This means:

  • Disable automatic AppPool recycling (by time and memory usage)
  • Disable the WAS ping
  • Disable the idle timeout

If any of these triggers an AppPool recycle, all your databases will unload.

Do not enable more than one worker process

A single worker process is the default setting, and it must not be changed. If there is more than one worker process, the RavenDB instances will fight for access to the ESENT database.

Disable overlapping recycle

Overlapping recycle is a great feature for websites, since it lets the new process start handling requests while the old process is still completing existing ones. For RavenDB it’s a bad thing, for the same reason as enabling more than one worker process. You want to avoid an AppPool recycle altogether, but if one happens, you don’t want the old and new processes to overlap.

Disable the shutdown time limit

Or at least increase it from the default of 90 seconds. This setting tells IIS to kill a worker process that hasn’t finished shutting down before the time limit expires. During shutdown, Raven cleanly stops and unloads its databases. If the process is killed partway through, starting a DB will require recovery (done automatically), which slows down the startup process.
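The settings in this section correspond to attributes on the application pool entry in applicationHost.config. The following is a minimal sketch, in Python, of auditing one pool definition against the advice above. The pool name "RavenDB", the sample XML, and the function names are hypothetical, and the fallback values are the IIS defaults as assumed here, so check them against your IIS version. It ignores scheduled recycle times and request-count recycling.

```python
import xml.etree.ElementTree as ET

# Hypothetical pool definition, shaped like an <add> entry under
# <applicationPools> in applicationHost.config.
POOL_XML = """
<add name="RavenDB">
  <processModel idleTimeout="00:00:00" pingingEnabled="false"
                maxProcesses="1" shutdownTimeLimit="00:10:00" />
  <recycling disallowOverlappingRotation="true">
    <periodicRestart time="00:00:00" memory="0" privateMemory="0" />
  </recycling>
</add>
"""


def seconds(span):
    """Convert an IIS time span ("hh:mm:ss" or "d.hh:mm:ss") to seconds."""
    days = 0
    if "." in span:
        day_part, span = span.split(".", 1)
        days = int(day_part)
    hours, minutes, secs = (int(part) for part in span.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + secs


def attrs(pool, path):
    """Attributes of a child element, or an empty dict if it is absent."""
    node = pool.find(path)
    return node.attrib if node is not None else {}


def audit(pool_xml):
    """Return a list of settings that contradict the advice above."""
    pool = ET.fromstring(pool_xml)
    pm = attrs(pool, "processModel")
    rec = attrs(pool, "recycling")
    pr = attrs(pool, "recycling/periodicRestart")
    problems = []
    # Fallback values are assumed IIS defaults, used when an attribute is absent.
    if seconds(pr.get("time", "1.05:00:00")) != 0:
        problems.append("time-based recycling is enabled")
    if int(pr.get("memory", "0")) or int(pr.get("privateMemory", "0")):
        problems.append("memory-based recycling is enabled")
    if pm.get("pingingEnabled", "true").lower() != "false":
        problems.append("WAS ping is enabled")
    if seconds(pm.get("idleTimeout", "00:20:00")) != 0:
        problems.append("idle timeout is enabled")
    if int(pm.get("maxProcesses", "1")) != 1:
        problems.append("more than one worker process")
    if rec.get("disallowOverlappingRotation", "false").lower() != "true":
        problems.append("overlapping recycle is allowed")
    if seconds(pm.get("shutdownTimeLimit", "00:01:30")) <= 90:
        problems.append("shutdown time limit is 90 seconds or less")
    return problems


print(audit(POOL_XML))            # []
print(audit('<add name="x" />'))  # lists every default that needs changing
```

An empty list means the pool matches every recommendation in this section. A pool left entirely at the assumed defaults fails five of the checks, because only the single worker process and the memory limits are already safe out of the box.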

Backing up RavenDB

RavenDB includes a number of backup options, including an internal backup scheduler and an external tool called Smuggler. We have hundreds of databases and needed to take backups every two hours, so we decided to use our SAN to take snapshots.

RavenDB is backed by ESENT, which has ACID transactions and robust crash recovery. A snapshot taken while the database is running can capture the data in an inconsistent state, so after restoring one, the ESENT utility (esentutl) is used to clean up the DB. Three things must be done, in this order:

  1. recover – which uses logs to return the DB to a clean shutdown state.
  2. repair – which cleans up corruption. Running repair before recover will result in data loss.
  3. defrag – compacts the database and also repairs indexes

The corresponding commands, in the same order:

C:\Windows\system32\esentutl.exe /r RVN /l logs /s system /i
C:\Windows\system32\esentutl.exe /p /o Data
C:\Windows\system32\esentutl.exe /d Data
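Since the order matters, it helps to script the three commands so that a failed step stops the sequence. This is a minimal sketch in Python that reuses the commands above verbatim. It assumes the relative paths (logs, system, Data) resolve from the database directory, and that esentutl reports failure through a non-zero exit code. The function name, the `run` parameter, and the example path are hypothetical.

```python
import subprocess

ESENTUTL = r"C:\Windows\system32\esentutl.exe"

# The three commands from above, in the order they must run.
STEPS = [
    ("recover", [ESENTUTL, "/r", "RVN", "/l", "logs", "/s", "system", "/i"]),
    ("repair", [ESENTUTL, "/p", "/o", "Data"]),
    ("defrag", [ESENTUTL, "/d", "Data"]),
]


def restore(db_dir, run=subprocess.call):
    """Run each step inside db_dir; stop at the first non-zero exit code.

    `run` is injectable so the sequence can be dry-run without esentutl.
    """
    done = []
    for name, cmd in STEPS:
        code = run(cmd, cwd=db_dir)
        if code != 0:
            # Never continue to repair or defrag after a failed earlier step.
            raise RuntimeError(f"{name} failed with exit code {code}")
        done.append(name)
    return done


# Dry run: print each command instead of executing it (hypothetical path).
restore(r"D:\RavenData\Databases\Example",
        run=lambda cmd, cwd: print(cwd, cmd) or 0)
```

Calling `restore` with the default `run` executes the real commands, so try the dry run first against a copy of a snapshot.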

Part 2

I’m currently working on clustering, sharding and authentication for RavenDB. I’ll post a part 2 when those are figured out.