Friday, August 17, 2012

BDB???

I will readily admit that I am not a database expert, but I have had a fair amount of experience using various database engines: SQL Server, PostgreSQL, SQLite, MySQL, VistaDB, etc.  But when I started looking into BDB for the RavenDB storage engine, I got a little overwhelmed.  It all looked like Greek to me.  There's no fancy management studio, there's no SQL, there's hardly anything.  You are basically given a put and a get that store and look up byte arrays in a Btree.  Everything else is up to you.  It's like comparing assembly language to C#.
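To make "a put and a get over byte arrays" concrete, here is a minimal in-memory sketch.  It does not touch BDB at all: ByteStore and ByteArrayComparer are names I made up for illustration, and a SortedDictionary stands in for the Btree.  The comparer orders keys byte by byte with shorter keys first, which is how BDB sorts Btree keys when no custom comparison function is supplied.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Orders byte arrays lexicographically; a shorter key sorts before a longer
// key that starts with the same bytes.
public sealed class ByteArrayComparer : IComparer<byte[]>
{
    public int Compare(byte[] x, byte[] y)
    {
        int shared = Math.Min(x.Length, y.Length);
        for (int i = 0; i < shared; i++)
        {
            int c = x[i].CompareTo(y[i]);
            if (c != 0) return c;
        }
        return x.Length.CompareTo(y.Length);
    }
}

// Hypothetical stand-in for a key/value tree: nothing but Put and Get.
public sealed class ByteStore
{
    private readonly SortedDictionary<byte[], byte[]> tree =
        new SortedDictionary<byte[], byte[]>(new ByteArrayComparer());

    // Inserts the value, or overwrites it if the key already exists.
    public void Put(byte[] key, byte[] value)
    {
        tree[key] = value;
    }

    // Returns the stored bytes, or null when the key is absent.
    public byte[] Get(byte[] key)
    {
        byte[] value;
        return tree.TryGetValue(key, out value) ? value : null;
    }
}
```

Anything richer than this (turning a document into bytes, choosing key formats, secondary lookups) is the caller's job, for example store.Put(Encoding.UTF8.GetBytes("docs/1"), Encoding.UTF8.GetBytes(json)).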

Interfacing with BDB from C# is a daunting task.  BDB is not set up like a normal DLL where we can just import a bunch of simple C functions.  It runs its API through a table of function pointers stored in C structs.  This poses a problem for C# imports in that we have to read the structure to get each of the function pointers and then interop to those.  Luckily there is already a project dedicated to a C#/BDB interface:  Berkeley DB for .NET
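A rough sketch of what calling through a function pointer table looks like from C# follows.  This is not BDB's real layout: NativeTable, AddDelegate and the single Add slot are made up for illustration, and the "native" side is faked with a managed delegate so that the example runs without any DLL.  The Marshal calls are the standard .NET ones a wrapper would lean on.

```csharp
using System;
using System.Runtime.InteropServices;

public static class FunctionTableDemo
{
    // Hypothetical signature for one slot in the table.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate int AddDelegate(int a, int b);

    // Mirrors a C struct whose fields are function pointers.
    [StructLayout(LayoutKind.Sequential)]
    public struct NativeTable
    {
        public IntPtr Add;
    }

    public static int CallThroughTable(int a, int b)
    {
        // Stand-in for the native library: fill a table in unmanaged memory.
        AddDelegate implementation = (x, y) => x + y;
        var table = new NativeTable
        {
            Add = Marshal.GetFunctionPointerForDelegate(implementation)
        };
        IntPtr handle = Marshal.AllocHGlobal(Marshal.SizeOf<NativeTable>());
        try
        {
            Marshal.StructureToPtr(table, handle, false);

            // What a wrapper has to do: read the struct behind the handle,
            // then turn the raw pointer in the slot into a callable delegate.
            NativeTable read = Marshal.PtrToStructure<NativeTable>(handle);
            AddDelegate add =
                Marshal.GetDelegateForFunctionPointer<AddDelegate>(read.Add);
            int result = add(a, b);

            // Keep the fake implementation alive until the call has finished.
            GC.KeepAlive(implementation);
            return result;
        }
        finally
        {
            Marshal.FreeHGlobal(handle);
        }
    }
}
```

A real wrapper has to repeat this for every method on every handle type and keep the struct layout in sync with the native headers, which is why having an existing library do it is such a relief.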

Let's create a data file

The first thing I set out to do was to create a BDB data file.  From the Raven perspective, the first interesting activity of a storage engine is TransactionalStorage.Initialize.  This is where Raven opens the "database" and creates the schema if there isn't one.

Everything in BDB starts with an environment.  Environments help with many of the BDB operations, but the main one that we are concerned with is transactions.  Setting up an environment is fairly simple.  We just need to pass the correct options when opening it.

env = new Env(EnvCreateFlags.None);
env.ErrorStream = new MemoryStream();
env.Open(path, Env.OpenFlags.Create | Env.OpenFlags.InitTxn | Env.OpenFlags.InitLock | Env.OpenFlags.InitLog
 | Env.OpenFlags.InitMPool | Env.OpenFlags.ThreadSafe | Env.OpenFlags.Recover, 0);

The options are basically: create the environment if it is not there, initialize the transaction and locking subsystems, turn on log files, enable the BDB memory buffer pool, make the environment thread-safe, and recover from the logs if we had a bad previous shutdown.

I am going to start with the document "table" setup, since the first operation that I want to accomplish with the new storage engine is storing one document.  In order to create the document table, we need to open up a BDB file.  A BDB data file is just a container for the Btree.  One physical file can have multiple virtual trees in it.

this.dataTableFile = env.CreateDatabase(DbCreateFlags.None);
this.dataTable = (DbBTree)dataTableFile.Open(null, "documents.db", "data", DbType.BTree, Db.OpenFlags.Create
 | Db.OpenFlags.ThreadSafe | Db.OpenFlags.AutoCommit, 0);

First we create the database interface (the CreateDatabase method is poorly named, since it does not create anything on disk).  Then we actually open the database file.  The file name is documents.db and it has one section in it called data.  It's a btree type file, and we want to create it, make it thread-safe and set it up to auto-commit any operation that doesn't have an explicit transaction (although we will always have a transaction).

Running this code, with the proper Dispose calls thrown in throughout, will generate a file in your data directory called documents.db (and a bunch of other files for the database log and transaction management).  We can use one of the BDB command-line utilities to take a peek into the file: db_dump Data\documents.db.

VERSION=3
format=bytevalue
database=data
type=btree
db_pagesize=8192
HEADER=END
DATA=END

We have a btree file with 8K pages and no data.  Hey, at least it's a start.
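If you want to check that output from code rather than by eye, the header is just key=value lines up to HEADER=END.  Here is a small sketch that reads it into a dictionary; DumpHeader is a name I made up, and it assumes you have already captured the db_dump output as lines of text.

```csharp
using System;
using System.Collections.Generic;

public static class DumpHeader
{
    // Collects the key=value pairs that appear before the HEADER=END marker.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        var header = new Dictionary<string, string>();
        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line == "HEADER=END") break;   // the data section starts here

            int eq = line.IndexOf('=');
            if (eq > 0)
            {
                header[line.Substring(0, eq)] = line.Substring(eq + 1);
            }
        }
        return header;
    }
}
```

Feeding it the dump shown earlier gives "btree" for the type key and "8192" for db_pagesize; the DATA=END line is never reached because parsing stops at the end of the header.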
