A BLOB Filesystem built with FUSE, node.js, and fuse4js


Here’s one for the “neato” category. This weekend I decided to try making a filesystem with FUSE (Filesystem in Userspace), a framework that lets an ordinary userspace program implement a mountable filesystem backed by something other than a disk. The data can come from anywhere, and it appears as ordinary directories and files on the system where it is mounted.

So for this one, I decided to make a quick filesystem which uses an Oracle table containing BLOB data as its source. While FUSE has many language bindings (C, C++, Java, Python, Ruby, etc.), I wanted to go with something a little more trendy, so I decided on JavaScript.

Proof of emptiness:

steve@UbuntuVM:/var/www/blobFS$ cd mnt
steve@UbuntuVM:/var/www/blobFS/mnt$ ls -ltr
total 0

The BLOB table (filename, filecontent):

SQL> select * from blobtab;





Mounting the Filesystem with node:

steve@UbuntuVM:/var/www/blobFS$ node jsonFS.js mnt/
Mount point: mnt/
File system started at mnt/
To stop it, type this in another shell: fusermount -u mnt/

And there they are (wow those are old)!

steve@UbuntuVM:/var/www/blobFS$ cd mnt
steve@UbuntuVM:/var/www/blobFS/mnt$ ls -ltr
total 0
-rwxr-xr-x 0 steve steve  77584 Dec 31  1969 wow.docx
-rwxr-xr-x 0 steve steve 132528 Dec 31  1969 pip.jpg
-rwxr-xr-x 0 steve steve  57376 Dec 31  1969 40.jpg
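
A note on those dates: Dec 31 1969 is what a listing shows when a file's timestamp is zero seconds since the Unix epoch and the shell's time zone is west of UTC. The sketch below only illustrates that arithmetic. It assumes the prototype leaves the timestamps unset, which the listing suggests but does not prove, and the helper names are made up for the example.

```javascript
// Convert a Unix timestamp in seconds to a JavaScript Date (which uses milliseconds).
function fromUnixSeconds(seconds) {
  return new Date(seconds * 1000);
}

// Calendar date (YYYY-MM-DD) of a timestamp as seen at a fixed UTC offset in hours.
function calendarDateAtOffset(seconds, utcOffsetHours) {
  const shifted = new Date((seconds + utcOffsetHours * 3600) * 1000);
  return shifted.toISOString().slice(0, 10);
}

console.log(fromUnixSeconds(0).toISOString()); // 1970-01-01T00:00:00.000Z
console.log(calendarDateAtOffset(0, -5));      // 1969-12-31
console.log(calendarDateAtOffset(0, 0));       // 1970-01-01
```

Reporting a real modification time would mean carrying a timestamp column alongside each BLOB and handing it back with the file attributes.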

Now let’s make sure they are readable:

Thanks PIP-Boy!

Sweet! A few cool things about this implementation:

  • You can use commands like ‘cp’ to copy a file out of the FUSE-mounted location and onto a normal disk-based filesystem for quick BLOB offloading.
  • Filesystems can be mounted by a non-root user.
  • It uses node.js for mounting the filesystem and retrieves JSON from a PHP server-side component that queries the database, meaning this filesystem can be mounted using remote data.
  • Except for the small number of C++ changes I had to make to fuse4js, the bulk of the programming is in JavaScript, which makes it fairly easy for developers.
  • You can use any node.js modules to extend it, or even jQuery or other frameworks.
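
To give a feel for how little JavaScript the listing and reading side needs, here is a minimal sketch of the three handlers a flat, read-only filesystem relies on: getattr, readdir, and read. It is not the prototype's code. The callback shapes are loosely modeled on the fuse4js example and should be treated as assumptions, the file table is made-up sample data, and nothing here actually mounts anything.

```javascript
// In-memory file table: name -> Buffer of file content (made-up sample data).
const files = new Map([
  ['hello.txt', Buffer.from('hello from a BLOB\n')],
  ['empty.bin', Buffer.alloc(0)],
]);

const ENOENT = 2; // errno for "No such file or directory" on Linux

// getattr: report a directory for "/" and a regular file for known names.
// Assumed callback shape: cb(errorCode, stat), with 0 meaning success.
function getattr(path, cb) {
  if (path === '/') return cb(0, { size: 4096, mode: 0o40755 });
  const content = files.get(path.slice(1));
  if (!content) return cb(-ENOENT);
  cb(0, { size: content.length, mode: 0o100755 }); // -rwxr-xr-x regular file
}

// readdir: the namespace is flat, so only the root has entries.
function readdir(path, cb) {
  if (path !== '/') return cb(-ENOENT);
  cb(0, Array.from(files.keys()));
}

// read: copy up to `len` bytes starting at `offset` into `buf`, then report
// the number of bytes copied (0 at or past end of file).
// `fh` (file handle) is accepted but unused in this sketch.
function read(path, offset, len, buf, fh, cb) {
  const content = files.get(path.slice(1));
  if (!content) return cb(-ENOENT);
  if (offset >= content.length) return cb(0);
  const end = Math.min(offset + len, content.length);
  cb(content.copy(buf, 0, offset, end));
}

// Quick check without FUSE: list the root directory.
readdir('/', (err, names) => console.log(err, names));
```

Keeping the handlers as plain functions over a Map means they can be exercised from a test script before FUSE is involved at all.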

Making this madness work required a few components. The first main one was an Ubuntu Server 12.10 Quantal Quetzal VM with Apache 2, PHP5 (with OCI8), and Oracle XE installed. I also installed node.js for server-side JavaScript (downloading it from the site seemed a little easier to work with than getting it from Ubuntu’s apt repo). Node.js is an awesome modular runtime for building event-driven, scalable servers in pure JavaScript. Its package tool, npm, downloads new modules (and compiles the native ones) for anything from HTTP client requests and jQuery for easy AJAX calls to filesystem modification, socket management, and tons more. In my case, I used fuse4js, a GitHub project from VMware Labs, which I was able to modify and expand a bit.

Overall Components:

  • Oracle XE
  • PHP5 w/ OCI8
    • PHP script called blobFSServer.php – queries the BLOB table in Oracle and emits a base64-encoded JSON array with files and content
  • node.js with fuse4js and request modules installed via npm
    • Modified the fuse4js code a bit for arguments
    • Expanded the jsonFS.js example to pull the BLOB-derived JSON from the PHP server over HTTP, with modified arguments and some basic parsing code
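
The parsing step on the node.js side can be sketched roughly as follows. The payload shape is an assumption: an array of rows whose field names mirror the table columns (filename, filecontent), with the content base64-encoded because JSON cannot carry raw bytes. The function name parseBlobPayload is hypothetical, and the sample payload is built locally so the sketch runs without a server.

```javascript
// Turn the JSON text from the server into a Map of filename -> Buffer.
// Assumed row shape: { filename: string, filecontent: base64 string }.
function parseBlobPayload(jsonText) {
  const rows = JSON.parse(jsonText);
  if (!Array.isArray(rows)) {
    throw new TypeError('expected a JSON array of file rows');
  }
  const files = new Map();
  for (const row of rows) {
    if (!row || typeof row.filename !== 'string' || typeof row.filecontent !== 'string') {
      throw new TypeError('each row needs string "filename" and "filecontent" fields');
    }
    // Decode the base64 text back into the original bytes.
    files.set(row.filename, Buffer.from(row.filecontent, 'base64'));
  }
  return files;
}

// Sample payload standing in for the HTTP response body.
const sample = JSON.stringify([
  { filename: 'note.txt', filecontent: Buffer.from('BLOB bytes').toString('base64') },
]);
const table = parseBlobPayload(sample);
console.log(table.get('note.txt').toString()); // BLOB bytes
```

In the mounted filesystem the JSON text would come from an HTTP GET against blobFSServer.php rather than from a local string.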

Now that the basics are done (mounting, listing, reading), I plan to add the ability to cp/mv a binary file into the FUSE mount so that the BLOB data is written into the database. I would also like to separate the JSON used for listing/getattr (just filenames and attributes) from the actual file content (BLOB data), since returning one huge JSON array with every file's attributes and content is just not going to work with bigger data or more rows.

One caveat of FUSE is that it has strict user control at the OS level by default: only the user who mounted the filesystem can access it. While I can set the user_allow_other option in /etc/fuse.conf, I still have to find where in fuse4js I can pass the allow_other mount option so Apache doesn’t have to run as the same user that mounted the filesystem.

Once all of that is wrapped up, I would like to incorporate subdirectories, better file management, and actual generic code (there’s a lot of hard coding in my prototype thus far). If people are interested, I’d consider making a GitHub project or something for this.
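
The planned split between listing data and file content could look something like the sketch below: metadata is fetched on its own, and a file's bytes are requested only on first read and then cached. None of this exists in the prototype. createLazyStore and both fetcher callbacks are hypothetical names, and the fetchers here are backed by local sample data rather than HTTP requests.

```javascript
// Lazy BLOB store: list metadata up front, fetch content only when read.
// fetchListing(cb) and fetchContent(name, cb) are injected stand-ins for
// two separate server requests (assumed, not part of the prototype).
function createLazyStore(fetchListing, fetchContent) {
  const cache = new Map(); // filename -> Buffer, filled on first read

  return {
    // Metadata only: [{ filename, size }], no BLOB data transferred.
    list(cb) {
      fetchListing(cb);
    },
    // Content on demand, cached after the first successful fetch.
    read(filename, cb) {
      if (cache.has(filename)) return cb(null, cache.get(filename));
      fetchContent(filename, (err, base64) => {
        if (err) return cb(err);
        const content = Buffer.from(base64, 'base64');
        cache.set(filename, content);
        cb(null, content);
      });
    },
  };
}

// Stand-in fetchers backed by local sample data instead of HTTP.
const blobs = { 'a.txt': Buffer.from('alpha') };
let contentRequests = 0;
const store = createLazyStore(
  (cb) => cb(null, Object.keys(blobs).map((name) => ({ filename: name, size: blobs[name].length }))),
  (name, cb) => {
    contentRequests += 1;
    if (!(name in blobs)) return cb(new Error('no such file: ' + name));
    cb(null, blobs[name].toString('base64'));
  }
);

store.list((err, listing) => console.log(listing));
```

The cache in this sketch never evicts anything, so a real version would need a size limit before it met a table full of large BLOBs.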

All in all, FUSE is a really awesome piece of kit. The filesystem is low-level enough that it plays a part in everything a developer, DBA, sysadmin, etc. does, and being able to source that component from high-level hierarchies/content stores (databases, web pages, big data, etc.) has fascinating possibilities. From an Oracle database, a few ideas (other than BLOBs) might be XML files of table data generated from DBMS_XMLGEN (particularly if it queries every time the files are read), a directory/file structure built from the data dictionary, or a /proc-like filesystem with low-level data pulled from V$ and X$ views.
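
As a rough illustration of the data dictionary idea, the sketch below folds flat rows into a nested tree of owner/type/name, with directories as objects and files as strings (a shape chosen for this sketch). The rows are made-up sample data with fields named after the OWNER, OBJECT_TYPE, and OBJECT_NAME columns of ALL_OBJECTS, and buildTree and lookup are hypothetical helpers.

```javascript
// Build a nested directory tree from flat rows.
// Directories are prototype-less objects; files are strings (their content).
function buildTree(rows) {
  const root = Object.create(null);
  for (const { owner, objectType, objectName } of rows) {
    const ownerDir = (root[owner] = root[owner] || Object.create(null));
    const typeDir = (ownerDir[objectType] = ownerDir[objectType] || Object.create(null));
    typeDir[objectName] = `${owner}.${objectName} (${objectType})\n`;
  }
  return root;
}

// Resolve a path such as "/HR/TABLE" to a node, or undefined if absent.
function lookup(tree, path) {
  let node = tree;
  for (const part of path.split('/').filter(Boolean)) {
    const isDir = node !== null && typeof node === 'object';
    if (!isDir || !Object.prototype.hasOwnProperty.call(node, part)) return undefined;
    node = node[part];
  }
  return node;
}

// Made-up sample rows standing in for a data dictionary query result.
const sampleRows = [
  { owner: 'HR', objectType: 'TABLE', objectName: 'EMPLOYEES' },
  { owner: 'HR', objectType: 'TABLE', objectName: 'DEPARTMENTS' },
  { owner: 'HR', objectType: 'VIEW', objectName: 'EMP_DETAILS_VIEW' },
];
const tree = buildTree(sampleRows);
console.log(Object.keys(lookup(tree, '/HR/TABLE')));
```

A getattr handler could then call lookup and report a directory when the result is an object and a regular file when it is a string.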


One comment

  1. Hi Steve,

    This is really cool stuff! One could easily work out application-specific in-line optimizations between the file system and the “backing store” which is in this case an Oracle table.

    This is actually an awful lot of what DBFS is. If you’ll allow, I’d like to offer a link for your readers to some writings on Database File Systems (DBFS) only because it relates to your post: http://kevinclosson.wordpress.com/kevin-closson-index/cfs-nfs-asm-topics/
