Python in your filesystem: creating userspace filesystems with Python & Fuse

Author: Roger Durán
Bio: Linux user, pythonista & sociopath
Email: roger@elvex.org.ar
Twitter: @roger_duran
Web: http://www.fsck.com.ar

First: what is a filesystem?

A simple way to see it: a filesystem is a way to store and organize data so that accessing it and searching for it is easy. Filesystems are presented as folders and files, and are normally stored on devices such as a hard drive or a memory card.

Some known examples:

What is Fuse?


Fuse (http://fuse.sourceforge.net/) is a kernel module, just like the filesystems mentioned above, but instead of implementing a filesystem driver itself it exposes an API that lets user-space programs implement one.

But... on what operating systems can I use Fuse?

  • On most *nix operating systems, like GNU/Linux, Mac OS X, *BSD...
  • There is a Fuse implementation for Windows called "Dokan". I can't comment on it, as I have never tested it.

Some examples of Fuse's uses:

  • In Python:
      - fusql
      - Gmailfs
      - Youtubefs
      - Cuevanafs
      - FatFuse
  • Other languages:
      - gnomeVFS2
      - zfs-fuse
      - sshfs
      - etc.

As you can see, we can create anything from a filesystem mapping internet/network resources to traditional filesystems like zfs (very popular in Solaris) or FAT.

Fusql is an unusual filesystem, and an interesting one: it maps a relational database (SQLite) as if it were a filesystem, and allows the full set of operations on it.
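To give a feel for what mapping a database onto paths can mean, here is a small sketch using the standard sqlite3 module. The layout (one directory per table, one entry per row id) and the list_path helper are assumptions made for illustration; fusql's real layout may well differ.

```python
import sqlite3


def list_path(conn, path):
    """Hypothetical readdir: '/' lists tables, '/<table>' lists row ids."""
    parts = [p for p in path.split('/') if p]
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    if not parts:
        return tables
    table = parts[0]
    if len(parts) == 1 and table in tables:
        # The table name was checked against sqlite_master above,
        # so interpolating it into the query is safe here.
        rows = conn.execute('SELECT rowid FROM "%s" ORDER BY rowid' % table)
        return [str(row[0]) for row in rows]
    raise KeyError(path)


# A throwaway in-memory database with one table and two rows.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE songs (title TEXT)')
conn.executemany('INSERT INTO songs (title) VALUES (?)',
                 [('first',), ('second',)])

root_entries = list_path(conn, '/')        # the tables
song_entries = list_path(conn, '/songs')   # the row ids of one table
print(root_entries, song_entries)
```

In a real filesystem a function like this would sit behind the readdir callback, with similar helpers behind read and write.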

Developing a filesystem with Fuse has advantages like:

  • We can use our favorite language (Python in this case)
  • By just restarting the application we can start testing a new version
  • We can use system libraries to create it (like stdlib in python)
  • We will not have to deal with kernel panics, reboots, virtual machine usage for testing, etc.
  • Better portability thanks to Fuse being present in different OSs
  • We can run our filesystems as any user
  • Easier debugging

Fuse: API

Fuse's API works with callbacks: Fuse calls methods of our application whenever something happens in the filesystem. For example, when we access a directory, it will call getattr, opendir, readdir and releasedir.

create(path, mode) # file creation
truncate(path, length) # make a file bigger or smaller
open(path, flags) # file opening
write(path, data, offset) # file writing
read(path, length, offset) # file reading
release(path) # releasing a file
fsync(path) # syncing a file
chmod(path, mode) # changing permissions
chown(path, uid, gid) # changing the owner
mkdir(path, mode) # directory creation
unlink(path) # removal of a file/link
rmdir(path) # removal of a folder
rename(opath, npath) # renaming
link(srcpath, dstpath) # link creation
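To make the callback flow concrete, here is a minimal sketch that does not talk to Fuse at all. A hypothetical LoggingFS class records each callback as it is invoked, and a hypothetical list_directory helper plays the role of Fuse by calling them in the order mentioned above. The names and the simplified signatures are assumptions made for illustration.

```python
class LoggingFS(object):
    """Toy stand-in for a filesystem object; it only records callbacks."""

    def __init__(self, entries):
        self.entries = entries  # names reported for every directory
        self.calls = []         # callback names, in the order they ran

    def getattr(self, path):
        self.calls.append('getattr')
        return {'path': path, 'is_dir': True}

    def opendir(self, path):
        self.calls.append('opendir')

    def readdir(self, path):
        self.calls.append('readdir')
        for name in self.entries:
            yield name

    def releasedir(self, path):
        self.calls.append('releasedir')


def list_directory(fs, path):
    """Hypothetical driver: calls the callbacks in the order described."""
    fs.getattr(path)
    fs.opendir(path)
    names = list(fs.readdir(path))
    fs.releasedir(path)
    return names


fs = LoggingFS(['a.txt', 'b.txt'])
names = list_directory(fs, '/')
print(names)
print(fs.calls)
```

With a real Fuse binding we would only write the class; the calls would come from the kernel whenever a program lists the mounted directory.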

How it is used

This is a minimal example of file reading and writing. Let's suppose that the methods belong to an object that has a dictionary called items, with the path as key and the data as value.

# reading
def read(self, path, length, offset):
    # we determine the beginning of our reading
    start = offset

    # we determine the end of the reading
    end = start + length

    # we return the amount of data requested
    return self.items[path][start:end]

# writing
def write(self, path, data, offset):
    # the size of data to write
    length = len(data)

    # current data of our file
    item_data = self.items[path]

    # add/replace the file portion requested
    item_data = item_data[:offset] + data + item_data[offset+length:]

    # replace the items data
    self.items[path] = item_data

    # return the amount of data written
    return length

# truncate
def truncate(self, path, length):
    # we take the data of the file
    item_data = self.items[path]

    if len(item_data) > length:
        # if the size of our file is greater than the size requested
        # we make it shorter
        self.items[path] = item_data[:length]
    else:
        # if not, we pad the file with null bytes up to the requested size
        self.items[path] += '\0' * (length - len(item_data))
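These three methods can be tried without mounting anything. The sketch below wraps them in a hypothetical MemoryFiles class (the class name is an assumption; items is the dictionary described above). It assumes the argument orders (path, length, offset) for read and (path, data, offset) for write, works on byte strings, and pads with null bytes when a file grows.

```python
class MemoryFiles(object):
    """Hypothetical in-memory store: maps each path to its data."""

    def __init__(self):
        self.items = {}

    def read(self, path, length, offset):
        # Slice out at most `length` bytes starting at `offset`.
        return self.items[path][offset:offset + length]

    def write(self, path, data, offset):
        old = self.items.get(path, b'')
        # Keep what comes before and after the written region.
        self.items[path] = old[:offset] + data + old[offset + len(data):]
        return len(data)

    def truncate(self, path, length):
        data = self.items[path]
        if len(data) > length:
            # Shrink the file.
            self.items[path] = data[:length]
        else:
            # Grow the file, padding with null bytes.
            self.items[path] = data + b'\0' * (length - len(data))


files = MemoryFiles()
files.write('/hello.txt', b'hello world', 0)
files.write('/hello.txt', b'HELLO', 0)       # overwrite the first 5 bytes
files.truncate('/hello.txt', 8)
short = files.read('/hello.txt', 100, 0)
files.truncate('/hello.txt', 10)
padded = files.read('/hello.txt', 100, 0)
print(short, padded)
```

Note that asking read for more data than the file holds simply returns what is available, which is the behaviour a caller reading until end of file relies on.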

Defuse

One of the things I found uncomfortable while working with python-fuse was path management. Coming from the web world (especially werkzeug/flask), I thought of implementing similar route management, but for writing a filesystem. That's how defuse was born: https://github.com/Roger/defuse.

It provides a way to handle routes with decorators, splitting each part of our filesystem into a class with all the methods provided by Fuse.

A little example

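import fuse
from stat import (S_IRUSR, S_IWUSR, S_IXUSR, S_IRGRP, S_IXGRP,
                  S_IROTH, S_IXOTH)
# FS and BaseMetadata are provided by defuse

fs = FS.get()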

@fs.route('/')
class Root(object):
    def __init__(self):
        root_mode = S_IRUSR|S_IXUSR|S_IWUSR|S_IRGRP|S_IXGRP|S_IXOTH|S_IROTH
        self.dir_metadata = BaseMetadata(root_mode, True)

    def getattr(self, *args):
        return self.dir_metadata

    def readdir(self, *args):
        for i in range(4):
            yield fuse.Direntry('test%s.txt' % i)


@fs.route('/<filename>.<ext>')
class Files(object):
    def __init__(self):
        file_mode = S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH
        self.file_metadata = BaseMetadata(file_mode, False)

    def getattr(self, filename, ext):
        self.file_metadata.st_size = len(filename*4)
        return self.file_metadata

    def read(self, size, offset, filename, ext):
        data = filename * 4
        return data[offset:size+offset]

In the previous example we can see a working implementation of a filesystem that has 4 files, whose contents are the name of the file (without the extension) repeated 4 times.

As you can see, path management is done through class decorators. Also, the methods no longer receive the path; instead, they receive the variables defined in the decorator.

For example, with @fs.route('/<dir1>/<dir2>/<archivo>.<ext>'), the path /root/subdir/test.py would provide the variables:

  • dir1='root'
  • dir2='subdir'
  • archivo='test'
  • ext='py'
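One way such route patterns could be matched is by translating each <name> placeholder into a named regular-expression group. The sketch below shows that general technique only; it is not defuse's actual implementation, and Router, route and resolve are hypothetical names.

```python
import re


class Router(object):
    """Hypothetical route table, loosely modelled on the decorator above."""

    def __init__(self):
        self.routes = []  # list of (compiled pattern, handler class)

    def route(self, pattern):
        # re.split with a capturing group alternates literal text (even
        # indexes) and placeholder names (odd indexes).
        parts = re.split(r'<(\w+)>', pattern)
        regex = ''
        for index, part in enumerate(parts):
            if index % 2:
                # A placeholder matches anything except a slash.
                regex += '(?P<%s>[^/]+)' % part
            else:
                regex += re.escape(part)
        compiled = re.compile('^' + regex + '$')

        def decorator(cls):
            self.routes.append((compiled, cls))
            return cls
        return decorator

    def resolve(self, path):
        # Return the first handler whose pattern matches, plus its variables.
        for compiled, cls in self.routes:
            match = compiled.match(path)
            if match:
                return cls, match.groupdict()
        return None, {}


fs = Router()


@fs.route('/<dir1>/<dir2>/<archivo>.<ext>')
class Files(object):
    pass


handler, variables = fs.resolve('/root/subdir/test.py')
print(handler.__name__, variables)
```

A dispatcher built this way would then call the handler's method with the variables as keyword arguments, which is why the methods no longer need the raw path.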

Conclusion

This article, rather than teaching everything you need to know about python-fuse, is intended to show how simple it is to use it and encourage you to write your own filesystems.

There were some interesting ideas in the latest PyDay, like automatic file translators or nltk analysis.

I hope to see your filesystems soon!


Last Change: Thu Sep 22 08:54:01 2011.  -  This magazine is under a Creative Commons license