Python in your filesystem: creating userspace filesystems with Python & Fuse
First: what is a filesystem?
A simple way to see it is as a way to store data and organize it so that accessing it and searching for it is easy. Filesystems are presented as folders and files, normally stored on devices such as a hard drive or memory.
- Some known examples:
What is Fuse?
Fuse (http://fuse.sourceforge.net/) is a kernel module, just like the drivers for the filesystems mentioned above, but instead of implementing a filesystem itself it exposes an API to user space.
But... on what operating systems can I use Fuse?
- On most *nix operating systems, like GNU/Linux, macOS, the BSDs...
- There is a Windows implementation of Fuse called "Dokan"; I can't comment on it as I have never tested it.
Some examples of Fuse's uses:
- In Python:
  - fusql
  - Gmailfs
  - Youtubefs
  - Cuevanafs
  - FatFuse
- Other languages:
  - gnomeVFS2
  - zfs-fuse
  - sshfs
  - etc.
As you can see, we can create anything from a filesystem mapping internet/network resources to traditional filesystems like zfs (very popular in Solaris) or FAT.
Fusql is an unusual and interesting filesystem: it maps a relational database (SQLite) as if it were a filesystem and allows complete operations on it.
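To give a feel for what mapping a database onto a filesystem could mean, here is a small sketch using only sqlite3 from the standard library. It is not fusql's code and does not claim to follow its layout: the layout (one directory per table, one entry per row) and the names list_root and list_table are assumptions made for illustration.

```python
import sqlite3


def list_root(conn):
    """Directory listing for '/': one entry per table (assumed layout)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [name for (name,) in rows]


def list_table(conn, table):
    """Directory listing for '/<table>': one entry per row, named by rowid."""
    if table not in list_root(conn):
        raise KeyError(table)
    # The name was checked against the schema above; a fuller version
    # would also escape any quotes inside the table name.
    rows = conn.execute('SELECT rowid FROM "%s" ORDER BY rowid' % table)
    return [str(rowid) for (rowid,) in rows]


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE songs (title TEXT)')
conn.execute('CREATE TABLE artists (name TEXT)')
conn.executemany('INSERT INTO songs (title) VALUES (?)',
                 [('first',), ('second',)])

print(list_root(conn))            # ['artists', 'songs']
print(list_table(conn, 'songs'))  # ['1', '2']
```

A readdir callback for such a filesystem would only have to yield these names as directory entries.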
Developing a filesystem with Fuse has advantages like:
- We can use our favorite language (Python in this case)
- We can test a new version just by restarting the application
- We can use system libraries to create it (like Python's standard library)
- We will not have to deal with kernel panics, reboots, virtual machine usage for testing, etc.
- Better portability thanks to Fuse being present in different OSs
- We can run our filesystems as any user
- Easier debugging
Fuse: API
Fuse's API works with callbacks. For example, when we access a directory, the application will call getattr, opendir, readdir, releasedir.
    create(path, mode)          # file creation
    truncate(path, length)      # make a file bigger or smaller
    open(path, mode)            # file opening
    write(path, data, offset)   # file writing
    read(path, length, offset)  # file reading
    release(path)               # releasing a file
    fsync(path)                 # syncing a file
    chmod(path, mode)           # changing permissions
    chown(path, uid, gid)       # changing the owner
    mkdir(path, mode)           # directory creation
    unlink(path)                # removal of a file/link
    rmdir(path)                 # removal of a folder
    rename(opath, npath)        # renaming
    link(srcpath, dstpath)      # link creation
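The directory-access sequence mentioned above (getattr, opendir, readdir, releasedir) can be simulated in plain Python. This sketch does not use python-fuse at all: TracingFS and list_directory are hypothetical names, and list_directory merely stands in for the kernel and library side that would invoke the callbacks when a directory is listed.

```python
class TracingFS(object):
    """Hypothetical handler object that records every callback it receives."""

    def __init__(self):
        self.calls = []

    def getattr(self, path):
        self.calls.append('getattr')
        return {'is_dir': True}

    def opendir(self, path):
        self.calls.append('opendir')

    def readdir(self, path):
        self.calls.append('readdir')
        return ['.', '..', 'hello.txt']

    def releasedir(self, path):
        self.calls.append('releasedir')


def list_directory(fs, path):
    """Stand-in for the caller: invokes the callbacks in the order described."""
    fs.getattr(path)
    fs.opendir(path)
    entries = list(fs.readdir(path))
    fs.releasedir(path)
    return entries


fs = TracingFS()
entries = list_directory(fs, '/')
print(fs.calls)  # ['getattr', 'opendir', 'readdir', 'releasedir']
```

The point is only that our code never drives the loop: we supply the methods and something else decides when to call them.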
How it is used
This is a minimal example of file reading and writing. Let's suppose that the methods are in an object that has a dictionary called items, with the path as key and the data as value.
    # reading
    def read(self, path, length, offset):
        # we determine the beginning of our reading
        start = offset
        # we determine the end of the reading
        end = start + length
        # we return the amount of data requested
        return self.items[path][start:end]

    # writing
    def write(self, path, data, offset):
        # the size of data to write
        length = len(data)
        # current data of our file
        item_data = self.items[path]
        # add/replace the file portion requested
        item_data = item_data[:offset] + data + item_data[offset+length:]
        # replace the item's data
        self.items[path] = item_data
        # return the amount of data written
        return length

    # truncate
    def truncate(self, path, length):
        # we take the data of the file
        item_data = self.items[path]
        if len(item_data) > length:
            # if the size of our file is greater than the size requested
            # we make it shorter
            self.items[path] = item_data[:length]
        else:
            # if not, we fill the rest of the space with zero bytes
            self.items[path] += '\0' * (length - len(item_data))
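The same idea can be tried without mounting anything. The following is a sketch under some assumptions: Python 3, file contents held as bytes, and a hypothetical class name MemoryFiles. The argument order used here (path, length, offset for reads; path, data, offset for writes) is also an assumption of the sketch.

```python
class MemoryFiles(object):
    """Hypothetical in-memory store: maps each path to its contents as bytes."""

    def __init__(self):
        self.items = {}

    def read(self, path, length, offset):
        # Slicing past the end simply returns fewer bytes.
        return self.items[path][offset:offset + length]

    def write(self, path, data, offset):
        old = self.items.get(path, b'')
        # Pad with zero bytes if the write starts beyond the current end.
        if offset > len(old):
            old += b'\0' * (offset - len(old))
        self.items[path] = old[:offset] + data + old[offset + len(data):]
        return len(data)

    def truncate(self, path, length):
        old = self.items[path]
        # Cut the data, or extend it with zero bytes up to the new length.
        self.items[path] = old[:length] + b'\0' * max(0, length - len(old))


files = MemoryFiles()
files.write('/a.txt', b'hello world', 0)
files.write('/a.txt', b'HELLO', 0)
files.truncate('/a.txt', 8)
print(files.read('/a.txt', 100, 0))  # b'HELLO wo'
files.truncate('/a.txt', 10)
print(files.read('/a.txt', 4, 6))    # b'wo\x00\x00'
```

Note that growing a file pads it with zero bytes, not with the character '0'.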
Defuse
One of the things I found uncomfortable while working with python-fuse was path management. Coming from the web world (especially werkzeug/flask), I thought of implementing similar route management, but for a filesystem. That's how defuse was born: https://github.com/Roger/defuse.
It provides decorators for handling routes, splitting each part of our filesystem into a class with all the methods provided by Fuse.
A little example
    import fuse
    from stat import (S_IRUSR, S_IWUSR, S_IXUSR, S_IRGRP, S_IXGRP,
                      S_IROTH, S_IXOTH)
    # FS and BaseMetadata are provided by defuse

    fs = FS.get()

    @fs.route('/')
    class Root(object):
        def __init__(self):
            root_mode = S_IRUSR|S_IXUSR|S_IWUSR|S_IRGRP|S_IXGRP|S_IXOTH|S_IROTH
            self.dir_metadata = BaseMetadata(root_mode, True)

        def getattr(self, *args):
            return self.dir_metadata

        def readdir(self, *args):
            # list four entries: test0.txt ... test3.txt
            for i in range(4):
                yield fuse.Direntry('test%s.txt' % i)

    @fs.route('/<filename>.<ext>')
    class Files(object):
        def __init__(self):
            file_mode = S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH
            self.file_metadata = BaseMetadata(file_mode, False)

        def getattr(self, filename, ext):
            # the size matches the data returned by read()
            self.file_metadata.st_size = len(filename*4)
            return self.file_metadata

        def read(self, size, offset, filename, ext):
            data = filename * 4
            return data[offset:size+offset]
In the previous example we can see a working implementation of a filesystem that has 4 files, whose contents are the name of the file (without its extension) repeated 4 times.
As you can see, path management is done through class decorators. In addition, the methods no longer receive the path; instead they receive the variables defined in the decorator.
For example, @fs.route('/<dir1>/<dir2>/<archivo>.<ext>') applied to the path /root/subdir/test.py would provide the variables:
- dir1='root'
- dir2='subdir'
- archivo='test'
- ext='py'
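How such a route could be turned into variables can be sketched with a regular expression. This is not defuse's actual implementation, which is not shown in this article; compile_route and match_route are hypothetical helpers, and the rule that a variable never contains '/' or '.' is an assumption of the sketch.

```python
import re


def compile_route(pattern):
    """Turn '/<dir1>/<name>.<ext>' into a compiled regex with named groups."""
    regex = ''
    # re.split with a capturing group alternates literal text and variable names.
    for i, part in enumerate(re.split(r'<(\w+)>', pattern)):
        if i % 2:
            # A variable matches one or more characters, but never '/' or '.'.
            regex += '(?P<%s>[^/.]+)' % part
        else:
            regex += re.escape(part)
    return re.compile('^' + regex + '$')


def match_route(pattern, path):
    """Return the variables extracted from path, or None if it does not match."""
    match = compile_route(pattern).match(path)
    return match.groupdict() if match else None


route = '/<dir1>/<dir2>/<archivo>.<ext>'
print(match_route(route, '/root/subdir/test.py'))
print(match_route(route, '/root/test.py'))  # None: one directory level is missing
```

Because of the assumption above, a name with several dots such as backup.tar.gz would not match this route; a real router has to decide how to handle that case.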
Conclusion
This article, rather than teaching everything you need to know about python-fuse, is intended to show how simple it is to use it and encourage you to write your own filesystems.
There were some interesting ideas in the latest PyDay, like automatic file translators or nltk analysis.
I hope to see your filesystems soon!