pycamp ORM, making sqlalchemy easy to use

../images/pycamp_orm/edvm.png
Autor:Emiliano Dalla Verde Marcozzi
Bio:The author started working with Python using Plone, being those also his first steps in programming.
Email:edvm@airtrack.com.ar
Twitter:@edvm
Webpage:http://ninjasandpythones.blogspot.com/

What's this article about?

Essentially three things:

  • ORM / Object Relation Mapper, what is it?, what is it good for?
  • SQLAlchemy, a Python ORM
  • pycamp.orm, 140 lines (or so) that make SQLAlchemy easier to use

What are these three things for? To work with relational databases from different manufacturers such as:

  • PostgreSQL
  • MySQL
  • Oracle
  • SQLite

Our goal

Conquer the world? Yes, maybe, but not in this article ;). Our goal is to work with relational databases using Python and to have a general idea of the different actors who play a role in this scene, so we need to meet:

  • Jenna Jameson, errhh I mean, SQL / Structured Query Language
  • Database (PostgreSQL, MySQL, etc)
  • Python's Connector / Driver to connect to the Database
  • Python

The SQL / Structured Query Language is an ANSI (American National Standards Institute) standard which allows us to perform queries, define and modify data on a relational DB. Now, this is pretty neat!, we have a standard! This is: the databases which implement this standard should understand the commands that we program in this language.

Second: Databases which PARTIALLY implement SQL. This means that they may have some other words added to the SQL they implement. For example, MySQL may have sentences that don't exist in PostgreSQL.

The Connector / Driver (i.e. psycopg2, MySQLdb) is a piece of code which allows Python to communicate with the Database. The tasks that the connector handles are, for example, the work with sockets and many other operations which I don't really know anything about ;).

Python is the almighty programming language with which we could conquer the world, but in this article we'll use it to work with relational databases.

Now that we've met the actors, we could create a super cool ASCII diagram to represent how these actors relate:

MySQL <------------> sql_lang_1 <---> MySQLdb <----> Python PostgreSQL <-------> sql_lang_2 <---> psycopg2 <---> python Oracle <-----------> sql_lang_3 <---> cx_Oracle <--> Python

To explain the first case of our super ASCII diagram, reading it from right to left: we have our almighty language Python which, using the MySQLdb connector, makes queries using sql_lang_1 to the MySQL database. Simple, right?

Same queries, different databases

Let's think of a hypothetical situation in which we need to create a program which should store the data it generates in a relational database.

It turns out, also, that the company where we work is using MySQL, so we're going to write our queries using the MySQLdb connector.

After 1337 months of development we have our program completely functional, but it turns out that our boss wakes up with a MySQL-incompatible mood and commands us to use PostgreSQL and that any software using a database other than PostgreSQL shall be sentenced to play on TRON's Arena!

Ok, so we and our little program are in a tight spot. Should we change all the queries we had written for MySQL to work with PostreSQL or run to get a frisbee (http://en.wikipedia.org/wiki/Flying_disc) and start practicing for our battle on TRON's Arena (http://en.wikipedia.org/wiki/Tron_(film))?. Funny story aside, and to go back to my talk on PyDay Córdoba, we can say:

  • Each database has its own dialect.
  • In Python we use drivers/connectors to speak the dialect of the database with which we want to connect and use.
  • Writing SQL code using a DB dialect doesn't seem to be a good idea, because when we need to change the DB engine we are forced to translate all our queries to the dialect of the new DB.

SQLAlchemy, the ORM used by MacGyver and Mr. T.

There's a solution for the problem of writing queries according to the dialect of our DB engine and it's called SQLAlchemy.

SQLAlchemy is an ORM, that is O*bject *R*elation *M*apper: a library which *translates the relationships between data in our DB to a set of objects. Let's see an example to make it clearer:

Let's say that the table "Person" in our DB has two columns:

  • Name which is a VARCHAR field.
  • Birthdate which is a DATE field.

That's for the relational aspect, now if we want to represent that data structure with objects in python we could have the following:

import datetime

class Person(object)
    """ I'm not only a cute face, I'm a person!
    """
    name = ''
    birthdate = datetime.datetime.now()

The job of the ORM is precisely to read the tables, columns, properties and relationships between data and express it using Python objects. This is very good for us, because as we will see, we're going to work with Python objects instead of writing SQL. It will be the ORM's responsibility, then, to translate everything to SQL to interact with the DB. Let's see this with a super enlightening ASCII graph:

DATABASE <-----> SQLALCHEMY(ORM) <------> PYTHON

So: we write Python and SQLAlchemy translates to the dialect of our DB. For example: a select on our Person table will be written as follows:

>>> from edo import Session
>>> from limbo import Person

>>> session = Session()
>>> first_person = session.query(Person).first()
>>> first_person.name
'Lady Gaga'

In this sample we've imported our Person class and an invented Session with which we've performed our query. It should be noted that we worked with Python and not with our DB's dialect (that's SQLAlchemy's job ;)).

What we mean to say is that the python code we've written before, thanks to SQLAlchemy, works on PostgreSQL, MySQL or Oracle without having to change anything. Isn't that nice? ^^'

Doing Alchemy, ingredients

To start using SQLAlchemy we need to understand the following four elements: the four classical greek elements (earth, water, fire and air) go back to the pre-socratic times, live through the Middle Ages until the Renaissance having a deep influence in the European culture and thought. But that's not the case with SQLAlchemy, because here the four elements we need to understand are:

  • Sessions
  • Mapper
  • MetaData
  • Engine

Sessions: The sessions are meant to open and close connections to the DB. This is not accurately true, since SQLAlchemy keeps a pool of persistent connections to the DB, so that when we create a session we'll be taking one element of that pool and when we close the session we'll be giving it back. For that reason creating sessions is very fast and inexpensive, because the connections are already established and available in our pool.

Mapper: It's a state-of-the-art piece of code which maps the structures and properties on our DB to Python objects and viceversa.

MetaData: For the moment we will let this object in the dark.

Engine: It's an object where we define the properties of the connection to our database, such as: username, password, database name and database engine (Oracle, MySQL, PostgreSQL, etc.). We can also define the number of persistent connections in our pool and many other good things.

Order of the ingredients

Like in any other alchemistic procedure, the order is very important and that's why here we give you the order of the ingredients in SQLAlchemy:

  1. Create the engine
  2. Bind to our engine
  3. Ready :)

The first thing we do is declare the engine. Of course, that's where we define the username, password, host, DB engine and all that jazz, so it makes sense that that is the first thing we do.

[1] from sqlalchemy import create_engine

[2] url = 'mysql://user:passwd@host/pet'
[3] engine = create_engine(url)

On line [1] we import create_engine from sqlalchemy, on line [2] we create a url which defines the connection information like this:

url = 'DATABASEENGINE://USERNAME:PASSWD@IPHOST/DATABASENAME'

Lastly, on line [3] we call the create_engine method with our url as a parameter.

Let's see this with an example of real life code.

For that I will be using a virtualenv where I will install sqlalchemy (to learn more about virtualenvs see: http://pypi.python.org/pypi/virtualenvwrapper)

edvm@Yui:~$ mkvirtualenv --no-site-packages pet
New python executable in pet/bin/python
Installing  distribute......................................................done.
virtualenvwrapper.user_scripts creating /home/edvm/.venvs/pet/bin/  predeactivate
virtualenvwrapper.user_scripts creating /home/edvm/.venvs/pet/bin/postdeactivate
virtualenvwrapper.user_scripts creating /home/edvm/.venvs/pet/bin/preactivate
virtualenvwrapper.user_scripts creating /home/edvm/.venvs/pet/bin/postactivate
virtualenvwrapper.user_scripts creating /home/edvm/.venvs/pet/bin/get_env_details
(pet)edvm@Yui:~$ pip install ipython
Downloading/unpacking ipython
...
...
Successfully installed ipython
Cleaning up...
(pet)edvm@Yui:~$ pip install sqlalchemy
Downloading/unpacking sqlalchemy
....
....
Successfully installed sqlalchemy
Cleaning up...

Now I run ipython and we start to work:

(pet)edvm@Yui:~$ ipython
....

In [1]: import os
In [2]: from sqlalchemy import create_engine
In [3]: db = os.path.join(os.path.abspath(os.path.curdir), 'db.sql')

In [4]: engine = create_engine('sqlite:///%s' % db)
In [5]: engine
Out[5]: Engine(sqlite:////home/edvm/db.sql)

As we can see, we have obtained the first item that we needed in our quest, now we have to go for the sessions, the mapper and the metadata. Let's go get 'em!:

In [6]: from sqlalchemy import MetaData
In [7]: meta = MetaData(bind=engine)
In [8]: meta
Out[8]: MetaData(Engine(sqlite:////home/edvm/db.sql))
In [9]: from sqlalchemy.orm import mapper
In [10]: from sqlalchemy.orm import sessionmaker
In [11]: Session = sessionmaker(bind=engine)

To explain the above code, we imported MetaData and saved in the meta variable the configuration of our MetaData with our engine. We've also imported sessionmaker and have passed to it our engine as a parameter. Remember that the first step was to create the engine and the second one to bind our metadata and sessions to the engine? Now we can create sessions calling our Session, which will take one of the persistent connections from the pool so that we can make queries. If on that session we execute the close() method we will return the connection to the pool. For example:

In [12]: session = Session()
In [13]: session.query(...).all()
In [14]: session.close()

We're still missing the mapper, so let's take a look to what the docstring says about it:

In [15]: mapper?
Type:             function
Base Class:       <type 'function'>
String Form:   <function mapper at 0xa3cabfc>
Namespace:        Interactive
File:             /home/edvm/.venvs/pet/....
Definition:       mapper(class_, local_table=None, *args, **params)
Docstring:
    Return a new :class:`~.Mapper` object.

    **:param class\_: The class to be mapped.**

    **:param local_table: The table to which the class is mapped, or None if
       this mapper inherits from another mapper using concrete table
       inheritance.**

The mapper's documentation tells us that it takes two parameters, the first one being a class and the second one a local_table. For those of you who watched Dragon Ball, understanding this will be very simple... mapper is like the fussion between Goku and Vegeta to form Vegito (http://dragonball.wikia.com/wiki/Vegito) or Gogeta (http://dragonball.wikia.com/wiki/Gogeta). You put one Goku, one Vegeta and you get one Vegito/Gogeta. Easy, right?

../images/pycamp_orm/gokuvegeta.jpg

Ok, for those of you who didn't watch Dragon Ball, this would be like making an orange juice: you put a juice extractor, some oranges and you get an orange juice. You see how easy was this mapper thing? ;)

To continue with the explanation, local_tables are the tables on our DB and the class is a custom class that will be binded to the table in our DB, and what we'll get as a result of passing to mapper our class and our local_table is: a Mapper object. Let's try to see this with an example:

In [1]: from sqlalchemy import create_engine
In [2]: url = 'mysql://grids:grids@localhost/grids'
In [3]: engine = create_engine(url)
In [4]: from sqlalchemy import MetaData
In [5]: meta = MetaData(bind=engine, reflect=true)
In [6]: meta.tables.keys()
Out[6]:
[u'django_admin_log',
 u'auth_permission',
 u'auth_group',
In [7]: type(meta.tables['django_admin_log'])
Out[7]: <class 'sqlalchemy.schema.Table'>

In this example we have connected to a database of a Django project, and let's pay attention to lines [5] and [7].

On line [5] we defined our metadata passing as parameters our engine and reflect=True which makes SQLAlchemy become a Super Saiayin (NOT an orange juice) and connect to our database magically discovering all of our tables and adding them to a dictionary inside meta.tables where all the keys are the tables' names and the values are objects of the type sqlalchemy.schema.Table.

Now we have the local_tables of our database and what we need is to create a class for every table, so that we can have the two parameters needed for the mapper. So let's get into the loop:

In [11]: class DB(object):
   ....:     """
   ....:     Dummy DB Object to store stuff
   ....:     """
   ....:     pass
   ....:
In [13]: db = DB()
In [17]: from sqlalchemy.orm import mapper
In [18]: for tablename in meta.tables.keys():
   ....:     obj = type(str(tablename), (object,), {})
   ....:     setattr(db, tablename, obj)
   ....:     mapper(obj, meta.tables[tablename]) # we pass the class and the local_table
Out[18]: <Mapper at 0xaa4d76c; django_admin_log>
Out[18]: <Mapper at 0xaa4db6c; auth_permission>
Out[18]: <Mapper at 0xaa524ec; auth_group>
Out[18]: <Mapper at 0xaa529ec; auth_group_permissions>
In [19]: db.auth_user
Out[19]: <class '__main__.auth_user'>

In the previous lines we created a DB class which will be used to store the mapped tables of our database. On [13] we instantiate DB, then on [17] we import mapper and on [18] we loop through every element of the dictionary meta.tables. This is, we're looping through every table that was auto-magically discovered by MetaData's reflect, creating on the fly a new type using the type method, assigning to it the name of the table. Then, using setattr we put the recently created object in our instance db and, finally, we call mapper passing to it the object we created with type as the first parameter and the object of type sqlalchemy.schema.Table (that is, our local_table) as the second parameter.

To finish this part we can see on [19] as an attribute of db our already mapped table auth_user.

And that's not it, we could also do a query like this:

In [28]: from sqlalchemy.orm import sessionmaker
In [29]: Session = sessionmaker(bind=engine)
In [30]: session = Session()
In [36]: qobj = session.query(db.auth_user).first()
In [37]: qobj.username
Out[37]: 'admin'
In [38]: qobj.password
Out[38]: 'sha1$04a19$2559e5f16eb58cab606c18443b552831748187ac0'
In [39]: session.close()

What about pycamp.orm?

Well, pycamp.orm are about 140 lines that at the time of writing this article I realized that they could be even less.

It does exaclty what it's shown in this article (hehehe). So you already know how pycamp.orm works ;)

This module was born on the PyCamp that was hosted in La Falda, Córdoba, Argentina; which was a camp that gathered Python programmers.

It was my first PyCamp and I had a great time, the people were really cool. Besides it's really good meeting in person with people that you chat with on IRC, meeting new people, playing role-playing games and being constantly treated as either a mutant or a communist (on the role-playing game, I mean ;)), losing terribly on foosball championships, learning how to juggle and, yes!, coding stuff that YOU LIKE!. So, let's see how to use the current version of pycamp.orm:

(pet)edvm@Yui:~$ hg clone https://edvm@bitbucket.org/edvm/pycamp.orm
(pet)edvm@Yui:~$ cd pycamp.orm
(pet)edvm@Yui:~/pycamp.orm$ ls
README  buildout  pycamp  setup.py
(pet)edvm@Yui:~/pycamp.orm$ python setup.py install
...
...
(pet)edvm@Yui:~/pycamp.orm$ ipython
In [1]: from pycamp.orm.mapper import Database
In [2]: from pycamp.orm.mapper import DatabaseManager
In [3]: mydb = Database('grids', user='grids', passwd='grids', engine='mysql')
In [4]: manager = DatabaseManager()
In [5]: manager.add(mydb)
In [6]: auth_user = manager.mysql.grids.auth_user
In [7]: sesion = manager.mysql.grids.session()
In [8]: qobj = sesion.query(auth_user).first()
In [9]: qobj.username
Out[9]: 'admin'
In [10]: qobj.password
Out[10]: 'sha1$04a19$2559e5f16eb58cab606c18443b552831748187ac0'
In [11]: sesion.close()

Let's explain a little what pycamp.orm does: From mapper we import a Database which is where we set the username, password, database engine, etc. All the data that we store on Database is what will be used with the method create_engine of SQLAlchemy ;). In [4] we create a DatabaseManager which is a BORG (http://code.activestate.com/recipes/66531-singleton-we-dont-need-no-stinkin-singleton-the-bo/) that has an add()``method that takes a ``Database as a parameter from which it takes the information to create the metadata; it calls the sessionmaker, the mapper and leaves everything ready to start querying the DB :).

Well, that's it. To finish this article:

The code is here:

https://bitbucket.org/edvm/pycamp.orm/src/385aeb2f6e12/pycamp/orm/mapper.py

You can see the video of this article which is a little different (and in Spanish) here:

http://python.org.ar/pyar/PycampORM

You can download the slides of the talk here (also in Spanish):

http://xip.piluex.com/PYCAMP_ORM.pdf

And that's it, I hope you liked it!

Help PET: Donate

blog comments powered by Disqus

Last Change: Thu Sep 22 08:54:01 2011.  -  This magazine is under a Creative Commons license