Thursday, October 11, 2007

Python and SQLAlchemy

So if you read this blog its no secret I'm a fan of Hibernate. Not because I'm a Java evangelists but really because I have yet to see many of Hibernates features in other ORMs. Hibernate has gone above and beyond to make an ORM that is not only focused on pojo-based database access but also scalability and durability.

You can tune hibernate to do association fetching and second-level caching that meets your load needs and reduce the amount of bottleneck your database is. You can do projection to limit query context. You can also enlist hibernate in java's JEE global transaction to make it durable and ACID compliant. Many of the other languages and ORMs seem to focus more on developer convenience than these things.

In the python world my boss pointed me to SQLAlchemy. Its a python ORM that is based highly on the design of Hibernate. I like what I see, the association mapping and querying is very similar and powerful.

One thing I see that is missing is the powerful second-level caching concept. Post people's response to this is "Oh, use memcache." Well sure but you loose the acidic properties of your database and ORM. Developers are tasked with first checking in memcache for their entitys, then going to the db, and the same with saving. Both of which don't partipicate in the database transaction they are working in. However, after looking at SQLAlchemy a little more I wonder how hard it would be to build transparent transactional enlistment of it with memcache.

Just think, you decorate your SQLAlchemy object with some sort of marker saying that you want it to participate in second-level caching. Then, along with setting up your persistence context (DB access, etc) you also setup a memcache resource. Now when you access and save your SQLAchemy entities it also pushes and pulls them out of memcache behind the scenes and honors the database ACID transactions. This could do wonders for the scalability of python.

This smells like an open source project to me! Comment with ideas and directions.

1 comment:

Anonymous said...

I'd love to see builtin 2nd lvl cache into SqlAlchemy. It'd save so much developer's time and provide faster response times of the server. Without that, currently I have to worry about that, I can't simply iterate over associated collections of objects; all the same-unchanged objects are loaded all the time; I must write SQL queries to speed up :(
Hibernate rocks in this area.
Good luck to SQLA...