Thursday, October 18, 2007

Avoid locking with JMS

One of the hardest things to do in a high-load system is reporting. Sometimes the amount of data you have to keep track of is too big.

Example:
I have a page that is going to be displayed, and I need to:
  1. Log every view of the page individually.
  2. Keep a running total of how many times the page has been viewed.
  3. Enforce a maximum number of times the page can be viewed, and stop showing the page once that limit has been reached.
Now let's assume that we need to keep all these values in a database so we can scale up our app server. For requirements 2 and 3 we need an aggregate table with a column that acts as a "counter". Without this table we would have to run an aggregation query on the log table (requirement 1) to see how many times the page has been viewed, and as our data grows that query will get slow.

So we create an aggregate table that contains the page name and a counter of how many times that page has been displayed. Every time the page is viewed we insert a row into the log table and update the counter in the aggregate table.

Locking is a b$$ch

There is a fundamental flaw with the above solution. If I get 100 concurrent page views, all of them will queue up on the lock for that one row in the aggregate table, because the database has to serialize the updates to stay ACID. Each request holds the row until its transaction commits, so page views end up being processed one at a time, and that significantly decreases our ability to scale. You could lower the transaction isolation level in your db, but the consistency of your aggregate value may suffer.

JMS to the rescue

A good solution is to use JMS. When a page request comes in we still insert a row into the log table, but within the same XA transaction we also publish a message onto a durable JMS destination. The request no longer touches the aggregate row, so it never waits on that lock. A consumer (either a message-driven bean or a plain JMS message consumer) then consumes those messages with a durable subscription and updates the aggregate asynchronously. Who cares if the consumers lock? That happens on a completely different thread that's not affecting the performance of serving the page.
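
As a rough illustration of the hand-off described above, here is a minimal in-memory sketch in Java. It is not JMS code: a BlockingQueue stands in for the durable destination, a ConcurrentLinkedQueue for the log table, and a HashMap for the aggregate table, so it has none of the durability or XA guarantees of the real thing. The class and method names (AsyncCounterSketch, recordView, stopAndCount) are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;

// In-memory stand-in for the pattern, not real JMS (assumption for illustration).
final class AsyncCounterSketch {
    // Poison pill, compared by reference so no real page name can collide with it.
    private static final String STOP = new String("stop");

    private final Queue<String> logTable = new ConcurrentLinkedQueue<>();          // "log table"
    private final BlockingQueue<String> destination = new LinkedBlockingQueue<>(); // "JMS destination"
    private final Map<String, Long> aggregateTable = new HashMap<>();              // touched only by the consumer
    private final Thread consumer = new Thread(this::consume, "aggregate-consumer");

    AsyncCounterSketch() {
        consumer.start();
    }

    // Request path: append a log row and publish a message. The counter is never touched here.
    void recordView(String page) {
        logTable.add(page);
        destination.add(page);
    }

    // Consumer path: the only place the counter is updated, on its own thread.
    private void consume() {
        try {
            for (String page = destination.take(); page != STOP; page = destination.take()) {
                aggregateTable.merge(page, 1L, Long::sum);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    int loggedViews() {
        return logTable.size();
    }

    // Lets the consumer drain everything published so far, stops it, then reads the counter.
    long stopAndCount(String page) {
        destination.add(STOP);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return aggregateTable.getOrDefault(page, 0L);
    }

    public static void main(String[] args) {
        AsyncCounterSketch sketch = new AsyncCounterSketch();
        for (int i = 0; i < 1000; i++) {
            sketch.recordView("home");
        }
        System.out.println(sketch.loggedViews() + " logged, "
                + sketch.stopAndCount("home") + " aggregated");
    }
}
```

The point of the sketch is the split: recordView does only cheap appends, while every counter update happens on the consumer thread. In a real deployment the queue would be the JMS destination and the map update would be an UPDATE against the aggregate table.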

Obviously it's possible to go over your max-views limit: there is a race condition in which a page is served before the message consumer has had a chance to process the earlier views and update the aggregate. But when you get to high-load systems you have to start making trade-offs.
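
To make that race concrete, here is a small single-threaded Java sketch, again an in-memory model rather than JMS code. The names (OvershootSketch, tryServe, drain) are hypothetical, and the consumer is modeled as an explicit drain() call so the ordering is easy to follow.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Deterministic model of the overshoot: the limit check reads an aggregate
// that lags behind the views already served (assumption: one page only).
final class OvershootSketch {
    private final long maxViews;
    private final Queue<String> pending = new ArrayDeque<>(); // published but not yet consumed
    private long aggregate = 0; // what the limit check sees
    private long served = 0;    // what actually happened

    OvershootSketch(long maxViews) {
        this.maxViews = maxViews;
    }

    // Request path: checks the (possibly stale) aggregate, then serves and publishes.
    boolean tryServe(String page) {
        if (aggregate >= maxViews) {
            return false;
        }
        served++;
        pending.add(page);
        return true;
    }

    // Consumer path: applies every pending message to the aggregate.
    void drain() {
        while (pending.poll() != null) {
            aggregate++;
        }
    }

    long served() {
        return served;
    }

    long aggregate() {
        return aggregate;
    }
}
```

With a limit of 3, five calls to tryServe before the first drain() all succeed, because the aggregate is still 0 at each check. After drain() the aggregate reads 5 and further requests are refused. In this model the overshoot is at most the number of messages still waiting to be consumed, which is the trade-off you accept for taking the counter update out of the request.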

So the moral of the story: JMS can play a big role in high-load systems thanks to its durability and async processing. Next time you have a locking issue, or find you are doing too many things during a request, try moving those things behind a durable JMS topic. It may save you some time!
