
Ruby on Rails, Io, Lisp, JavaScript, Dynamic Languages, Prototype-based programming and more...

Technoblog reader special: $10 off web hosting by FatCow!

Monday, June 25, 2007

Advanced Concepts in Ruby on Rails Hosting Part II

Last week, we compared serving websites to running a translation company. A request comes in as a document, is handed to an application server (a translator), and the result is returned to the client. We left off with a scenario of three translation offices with 10 translators in each office. One of the simplest ways to distribute work among these translators is to hand out documents one at a time, round-robin. However, because some documents are inherently longer than others and some translators are faster than others, work backs up for some of the translators, leading to unpredictable lag and customer complaints.

Rather than the brute force method of adding more offices and translators, can you think of a better way to distribute resources?

The bottleneck in the scenario sketched above is management. Our translation company still has only one manager, which limits how effectively he can distribute work. If we hire office managers and let the manager hand documents to them, more interesting distribution techniques become possible. For example, instead of overwhelming our translators with a growing pile of documents, and thus a growing pile of responsibilities, the office manager can wait until each translator has finished their job before handing them a new document.

Let us think about the consequences of this change. First, some assumptions. Assume John translates faster than Susie because he has less on his mind (in computer lingo, Susie is experiencing a memory leak, possibly due to a bad programming library). Further assume a pile of documents comes in this order: a 10 pager, a 2 pager, a 20 pager, a 1 pager, and a 3 pager. In our original round-robin setup, we could easily find ourselves in the situation where Susie gets a pile with the 10 pager, the 20 pager, and finally the 3 pager, whereas John gets only the 2 pager and the 1 pager. Susie's 3 pager should have been easy and fast, but it is stuck behind two bigger documents and in the hands of the slower translator.

With the new distribution algorithm, the worst case is that Susie is chugging away at the 20 pager, but since John quickly made short work of the other documents, he can turn over the 3 pager before Susie even finishes the 20 pager. This is much more streamlined because the queue is processed as quickly as the translators free up as a group, rather than relying on each individual translator to work through their own backlog.
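To put rough numbers on the comparison above, here is a small Ruby simulation. It is only a sketch: the speeds (Susie needs 2 time units per page, John needs 1), the assumption that all five documents arrive at once, and the method names `round_robin` and `shared_queue` are all made up for illustration.

```ruby
# Hypothetical speeds: time units needed per page (not from the post).
SPEEDS = { "Susie" => 2, "John" => 1 }
DOCS   = [10, 2, 20, 1, 3] # page counts, in arrival order (all unique)

# Round-robin: document i goes to translator i % n, regardless of workload.
# Returns { page_count => finish_time }.
def round_robin(docs, speeds)
  names = speeds.keys
  busy_until = Hash.new(0)
  finish = {}
  docs.each_with_index do |pages, i|
    name = names[i % names.size]
    busy_until[name] += pages * speeds[name]
    finish[pages] = busy_until[name]
  end
  finish
end

# Shared queue: each document goes to whichever translator frees up first
# (ties go to the translator listed first in the speeds hash).
def shared_queue(docs, speeds)
  busy_until = speeds.keys.map { |name| [name, 0] }.to_h
  finish = {}
  docs.each do |pages|
    name, free_at = busy_until.min_by { |_, t| t }
    busy_until[name] = free_at + pages * speeds[name]
    finish[pages] = busy_until[name]
  end
  finish
end

puts round_robin(DOCS, SPEEDS)[3]  # 66
puts shared_queue(DOCS, SPEEDS)[3] # 28
```

Under these assumptions the 3 pager is finished at time 66 with round-robin and at time 28 with the shared queue. Who ends up with the 20 pager depends on who happens to be free when it reaches the front of the queue, so the exact figures change with the tie-breaking rule, but the short documents no longer wait behind one translator's backlog.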

The typical Rails setup of a reverse proxy handing requests to Mongrel is not the most efficient use of resources, so I built a load balancer I call drproxy, which sits between the reverse proxy and the Rails dispatchers and queues up requests, handing them out more efficiently as each dispatcher frees up. Furthermore, I built drproxy in Erlang, a language built from the ground up to excel at concurrency. Ruby is a slug when it comes to handling concurrency and multi-threaded environments. Erlang is like a Porsche.
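The queueing idea itself is small enough to sketch. The following is not drproxy's code (drproxy is written in Erlang and is not shown in this post); it is a hypothetical Ruby illustration in which threads stand in for Rails dispatchers and `sleep` stands in for request handling. The method name `run_dispatchers` and the `:name` and `:cost` keys are invented for the example.

```ruby
# Hypothetical stand-in for a pool of dispatchers pulling from one queue.
# Each request is a hash like { name: "a", cost: 0.05 } (cost in seconds).
def run_dispatchers(requests, dispatcher_count)
  queue   = Queue.new # pending requests, shared by all dispatchers
  handled = Queue.new # thread-safe log of [dispatcher_id, request_name]

  requests.each { |request| queue << request }
  dispatcher_count.times { queue << :stop } # one stop marker per dispatcher

  threads = Array.new(dispatcher_count) do |id|
    Thread.new do
      # A dispatcher takes its next request only once it is free,
      # so a slow request never has other requests lined up behind it.
      while (request = queue.pop) != :stop
        sleep(request[:cost]) # pretend to handle the request
        handled << [id, request[:name]]
      end
    end
  end
  threads.each(&:join)

  # Drain the log into a plain array, in completion order.
  Array.new(handled.size) { handled.pop }
end

requests = [
  { name: "slow",   cost: 0.20 },
  { name: "quick1", cost: 0.01 },
  { name: "quick2", cost: 0.01 },
  { name: "quick3", cost: 0.01 }
]
run_dispatchers(requests, 2).each do |id, name|
  puts "dispatcher #{id} finished #{name}"
end
```

With two dispatchers, one is tied up with the slow request while the other works through the quick ones, which is the behavior described above. A real balancer also has to deal with timeouts, dispatchers that die, and requests arriving continuously, none of which this sketch attempts.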

There are, however, even more ways to make the system more efficient algorithmically. Think about it for a while, and next week I will tell you what I did.


6 Comments:

Blogger Sam said...

Could you open source the Erlang program you wrote?

12:00 PM, June 25, 2007

 
Blogger Lucas Carlson said...

Yes, I am just packaging it up, should be released soon.

12:01 PM, June 25, 2007

 
Anonymous Gregg Pollack said...

Isn't this what Apache's proxy load balancer is for?

If not, how is Apache's load balancer different from what you've created?

9:03 AM, July 04, 2007

 
Blogger Lucas Carlson said...

Please read the third installment. Apache's load balancer handles requests the same way a standard grocery store checkout works: a request goes into the line for one dispatcher and stays there, even if another dispatcher frees up before the assigned dispatcher gets to that request.

10:30 AM, July 04, 2007

 
Blogger Mark Imbriaco said...

I'm late to the party, but I was skimming your blog and noticed your writeup about drproxy. HAProxy supports the same idea of queueing up requests and passing them along to the next available downstream process. You're right that it makes a big difference not having to "get in line" behind slow requests.

4:45 PM, October 16, 2007

 
Anonymous Roderick van Domburg said...

At RailsCluster we're doing hardware-based load balancing based on least number of connections. Bare metal performance!

9:31 AM, January 18, 2009

 
