How I processed a log file 20x SLOWER than before
Sometimes you have to sit back and laugh at yourself. Having recently written Starfish to help speed up slow tasks, I tried to find as many uses as I could for it. At MOG we have to parse huge log files so I thought I would be clever and try to use Starfish for the task. After running it for a while, I looked at the stats only to find that it had been processing my file 20x slower than it would have without distributing it. At first I was puzzled, until I realized a very important thing about distributing processes. You have to make sure that the task you distribute takes longer than the distribution process.
    if overhead_time > processing_time then puts "Don't use Starfish" end

It turns out that I could process 10,000 lines of the log in about a second, so sending each of those lines over the network to be processed was just silly. Even sending 10,000 lines at a time is hardly worth it.
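To make the rule of thumb concrete, here is a minimal sketch of the comparison as a break-even calculation. The method names and the 2,000-microsecond overhead figure are assumptions for illustration only; just the 10,000 lines per second rate (100 microseconds per line) comes from the story above. Nothing here is part of Starfish itself.

```ruby
# Hypothetical cost model; none of these names come from Starfish.
# per_line_us: microseconds needed to process one line locally
# overhead_us: assumed fixed cost, in microseconds, of shipping one batch
# batch_size:  number of lines sent per batch

# True only when the work in a batch takes longer than shipping it.
def worth_distributing?(per_line_us:, overhead_us:, batch_size:)
  processing_us = per_line_us * batch_size
  processing_us > overhead_us
end

# Smallest batch size whose processing time exceeds the overhead.
# Integer division keeps the arithmetic exact.
def break_even_batch_size(per_line_us:, overhead_us:)
  (overhead_us / per_line_us) + 1
end

# 10,000 lines per second is 100 microseconds per line.
# The 2,000 microsecond overhead is an invented figure, not a measurement.
puts worth_distributing?(per_line_us: 100, overhead_us: 2_000, batch_size: 1)
puts break_even_batch_size(per_line_us: 100, overhead_us: 2_000)
```

Under these assumed numbers, a single line per message never pays for itself, and a batch only starts to make sense once it holds more lines than the overhead could have processed locally.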
I share this story so that you might not make the same mistake that I did. However, I realized that Starfish can detect when the overhead makes distribution not worth the trouble, so I can actually warn people using Starfish when it is and is not a good use of resources. I will be adding this to the next release, which should come out shortly.
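I have not written that warning yet, but one way such a check might work is to time a small sample locally and compare it with an estimate of the distribution overhead. The sketch below is only that: `distribution_warning` is a hypothetical helper, not Starfish's API, and the per-item overhead is a number the caller has to supply or measure separately.

```ruby
require 'benchmark'

# Hypothetical helper, not part of Starfish.
# Runs the given block over a local sample, measures the average time per
# item, and returns a warning string when that time is smaller than the
# caller-supplied estimate of the per-item distribution overhead.
# Returns nil when distribution looks worthwhile or the sample is empty.
def distribution_warning(sample, overhead_per_item_seconds)
  return nil if sample.empty?

  elapsed = Benchmark.realtime { sample.each { |item| yield item } }
  per_item = elapsed / sample.size
  return nil if per_item >= overhead_per_item_seconds

  format("Warning: each item takes %.6fs to process but about %.6fs to " \
         "distribute; running locally is probably faster.",
         per_item, overhead_per_item_seconds)
end

# Usage: the sample lines and the 0.5s overhead are made-up values.
lines = ["GET /index.html 200"] * 1_000
message = distribution_warning(lines, 0.5) { |line| line.split }
puts message if message
```

Timing a sample this way is noisy, so a real check would probably want a larger sample or several runs before deciding to warn.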

4 Comments:
We frequently sit back and laugh at you, you may as well enjoy it too!
6:22 PM, August 28, 2006
Hehe, mistakes are fun to get cool ideas from.
5:55 PM, August 29, 2006
Starfish makes distributed programming more accessible, but there is a moral here.
Some things create inherent complexity, and just because code can hide that doesn't mean reality follows suit. For example, people like to assume that Quicksort is the "fastest sort algorithm", and they apply it like a club. But with a naive pivot choice (such as always taking the first or last element), it's really much slower for lists that tend to be mostly in order.
12:08 PM, August 30, 2006
It's great you have the guts to laugh at yourself, share it and move on. We are all learning here. Keep contributing!
anonymous: don't be an ass
1:23 PM, July 12, 2007