Title Match of the Century: Speed of Development vs. Speed of Computation
It is entertaining to read the comments people have about Starfish over on Reddit:
If your task is intensive enough to warrant parallelization, it is intensive enough to warrant investigating faster languages. Ruby is good for a lot of things, but if my choice is between throwing more processors at the problem or finding a better solution I will go for the better solution every time.
Interesting point, not that he is right, but what he omits gives me pause. The vast majority of the comments were people talking about Ruby being 1000 times slower than their language, but they gave no consideration at all to the most striking aspect of Starfish (in my opinion): I can do relatively advanced distributed programming in 6 lines of code.
I'll say it again because it is important. 6 lines of code.
With hardly more than a flick of my wrist, I can parallelize a task and get performance gains of 10, 20, 30 times, whatever I need. In less than a minute, I can write code that will go through a 10GB log file, grep for a string, parse and collect the matching information, and then wait for new lines to process on demand, all in a distributed system that can work over N machines.
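To make the shape of that log task concrete, here is a rough sketch of the grep, parse, and collect steps. It is not Starfish's API and it is not distributed: it is a single-machine stand-in that uses only Ruby's built-in Thread and Queue, and the method name parallel_grep, the worker_count parameter, and the "timestamp rest-of-line" log format are all made up for illustration. On MRI, threads like these will not speed up CPU-bound work, so read it as the pattern rather than the performance.

```ruby
# Hypothetical stand-in, NOT Starfish's API: workers pull lines off a shared
# queue, keep the ones that match a pattern, and collect the parsed pieces.
def parallel_grep(lines, pattern, worker_count: 4)
  queue = Queue.new
  lines.each { |line| queue << line }
  worker_count.times { queue << :done } # one stop marker per worker

  results = Queue.new
  workers = Array.new(worker_count) do
    Thread.new do
      while (line = queue.pop) != :done
        # Assumed log format: "<timestamp> <rest of line>"
        results << line.split(" ", 2) if line.match?(pattern)
      end
    end
  end
  workers.each(&:join)

  # Workers finish in no fixed order, so sort to get a stable result.
  Array.new(results.size) { results.pop }.sort
end
```

Calling parallel_grep(File.foreach("app.log"), /ERROR/) would hand back the matching lines split into timestamp and message, where app.log is again just a placeholder name. In the distributed version, the queue lives on a server and the workers are separate processes on other machines.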
I have written much simpler processes in faster languages like C, and it took me hours and hours, not only to write the hundreds of lines of code but to debug the darn thing. If I were tasked with creating a distributed log parser in C that did something non-trivial with each line of the log, it could take me a week and it still wouldn't be right.
I work at startups. I don't work for banks, I don't work for Microsoft, I don't work for enterprise. Can I, as a head programmer at a startup, afford to spend over 80 hours of my time writing a log parser in C because it could be 1000 times faster? Not if my startup wants to succeed. Can Microsoft afford to have one of its tens of thousands of programmers spend the time to do that? Of course.
Starfish can and does, on a daily basis, parallelize and speed up what would have otherwise been a slower process. It does so with almost no code.
A few minutes and N times faster than a regular Ruby script, or a few weeks and N times faster than a regular Ruby script. I know which I choose, and it works extremely well for us. I am always a fan of the right tool for the right purpose. I know that Starfish is not always the right tool, but it is amazing how quickly people discount a tool without considering all of the issues involved. It is not always about processing power. Man-hours saved can be much more valuable than a few extra orders of magnitude in processor power.