
I stumbled upon an interesting post by David J. DeWitt and Michael Stonebraker entitled "MapReduce: A major step backwards".

For those not familiar with MapReduce, it is a programming model developed by Google for distributed computing on extremely large sets of data. Google's original paper, "MapReduce: Simplified Data Processing on Large Clusters" by Jeffrey Dean and Sanjay Ghemawat, outlines the technique.
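
For a rough feel of the model, here is a minimal single-process sketch in Python, assuming a simple word-count job. The names map_fn, reduce_fn and run_mapreduce are made up for illustration and are not Google's API; a real implementation distributes the map and reduce phases across many machines.

```python
from collections import defaultdict

def map_fn(document):
    # Map phase: emit a (key, value) pair for every word in the input.
    for word in document.split():
        yield word.lower(), 1

def reduce_fn(word, counts):
    # Reduce phase: combine all values that share the same key.
    return word, sum(counts)

def run_mapreduce(documents):
    # Shuffle step: group the intermediate values by key.
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    # Apply the reducer once per key.
    return dict(reduce_fn(key, values) for key, values in groups.items())

counts = run_mapreduce(["the quick fox", "the lazy dog"])
print(counts)
```

The programmer only writes the two small functions; grouping by key, and in a real system the partitioning and fault tolerance, are left to the framework.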

The authors of this post make some excellent points, most of which center on the importance of well-defined structure and abstraction for data. While I certainly agree with a number of them, it's hard to ignore the simplicity and effectiveness of MapReduce. After all, it is currently being used to process 20 petabytes of data a day.

2 comments:
 
25. Jan 2008, 04:02 CET

What's puzzling in their article is that they think people equate RDBMS and MapReduce. Of course, for people who know MapReduce, it's just a different tool available to store data.

A good reminder, though, for people who believe we have finally found the Next Big Thing and that 2008 is the year the RDBMS dies :)

25. Jan 2008, 04:08 CET
Chris Bredesen | cbredesen(AT)redhat.com

Here's a response I read a day or so ago from someone inside Google:

Databases are hammers; MapReduce is a screwdriver.

Different problems, different tools.
