Featured Article

MapReduce - Benefits, Confusion and Controversy?


By Colin White, BI Research

MapReduce has gained significant interest and momentum in the IT industry, while also generating a certain amount of controversy. The interest is due to its success in solving large data problems at highly visible Internet companies such as Facebook, Google and MySpace, and to its potential to extend this success to more traditional business applications. The controversy has been caused by certain pundits predicting that MapReduce will eventually lead to the demise of SQL-based relational database management systems (RDBMS).

While there is no question that MapReduce can offer significant benefits for handling and analyzing large amounts of data, it is important to understand the types of applications that MapReduce is best suited for and, equally important, the ones it is not suited for. It is also important to realize that there are different approaches to implementing MapReduce solutions. Some run on related software systems such as Hadoop, while others are integrated into existing products such as an RDBMS.

Although much has been written about MapReduce, there is still considerable confusion and misunderstanding about the technology and its use. In most cases, MapReduce applications will work in conjunction with existing software solutions rather than replace them. This powerful hybrid approach offers significant potential for managing the costs of handling the huge and ever-growing information mountain that exists in many organizations.

Because of the confusion surrounding MapReduce, I am very pleased that Aster Data has taken the initiative to sponsor this independent forum on All Things MapReduce. Like several enlightened software companies before it, Aster Data realizes that for a new technology to gain traction, it is important first to educate the market about its use.

As with all new technologies, MapReduce is often presented as a panacea for everyone’s problems. In this case, the problem is processing and analyzing large amounts of data in a cost-effective manner. In reality, MapReduce is another tool in the data management toolbox. The key to success with MapReduce is to understand its secret sauce, and when and how to apply it. I am looking forward to MapReduce.org helping in this task.