A note on Big Data

It’s clear that there is a lot of hype around big data these days. Thousands of articles have been written about it, and tons of money is being invested in big data startups and technologies. You can see it all by simply googling “big data” or following ‘big data’ on VB.

Over the summer, after first working with some “big data”, I too was thrilled by the idea of being able to store and analyze all the data imaginable. But after following the topic and learning more about it for a while, I realized that big data is no different from plain data.

Big data is nothing more than regular data, except that there is a lot of it. The only problem it introduces is a lack of time: we can’t wait for days before making an investment decision or a customer recommendation. The obvious solution is to analyze faster, and right now the solutions most people are implementing are MapReduce, Hadoop, and the like. So in the field of research, there will always be work to be done to speed up the process: finding faster algorithms and better databases, methods, or hardware.

But there is so much more to data than new tools and technologies. There is the question of what we can do with the data. The answer to that question has not really changed with the rise of big data. We will need to continue to find value in data and analyze it the way we used to, only on top of new technologies.

My point: companies and organizations like IBM, Oracle, and Apache can continue their work on supporting big data technologies, but everyone else needs to do nothing more than upgrade their technology (by learning and by hiring people with relevant experience) and continue finding and solving problems the way we always have, with the same ML algorithms and math models.

What are your thoughts?


