Big Data: Nuggets of Gold or Hard Nuts to Crack?
Articles about big data often note that big data tools support unstructured data, that is, data with a very loose format, such as a spreadsheet or an email, where the contents could literally be anything. Dig deeper, however, and there is little detailed information on how to realistically approach this unstructured data in order to get the most from it. According to the World Economic Forum report “Big Data, Big Impact: New Possibilities for International Development”, the amount of data generated online exceeds 2.5 quintillion bytes every day and is expected to grow radically, so the challenges of mining this new gold are likely to increase.
In a recent InfoWorld article, David Linthicum made a good point: “The answer to how to best do big data is the classic consultant’s response: It depends on what you’re trying to do.” This highlights the “missing link”, as it were, between the ability to process vast amounts of data and the big insights that deep analysis could lead to.
The real problem is that few people really know where those big insights could be, so it is difficult to put a finger on exactly what you want big data analytics to do. Mike Gualtieri, in the Computerworld UK article “What’s your Big Data score?”, suggests that the idea of big data is relative. As he puts it, “One organisation’s Big Data is another organisation’s peanut”, and we all know some nuts can be hard to crack.
For example, you may anticipate finding some operational savings, efficiency gains, or competitive advantage, but when you are faced with terabytes or more of loosely connected information, new ways of thinking are required to make sense of potentially new links between information sources. Later this year the Knowledge Discovery and Data Mining Conference, KDD 2012, will be held in Beijing, bringing together data scientists from across the globe to present and discuss advancements in data science, big data, and analytics, with the aim of addressing the issues behind big data science.