I am still seeing lots of posts, blogs and comments where it is apparent that the term “Big Data” is either not fully understood or is being misused simply to gain people’s attention. Which is, of course, exactly what I have done here.
The term “Big Data” appears to be becoming synonymous with Business Intelligence (BI) and Data Analytics. I can sort of see why, given what “Big Data” actually refers to, but we’ll come to that.
But BI and Data Analytics existed long before the term “Big Data” sprang into existence. There have always been those clever people who can crunch through data, identify patterns, clusters and trends, and develop propensity models and the like based on customer behaviour. “Big Data” has not changed anything here. Such statistical techniques were, and still are, very important to businesses striving to find a competitive edge.
And “Big Data” does not really have anything to do with the volume of data being held by organisations. Admittedly, the growth of digital channels, e-commerce and social media means that there is far more data available than ever before. But data has always existed in volumes large enough to be problematic.
One of the problems facing some marketers today is that they have so much data available, and the statistical techniques to gain genuinely useful intelligence from it, but they want the answer to their question now, not in 24 hours’ time. This is where the term “Big Data” arose.
Big Data really refers to a technology that is capable of running complex processes or queries against large volumes of data much more quickly than ever before. I’m sure the technicalities are very complex, but the principle is quite straightforward. Here is an example.
If I have one big data set of 24M rows and one computer, it may take me 24 hours to query my data.
If I break my data into 24 chunks of 1M rows each, and I get 24 different computers to query one chunk each at the same time, I can query my data in about 1 hour, ignoring the time needed to split the data up and combine the results.
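For anyone who prefers to see the idea in code, here is a minimal sketch of the same split-and-query pattern in Python. It is only an illustration under some loud assumptions: the function names and the query itself (counting values over 100) are made up for the example, the data set is 24,000 rows rather than 24M so that it runs instantly, and the 24 "computers" are played by 24 threads on one machine. Threads in standard Python will not actually make this kind of work 24 times faster, and real Big Data platforms are far more involved, so read it as a picture of the principle rather than how any particular product works.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(rows, n_chunks):
    """Split rows into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(rows), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional row each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks


def query_chunk(chunk):
    """Hypothetical query: count the rows whose value exceeds 100."""
    return sum(1 for value in chunk if value > 100)


def query_in_parallel(rows, n_workers):
    """Query every chunk at the same time, then add up the partial counts."""
    chunks = split_into_chunks(rows, n_workers)
    # Each worker thread stands in for one of the separate computers.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(query_chunk, chunks))
    return sum(partial_counts)


# Small stand-in data set: 24,000 rows instead of 24M.
rows = list(range(24_000))

serial_answer = query_chunk(rows)                         # one worker, all rows
parallel_answer = query_in_parallel(rows, n_workers=24)   # 24 workers, 1,000 rows each

print(serial_answer == parallel_answer)
```

The point to notice is that each worker only ever sees its own 1,000 rows, and the final answer is simply the partial answers added together. That only works so neatly because a count can be split up and recombined; not every question can be broken down this easily.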
Think of it as a 4x100m relay race where each team member starts at the same time rather than one after the other. I know the analogy doesn’t quite work, because if they all started at the same time they would never be able to pass the baton. But the point is that they would cover the same distance in around a quarter of the time. They are running at the same time (in parallel) rather than one after the other (in series).
So do you need “Big Data”? Well, it depends on how quickly you need answers to your questions. If you are trying to influence online shoppers’ behaviour in real time, say by making them a targeted offer based on what they are doing, then maybe you do. If you want to segment your customers in preparation for a campaign, this may be less time-critical, so the answer is probably no.
So “Big Data” won’t solve your data quality problems. It won’t fix all your silos of data, it won’t give you a Single Customer View, it won’t provide you with Business Intelligence and it won’t, by itself, let you get ahead of your competition. It will enable you to get answers to your questions more quickly, if that is what you need.