In case you’ve never heard of Hadoop (or if you have but never quite worked out what it is), in simple terms it’s a framework used to process large data sets in batches across numerous servers. It grew out of Google’s published work on its own MapReduce data tool, a model that still sits at Hadoop’s core, but has since grown into a much bigger open-source project under the non-profit Apache Software Foundation, with various vendors selling their own versions.
But now Google is implying that Hadoop has passed its use-by date. At the company’s I/O conference last week, senior vice-president of technical infrastructure Urs Hölzle said: “We don’t really use MapReduce anymore”.
He used the keynote speech to introduce Google Cloud Dataflow, which, it is claimed, is better at analysing data on the fly, rather than in batches, and so produces real-time rather than retrospective insights. To illustrate the practical implications, a demonstration showed how, by processing around 400 new tweets each second (which is apparently more impressive than it sounds), Cloud Dataflow could chart sentiment towards the Brazilian football team dropping after the dubiously awarded penalty in their first World Cup game against Croatia.
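For readers who want to see the difference between the two approaches, here is a minimal Python sketch. It is not Cloud Dataflow’s API and has nothing to do with how Google built its demo; the function names, the 60-second window and the sentiment scores are all made up for illustration. The batch version waits for the whole data set and returns one figure, while the streaming version updates its average every time a new score arrives.

```python
from collections import deque


def batch_mean(scores):
    """Batch style: wait for the full data set, then compute one figure."""
    return sum(scores) / len(scores)


def rolling_means(events, window_seconds=60):
    """Streaming style: yield an updated average as each event arrives.

    `events` is an iterable of (timestamp_seconds, score) pairs in time
    order. Both the pair layout and the window length are assumptions
    made for this sketch.
    """
    window = deque()
    total = 0.0
    for ts, score in events:
        window.append((ts, score))
        total += score
        # Drop events that have fallen out of the time window.
        while window[0][0] <= ts - window_seconds:
            _, old_score = window.popleft()
            total -= old_score
        yield ts, total / len(window)


# Invented scores: +1.0 is a positive tweet, -1.0 a negative one.
events = [(0, 1.0), (30, 1.0), (70, -1.0), (80, -1.0)]

for ts, mean in rolling_means(events):
    print(ts, round(mean, 2))

print("batch:", batch_mean([score for _, score in events]))
```

With these invented numbers the rolling figure turns negative as soon as the later tweets arrive, whereas the single batch figure averages the swing away. That, in miniature, is the case being made for real-time analysis.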
However (there’s always a ‘however’ in ‘big data’), as if to contradict its own shift away from the existing big data standards, Google Capital then promptly led a new $110m (£64m) funding round for technology vendor MapR, which builds Hadoop systems for companies. That would tend to suggest a commitment to the status quo.
Google’s intentions are, as ever, a mystery to anyone but itself. It could be that the Cloud Dataflow launch is an attempt to phase out open-source Hadoop in favour of something Google can retain greater control over – big data didn’t seem such a big business opportunity when the company let MapReduce out into the world. Or it could simply be that its new offering will be a complement to what already exists, ‘with added real-time’.
It’s at times like this that it becomes apparent why marketers and IT professionals have such a symbiotic relationship today. As a marketer, you’ll know just how valuable it could be to perform automated real-time analysis on the data you use, while your IT colleagues will be the ones to tell you how much it’s going to cost and whether the technology is worth the hype.
Is yours the kind of brand that would benefit from seeing, up to the second, how people are feeling, and from timing your messages to take advantage? If so, you should at least be aware of what’s becoming possible.