The scientific method that will prick your ‘big data’ bubble

I wish I’d seen Sir John Hegarty speak at the Advertising Week Europe event last month. The topic was ‘big data’, and the celebrated elder statesman of advertising creativity didn’t hold back in his attack on the industry’s latest technological wonder weapon.

As Marketing Week reported, Hegarty warned marketers that by focusing too much on data and new technology, they risked not seeing what is actually going on around them: “Supermarkets have an incredible amount of data coming in to them and they didn’t realise they were flogging horsemeat to people.”

He continued: “I think there will be a huge backlash and people will say ‘That’s not the world I want to live in’. To brands that say ‘I understand you’, I say ‘Fuck off, you don’t understand me. Mind your own business. I don’t want to be understood by you’.”

It must have been a bravura performance, a little like Peter Finch’s possessed anchorman in Sidney Lumet’s Network. And judging by the column inches it received in the industry press, it struck something of a chord. I’m not surprised.

As someone who’s been working in and around data since God was a junior account executive, I’m mildly amused by the communications business’s sudden interest in big data.

It reminds me of the time when marketers and their agencies used to show they ‘got’ digital by supporting every TV campaign with a dedicated microsite and putting a ‘making of’ film on YouTube.

However, amusement turns to mild irritation when I hear some of the claims that are being made for big data by people who should really know better. Some academics have even gone so far as to say that big data spells ‘the end of theory’, a fatuous assertion probably last made by Brezhnev’s chief economic adviser.

Others seem to think that all you have to do is boil up all your transactional, behavioural and social data together in some kind of computational pressure cooker and somehow amazing marketing truths will spontaneously emerge.

This, of course, is very good news for people who make computational pressure cookers, namely the big hardware, systems and software consultancies. According to The Economist, the global big data industry is worth $100bn and is growing twice as fast as the software business as a whole.

Before you fall prey to the irresistible urge to join this analytical arms race, I’d like you to pause and consider just three things.

The first comes from the world of science. Not marketing science, but the real kind. Astronomers, geneticists and particle physicists are faced with the challenge of interpreting big data all the time.

For example, what do the physicists working on the Large Hadron Collider do with the vast amounts of data it produces every day? They do what researchers and direct marketers have been doing for years. They take small but representative samples of it to study. Because they know they don’t have to eat the whole elephant to know the meat is tough.

Get your sampling techniques right and you should be able to do big data analytics without using so much computing power that the lights go dim in the rest of your postcode area.
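To make the point concrete, here is a minimal Python sketch of the principle: estimate a statistic from a modest random sample instead of scanning every record. The event log and its basket_value field are invented for the illustration, and a real study would stratify the sample by store, segment or time period rather than draw uniformly.

```python
import random

random.seed(42)  # reproducible illustration

# Hypothetical event log: one record per supermarket transaction.
# Field name and spend distribution are invented for this sketch.
population = [{"basket_value": random.gauss(25, 8)} for _ in range(1_000_000)]

# A 1% uniform random sample stands in for the full scan.
sample = random.sample(population, k=10_000)

estimate = sum(r["basket_value"] for r in sample) / len(sample)
truth = sum(r["basket_value"] for r in population) / len(population)
print(f"Sample estimate: {estimate:.2f}  Full-scan answer: {truth:.2f}")
```

The two numbers land within pennies of each other, for one per cent of the computing bill.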

But what should you be analysing, exactly? Listen to some big data evangelists and they’ll tell you the answer is everything. But half a millennium of the scientific method disagrees with them. Which brings me to my second point.

If you really want to do this science thing right, you need to start with a problem and a hypothesis. Which is just a posh word for ‘hunch’.

Having hunches is part and parcel of using data properly. Big data is a hard enough haystack in which to go looking for a needle. Having no clue about what a needle might look like when you find it makes the task utterly impossible.

The ultimate test the scientific method sets for a hypothesis is repeatability. In the world of marketing, it’s accepted wisdom that past behaviour is the most reliable predictor of a consumer’s future actions. But that assumes every other variable in the experiment stays the same. And our real world of irrational markets is rarely that co-operative.
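For the analytically inclined, a toy sketch of the discipline this implies: state the hunch up front, measure it on one sample, then see whether the finding repeats on a holdout you kept out of the first look. The voucher scenario and all the numbers below are invented.

```python
import random
import statistics

random.seed(7)  # reproducible illustration

# Invented data: annual spend for customers who redeemed a voucher
# versus those who did not. The hunch: redeemers spend more.
redeemers = [random.gauss(32, 10) for _ in range(10_000)]
others = [random.gauss(30, 10) for _ in range(10_000)]

def spend_gap(a, b):
    """Difference in mean spend between two customer groups."""
    return statistics.mean(a) - statistics.mean(b)

# Measure the effect on one sample...
gap_first = spend_gap(redeemers[:5000], others[:5000])
# ...then check it repeats on the holdout the first look never touched.
gap_holdout = spend_gap(redeemers[5000:], others[5000:])

print(f"Gap on first sample: {gap_first:.2f}; on holdout: {gap_holdout:.2f}")
```

If the gap shows up only on the first sample and vanishes on the holdout, the hunch has failed the repeatability test and goes back on the shelf.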

Enough philosophy. For the third of my three points, I’d like to return to the realm of practical business, or at least the pages of the Harvard Business Review. Last year it ran a survey of 5,000 employees in 22 global companies. The aim was to assess the ability of global businesses to harness data insight.

The results were worrying. Only 38 per cent of employees surveyed were assessed to have the skills and the temperament to use data effectively. The rest relied on personal judgement or, just as alarmingly, fell into a segment labelled ‘unquestioning empiricists’. Interestingly, the functions staffed by the lucky 38 per cent were determined to be 24 per cent more effective across a range of metrics, including market share growth.

The other findings were equally revealing. Just as computers were once maintained exclusively by a priestly caste of white-coated acolytes, so data analysis is now the preserve of its own clannish and highly introspective elite.

The disturbing result is that in this open-system, wiki-enabled world of ubiquitous data, only 44 per cent of workers claimed to know where to find the information they needed for their daily work.

Perhaps Hegarty’s supermarket meat inspector was one of the 56 per cent.
