Attribution is broken, here’s how to fix it

Marketing attribution nearly always excludes inconvenient data or pointlessly analyses events you can’t influence, but with long-term goals and the right audience insights it could work much better.

Amid all of the current brand-safety brouhaha, it might be hard to remember that it was Facebook who ushered in the current season of scepticism by admitting that their metrics significantly inflated the reported reach of media on their platform.

The response of Facebook’s EMEA head of marketing science Tony Evans was curiously unhelpful, suggesting that clients are more concerned with measurement than with metrics. Well, that’s alright then. Only in marketing could “measurement” be branded as distinct from “metrics”. Given that “metrics” is defined as “a system or standard of measurement”, this is really just arguing semantics, but never mind – language is the least of our issues.

The objective of attribution must be to increase the contribution to sales from digital media through improved planning and buying. It should produce demonstrable improvements in performance and should not just be reportage for its own sake.

In reality, however, many of the available solutions are stalked by at least one of the four horsemen of the attribution apocalypse: incompleteness, absurdity, futility, and randomness.

1. Incompleteness

A flawed assumption undermines many attribution solutions from the start: that digital media is responsible for a certain number of conversions and that the task in hand is simply to divide the credit among the various parties involved. This is a classic example of the McNamara fallacy, named after the US secretary of defence in the 1960s, Robert McNamara, and described by social scientist Daniel Yankelovich:

“The first step is to measure whatever can be easily measured. This is OK as far as it goes. The second step is to disregard that which can’t be easily measured or to give it an arbitrary quantitative value. This is artificial and misleading. The third step is to presume that what can’t be measured easily really isn’t important. This is blindness. The fourth step is to say that what can’t be easily measured really doesn’t exist. This is suicide.”

The data left in the wake of a digital campaign may well be vast but it is also hugely limited in scope. Few attribution solutions consider economic or environmental variables, pricing information, distribution, or even offline media. To understand an individual’s actions requires much more knowledge about them than is available through just their online behaviour. Because of the near-impossibility of obtaining this information it is easier to regard it as non-existent.

As a result, any conclusions are questionable. The best that can be said for this type of analysis is that it is better than last-click attribution.

Problems also arise when some or all of the non-converters are simply excluded from the analysis to cut down the data to be processed. Given that non-converters typically represent 99% of the total exposed population, this is information loss on a mammoth scale and is a textbook example of ‘survivorship bias’, where we focus only on results that pass some arbitrary selection process.

It is to be hoped that this will be less of a problem in future as data storage and processing become cheaper and faster, but it is a serious flaw in many current solutions.
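To see what is lost when non-converters are thrown away, consider a minimal sketch with two hypothetical publishers. Every figure below is invented for illustration; the only thing borrowed from the argument above is that converters are a tiny fraction (here 0.6%) of the exposed population.

```python
# Hypothetical exposure logs for two publishers (all figures invented).
exposed = {"publisher_a": 90_000, "publisher_b": 10_000}
converted = {"publisher_a": 450, "publisher_b": 150}

total_conversions = sum(converted.values())

# View 1: converters only. The best available statistic is each
# publisher's share of the conversions that happened.
share_of_conversions = {
    name: converted[name] / total_conversions for name in converted
}

# View 2: the full exposed population, non-converters included.
# Now a conversion rate per publisher can be calculated.
conversion_rate = {name: converted[name] / exposed[name] for name in exposed}

for name in exposed:
    print(
        f"{name}: {share_of_conversions[name]:.0%} of conversions, "
        f"{conversion_rate[name]:.1%} conversion rate"
    )
```

In the converters-only view, publisher A appears to deserve three-quarters of the credit. With the non-converters restored, publisher B turns out to convert at three times the rate; A simply reached nine times as many people.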

2. Absurdity

In anthropology, the term ‘magical thinking’ refers to the association made between rituals or sacrifices and real-world events despite there being no conceivable physical mechanism which connects the two. It seems like an idea that belongs to a more primitive age yet it is quite apt for describing the belief that exposing a person to one pixel on a screen for one second will affect their brand awareness.

Without knowing the number of people who can possibly have been influenced, it is impossible to judge whether an estimate of advertising effectiveness is realistic or not. Contrary to Facebook’s assertion, metrics cannot be arbitrarily separated from measurement. To be reliable, the only events that should be reported are those that could conceivably have produced a result. Viewability matters – how can it not? – and to pretend otherwise is either self-delusion or an attempt to delude others.

It is sometimes argued that it’s not possible to buy sufficient inventory unless the viewability constraint is relaxed. This is the equivalent of being unable to buy enough champagne but, to make up the volume, being offered dog urine for the same price. If the desired product is sold out, either take your money elsewhere or go further upmarket.

3. Futility

One of the most pernicious concepts in digital measurement is the user journey. Its origins seem to lie with the marketing funnel, the mystical means whereby consumers progress from a state of ignorance through to purchase by way of awareness and consideration. This can be a useful vehicle for framing a communications task across a population but it’s not really something that can be sensibly applied to an individual.

Unfortunately, digital data is seductive and the unwary can be lured onto the rocks of believing that tracking an individual (or, at least, a cookie) over a long period while recording each brand exposure and activity is deeply meaningful.

Imagine getting a report stating that the ideal consumer path for your product had been identified. The person had to go into Tesco, browse the appropriate aisle but not purchase, walk out of the store, see an advert on a digital screen, go back into Tesco a week later, leave, go to Sainsbury’s, make an enquiry of a shop assistant, leave, go home, see a TV advert and finally go back to Tesco the following week. Of the 18 people who followed this path, two ended up buying the product, a much higher conversion rate than people who followed other paths.

Besides applying for a restraining order, you might ask what you are expected to do with this information – frog-march people from store to billboard to store again?

The critical fault is that you are powerless to make people follow your ‘ideal’ path. You can do things to make a particular path more likely, such as retargeting, but in doing so you make many other paths more likely as well. There are hundreds of thousands of possible paths, some taken, some not. The ones that are travelled are effectively chosen at random, subject to any number of other influences. Analysis of this type produces meaningless results that cannot be effectively acted upon. It is, in a word, pointless.
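Some rough arithmetic makes the point. The sketch below counts how many distinct journeys exist under an assumed set-up (six kinds of touchpoint and journeys of up to seven steps, both numbers invented for illustration) and then asks how much the "two out of 18" from the Tesco example really tells us, using a standard Wilson score interval. It is an illustration of the uncertainty, not a recommended method of path analysis.

```python
import math


def count_paths(touchpoints, max_steps):
    """Number of ordered sequences of 1..max_steps touchpoints, repeats allowed."""
    return sum(touchpoints ** steps for steps in range(1, max_steps + 1))


def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a conversion rate."""
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials ** 2)) / denom
    return centre - half, centre + half


# Assumed for illustration: 6 kinds of touchpoint, journeys of up to 7 steps.
possible_paths = count_paths(touchpoints=6, max_steps=7)

# From the example above: 18 people followed the path, 2 bought.
low, high = wilson_interval(successes=2, trials=18)

print(f"{possible_paths:,} possible paths")
print(f"'Ideal' path conversion rate: somewhere between {low:.0%} and {high:.0%}")
```

Even this modest set-up yields hundreds of thousands of possible paths, and the plausible conversion rate for the "ideal" one runs from roughly 3% to 33%. With that many candidate paths and so few people on each, one of them was always going to look impressive.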

4. Randomness

People are surprisingly predictable in terms of their individual behaviour and, like gas molecules, even more so en masse. In the short term, there is no significant variance in an individual’s web-browsing habits, their likes or dislikes, their shopping practices, or their mode of visual perception.

So why do so many measurement solutions give wildly varying results each time they are used? How can it be that a publisher has just the right sort of people on their site one month but not the next? Why does one demographic group change from being highly enthusiastic to bafflingly unresponsive in the space of a week? Why would the optimum size or placement of the same advertisement be different from yesterday?

There are two possibilities: either people are far more variable in their behaviour than anyone has ever noticed, or the measurement process is introducing significant randomness. The latter is far more likely.

Attribution methods based on regression analysis of individual conversions are especially susceptible to randomness. Typically the various combinations of publisher, creative, format, size, etc are split out into separate variables, which are then tested in turn against a conversion variable of some nature, eg clicks or purchases. The most statistically significant variables are identified and then given partial credit for the conversion.

The problem is that if you create and test enough variables then some will always correlate out of sheer chance. Statisticians sometimes refer to this practice as data dredging. Without a specific hypothesis underpinning the finding, any correlation is likely to be spurious. Because these correlations are unlikely to recur, the subsequent attribution output will be completely different. Results created in this way are no better than employing a monkey to throw darts at a media plan.
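Data dredging is easy to demonstrate with a small simulation. In the sketch below, every "variable" (think of one publisher, creative and format combination) is a coin flip that by construction has no effect on conversion, yet a simple two-proportion z-test still flags a handful as significant. The user count, the number of variables, the 5% base conversion rate and the 0.05 threshold are all assumptions chosen for illustration, and the z-test is a stand-in for whatever regression a real attribution tool might run.

```python
import math
import random


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rates (z-test)."""
    if n_a == 0 or n_b == 0:
        return 1.0
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (conv_a / n_a - conv_b / n_b) / se
    # Two-sided tail probability of a standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def dredge(seed, n_users=2000, n_variables=200, base_rate=0.05, alpha=0.05):
    """Return the indices of variables that look 'significant' purely by chance."""
    rng = random.Random(seed)
    converted = [rng.random() < base_rate for _ in range(n_users)]
    total_converted = sum(converted)
    significant = set()
    for var in range(n_variables):
        # Exposure is a coin flip, independent of conversion by construction.
        exposed = [rng.random() < 0.5 for _ in range(n_users)]
        n_exposed = sum(exposed)
        conv_exposed = sum(1 for c, e in zip(converted, exposed) if c and e)
        conv_unexposed = total_converted - conv_exposed
        p = two_proportion_p_value(
            conv_exposed, n_exposed, conv_unexposed, n_users - n_exposed
        )
        if p < alpha:
            significant.add(var)
    return significant


# Two "reporting periods": same variables, fresh random data each time.
first_run = dredge(seed=1)
second_run = dredge(seed=2)
print(f"Period 1: {len(first_run)} of 200 useless variables look significant")
print(f"Period 2: {len(second_run)} of 200 useless variables look significant")
print(f"Flagged in both periods: {len(first_run & second_run)}")
```

With a 0.05 threshold, around one in 20 of the useless variables should pass the test in each period, and the two lists should barely overlap. That is exactly the pattern of "winners" that change from one report to the next.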

Variability is to be expected in any complex system. There are many reasons why on a given day a bid price might be higher, or a site might get an unusual spike in traffic. But the point is that these are random events – they can’t be predicted or reliably reproduced. Attaching significance to these random events and reallocating budget on that basis might be not just ineffective but actually detrimental.

Is there a solution?

So, how can clients and agencies improve their digital media measurement?

The first step is to acknowledge the limitations of their current methodology. Attribution can still be useful provided that its limitations, and the implications of those limitations, are well understood.

Next, most brands should forget about individual-level attribution and focus on the aggregate response. Chasing individuals around the internet produces more data than can be managed, yet not enough to solve the problem. Ten million impressions should have an effect on total sales or other conversion metrics, which can be quantified through marketing mix modelling. This should control for environmental effects and take offline media into account.
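As a rough illustration of the aggregate approach, the sketch below fits a toy linear model of weekly sales against digital impressions, an offline channel and an environmental control. Everything in it is an assumption: the data is synthetic, the "true" coefficients are invented, and a real marketing mix model would typically also handle carry-over effects, diminishing returns, pricing, distribution and seasonality. It uses plain ordinary least squares from the standard library so that it runs anywhere.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic weekly data: every number here is invented for illustration.
rng = random.Random(42)
TRUE = {"base": 1000.0, "digital": 30.0, "tv": 2.0, "temp": 5.0}
rows, sales = [], []
for week in range(104):
    digital_m = rng.uniform(0, 10)    # digital impressions, in millions
    tv_grps = rng.uniform(0, 200)     # offline media: TV rating points
    temperature = rng.uniform(0, 30)  # environmental control, degrees C
    noise = rng.gauss(0, 40)          # everything the model cannot see
    rows.append([1.0, digital_m, tv_grps, temperature])
    sales.append(
        TRUE["base"]
        + TRUE["digital"] * digital_m
        + TRUE["tv"] * tv_grps
        + TRUE["temp"] * temperature
        + noise
    )

base, digital, tv, temp = ols(rows, sales)
sales_per_10m_impressions = 10 * digital
print(f"Estimated extra sales per 10m impressions: {sales_per_10m_impressions:.0f}")
```

Because the model looks at total weekly sales rather than individual journeys, it needs only a couple of years of aggregate data, and the offline and environmental terms stop digital media from taking credit for a sunny week or a TV burst. On this synthetic data the estimate should land close to the 300 sales per 10 million impressions that was built in.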

Third, define your audience based on fundamentals – have confidence in your understanding of your customers. If you know you don’t understand them, then go and do some research and come up with some actual insights rather than expecting an algorithm to do the graft for you. Also, remember that your user base can usually be defined by just a handful of properties. Any more than that and your segments will either be too small, too difficult to reach or too expensive.

Fourth, choose quality locations where you know your audience spend their online time. Don’t just rely on Google and Facebook; talk to publishers with an identifiable and committed user base. In the US, JP Morgan Chase recently started manually preapproving sites, reducing the total number by almost 99% while maintaining performance.

Finally, set a clear long-term goal that corresponds to value created for the business. Know what success and failure look like and judge accordingly.

This is what we should expect from agencies. Finding clear insights based on analysis and understanding of consumer behaviour, developing a plan, implementing it, evaluating it against predicted results and repeating the process isn’t rocket science. Some might even call it marketing.

Andrew Willshire is the founder of analytics company Diametrical and formerly global director of advanced analytics at Maxus.