Weed out your doubled-up data

Take a leaf out of the botanists’ book the next time you are cleaning up your database. In a sweeping review of how plants are classified worldwide, more than 600,000 duplicate species names have been deleted from the official directory of flora. The purpose of this weeding-out exercise has been to make it easier for experts to identify plants correctly.

Back in 2002, the members of the Convention on Biological Diversity made this rationalisation of species’ names their number one priority. It has taken three years to trim the one million listings down to 400,000.

For researchers looking into plants that could be used for food or medicines, the existing database was less than helpful. One example was a relative of the basil plant, Plectranthus. Searching for it using only the most commonly used name would miss 80 per cent of the information about that plant family.

Sound familiar? Data redundancy affects every database, whatever it is used for. In the marketing world, there is a near-obsession with retaining every record and variable that has been collected, regardless of its utility and completeness. Data deletion policies are almost never written or enacted.

Yet a proper spring clean can yield significant benefits. For one thing, it makes marketing plans more accurate. Assume you have one million customers when the reality is half that number, and your targets become nearly impossible to hit. You are also wasting a lot of resource going after the same people twice, and they may become less likely to purchase because of this duplication.
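
As a rough illustration of what such a spring clean involves, the sketch below collapses duplicate customer records on a normalised email address and reports how far the headline count shrinks. The field names, the sample records and the choice of email as the matching key are all assumptions made for the example; real matching usually needs fuzzier rules on names and addresses.

```python
def normalise_email(email):
    """Lower-case and trim an email so trivially different spellings match."""
    return email.strip().lower()


def deduplicate(records):
    """Keep the first record seen for each normalised email (assumed match key)."""
    unique = {}
    for record in records:
        key = normalise_email(record["email"])
        # setdefault only stores the record if this key has not been seen yet
        unique.setdefault(key, record)
    return list(unique.values())


# Hypothetical sample data, for illustration only.
customers = [
    {"name": "A. Smith", "email": "a.smith@example.com"},
    {"name": "Alice Smith", "email": " A.Smith@Example.com "},
    {"name": "B. Jones", "email": "b.jones@example.com"},
    {"name": "Bob Jones", "email": "B.JONES@example.com"},
]

cleaned = deduplicate(customers)
duplicate_rate = 1 - len(cleaned) / len(customers)
print(f"{len(customers)} records, {len(cleaned)} unique customers, "
      f"{duplicate_rate:.0%} duplicates")
```

On this sample, four records net down to two customers. Keeping the first record seen is the simplest possible survivorship rule; in practice you would decide which version of each customer to keep, or merge the fields.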

Regulatory strictures on maintaining files that cannot easily be matched are one reason for this hesitancy over deduplication. Since so many database managers, and much of the technology they use, originated within financial services companies, that cautious culture has become the standard.

It takes a very bold company to say that it is going to nett down its customer base. City analysts might not take too kindly to it and could mark the share price down. That fear alone is enough to stop many data practitioners from doing what they know they should.

But as the shake-out of over-inflated asset prices comes to its tail end, the customer database may prove to be one place where an exaggerated picture of the business is still being painted. Just as botanists accepted that six out of ten names in their directory described plants already covered by the other four, data practitioners need to check that their databases are firmly rooted in reality.
