Data takes to the clouds

Soaring data volumes, greater analytical needs, declining budgets. David Reed finds out how you can resolve this awkward equation.

Moore’s Law has provided great comfort to IT directors. It states that, in effect, the processing power of computer chips doubles every two years. So the solution to the increasing number of business-critical applications has been relatively simple – provide a new dedicated server.

Even the explosion of data volumes in recent times has not been seen as a problem. Reed’s Law says that data doubles every 18 months. Since hard drive storage has been relatively cheap, the solution to this has been just to line up more hard drives.

But these two forces have reached a crunch point. Business users now want processing-intensive tools applied to storage-heavy data volumes, especially to support analytical data warehouses. Data centres are running out of space, energy and budget.

So what next? The consolidation and virtualisation of servers and disk arrays offers a short-term fix. According to research by IDC, the server market has hit an inflection point: the number of virtual machines shipped exceeded the number of physical servers in 2008, and this year virtual machine shipments are expected to exceed physical hardware by 10 per cent.

Exponential growth in data volumes and downward pressure on budgets are not going away. Something new is required, which is why many organisations are now looking at the possibilities offered by a new generation of low-cost, outsourced data centres and potential migration into a cloud environment.

“I’ve been waiting for this for a decade,” says Songnian Zhou, CEO of Toronto-based Platform Computing. He started his business in 1997 to transfer the grid computing approach developed by the University of Toronto into the commercial world. Since then, the company has been supporting the massive data processing requirements of organisations like computer chip designer ARM and the UK Nuclear Agency. Commercial businesses are just starting to pick up on the proposition, which is now morphing into a cloud approach.

“One mid-sized bank here passes all of its data to us for processing to look at whether the day’s business was good or bad and is now getting a lot better understanding of its results,” says Zhou. Faced with rising data volumes, the bank had been struggling to schedule the processing time needed for this critical business intelligence.

What Platform offers is a cost-effective way to leverage high-volume data for real-time analytics, avoiding the redundancy that IT departments typically build in when provisioning dedicated servers for each application. “It is no coincidence that cloud is being discussed now,” says Zhou.

One of the best-kept secrets in the industry is that Platform Computing provides the infrastructure behind the SAS Institute’s on-demand proposition. Platform’s software will power the new 1,000-server data centre which SAS is building to deliver its business analytics as a service.

Zhou says that previous inhibitions felt by boards about letting their data outside the corporate firewall are rapidly evaporating. “They have been using software-as-a-service for some time. Many already use third parties to screen their emails for viruses, for example,” he says.

As the long-term relationship between SAS and Platform demonstrates, there is nothing new about putting business-critical applications or data into an outsourced environment. A company does not need to be able to touch these assets in order to make use of them.

It is a paradox that cloud computing of this sort is starting to be discussed just as some of these solutions reach maturity. BT has just re-signed with Kognitio to provide a data-warehouse-as-a-service to its pricing department, giving BT Business insight into its customers’ calling behaviours so that it can design propositions to meet their needs. This continues a relationship which was initiated ten years ago.

“That was long before the terms SaaS, DaaS or cloud computing had been coined,” notes Arthur Winn, head of pricing at BT Business.

The scale of the data warehouse is considerable. BT Business provides communications and IT services to around 1 million business customers, whose calling records across 15 months are analysed by the pricing department. “We look at how many calls they have made, the cost, margin and package they are on when we are doing a repricing job. That wealth of data is required if we want to retain and acquire customers,” says Winn.
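
For readers who want a feel for what a repricing job of this kind boils down to, here is a minimal sketch in Python: re-rate historical calls under a candidate tariff and compare the result against what was actually billed. The field names and tariff rates are illustrative assumptions, not BT’s or Kognitio’s actual schema.

```python
# Hypothetical sketch of a call-record repricing comparison.
# Field names and tariff rates are invented for illustration.

from dataclasses import dataclass

@dataclass
class CallRecord:
    customer_id: str
    call_type: str        # e.g. "landline", "mobile", "international"
    duration_mins: float
    billed_amount: float  # what the call cost under the current package

# A candidate tariff: pounds-per-minute by call type (assumed values)
CANDIDATE_TARIFF = {"landline": 0.02, "mobile": 0.08, "international": 0.15}

def reprice(records):
    """Compare current revenue against revenue under the candidate tariff."""
    current = sum(r.billed_amount for r in records)
    proposed = sum(
        CANDIDATE_TARIFF[r.call_type] * r.duration_mins for r in records
    )
    return current, proposed, proposed - current

calls = [
    CallRecord("C001", "mobile", 12.5, 1.25),
    CallRecord("C001", "international", 4.0, 0.80),
    CallRecord("C002", "landline", 30.0, 0.90),
]
current, proposed, delta = reprice(calls)
print(f"current £{current:.2f}, proposed £{proposed:.2f}, delta £{delta:.2f}")
```

Run across 15 months of records for a million customers, the same comparison becomes the storage- and processing-heavy workload that makes an outsourced warehouse attractive.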

Value is driven out of the investment in the service in a variety of ways. “Not needing to use Kognitio’s consultants for more complex queries using SQL is one. We can create Business Objects queries ourselves and interrogate the data. Kognitio has developed an Excel derivative, Pricing Edge, that allows us to look at call repricing options,” says Winn.

Even two years ago, this would have required Kognitio’s consultants to write SQL queries – creating a do-it-yourself tool for BT’s analysts has led to cost savings. At BT Business, pricing sits within the marketing department, underlining its importance in the customer relationship and the critical nature of the analyses being carried out.

There are 20 direct users of the solution, but many more indirect ones who ask Winn’s teams to analyse specific issues on their behalf. “That shows the appetite for analytics. In many organisations, I believe that product managers struggle to get access to data in this way,” he says.

The cost of using a DaaS solution is “not trivial”, in Winn’s words, but it is justified for a FTSE 100 business in a highly competitive marketplace. It also relieves the pressure on in-house IT departments while removing a capital expenditure requirement, without losing any control.

Telcos, mobile networks and ISPs are undoubtedly at the sharp end of the collision between increasing data volumes, heightened analytical demands and reduced budgets. Analytical data warehouses to examine call data records (CDRs) are not cheap, but they are business critical.

Orange was struggling with the fact that its business reports were being generated six to eight weeks late. “That is not very helpful to run the business,” says Alan Cornwell, vice-president, EMEA operations for Netezza, which supplied the analytical data warehouse appliance that resolved the problem.

“When they put it in, they got turnaround on their queries in 90 seconds. That got a ‘wow’ reaction because they didn’t think it was possible,” he says. Orange generates 20 to 30 million CDRs every day and each of those records carries a depth of variables. Processing that for immediate business needs was challenging, let alone storing it for longer-term analysis of trends.

“One of the things they needed to do was reconcile CDRs for billing, but they were often finding they couldn’t. Between 8 and 10 per cent of records required a lot of manual reconciliation to figure out who should be billed and at what price, particularly when calls passed through multiple networks, like foreign calls,” says Cornwell.

That pricing work needed to be done within the monthly billing cycle, yet there was often a residue which could not be reconciled and was therefore never billed. Enabling this analysis generated previously unclaimed revenue.
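
The shape of the problem Cornwell describes can be sketched simply: rate each CDR against a known routing table, and push anything that fails – an unknown route, or a call that hopped across several networks – into a manual queue. This is a toy illustration with invented field names and rates; real CDR formats and rating rules are far richer.

```python
# Hypothetical sketch of CDR billing reconciliation.
# Rate table, field names and values are invented for illustration.

# Assumed rate table keyed by (origin network, destination network)
RATES = {
    ("orange", "orange"): 0.05,
    ("orange", "other-uk"): 0.08,
}

def reconcile(cdrs):
    """Split CDRs into auto-billable records and a manual queue."""
    billed, manual = [], []
    for cdr in cdrs:
        rate = RATES.get((cdr["origin"], cdr["destination"]))
        if rate is None or cdr.get("route_hops", 1) > 1:
            # Unknown route or multi-network call: needs human review
            manual.append(cdr)
        else:
            billed.append({**cdr, "charge": rate * cdr["duration_mins"]})
    return billed, manual

cdrs = [
    {"origin": "orange", "destination": "other-uk", "duration_mins": 3.0},
    {"origin": "orange", "destination": "foreign", "duration_mins": 7.0,
     "route_hops": 3},  # foreign call crossing several networks
]
billed, manual = reconcile(cdrs)
print(len(billed), "auto-billed;", len(manual), "for manual reconciliation")
```

At 20 to 30 million CDRs a day, even an 8 to 10 per cent manual residue is an enormous backlog – which is why automating the analysis recovered revenue that had simply gone unbilled.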

Cornwell says that when Virgin Media deployed a Netezza appliance for revenue assurance and fraud detection, the returns covered the cost inside four months. For a solution that typically uplifts analytical performance between 10- and 100-fold – and even 1,000-fold in some examples – that is a worthwhile investment.

From revenue optimisation, many users are now looking at customer churn where high volumes of data need to be examined in order to spot trends and shifts in behaviour. Cornwell says that social network analysis will be the next big adoption area. “Networks are looking for ‘queen bees’ who may not be high net value customers themselves, but they are huge influencers over what to buy,” he says.
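
The “queen bee” idea can be illustrated with a toy example: build a call graph from CDR caller-callee pairs and rank subscribers by how many distinct contacts they have. The sketch below uses simple degree centrality as the influence measure – an assumption for illustration; production social network analysis on CDRs would weight by call frequency, duration, direction and recency.

```python
# Toy illustration of spotting influencers ("queen bees") in a call graph.
# Degree centrality is the simplest possible proxy for influence.

from collections import defaultdict

def top_influencers(call_pairs, n=3):
    """Rank subscribers by number of distinct contacts."""
    contacts = defaultdict(set)
    for caller, callee in call_pairs:
        contacts[caller].add(callee)
        contacts[callee].add(caller)
    ranked = sorted(contacts, key=lambda s: len(contacts[s]), reverse=True)
    return [(s, len(contacts[s])) for s in ranked[:n]]

calls = [("alice", "bob"), ("alice", "carol"), ("alice", "dave"),
         ("bob", "carol"), ("eve", "alice")]
print(top_influencers(calls))  # alice tops the list with four contacts
```

A subscriber like “alice” may spend little herself, yet sit at the centre of the network – exactly the profile Cornwell says operators want to find and keep.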

Netezza’s proposition is a non-disruptive solution that plugs into existing IT infrastructures and runs industry standard business intelligence tools, such as Business Objects, Microstrategy or Cognos, using familiar ETL tools to find the data needed. “We are not installing any new applications, just an appliance on top of what is already in use,” he says.

You do not have to be running multi-variate analyses on billion-row data sets to start thinking about an outsourced data warehouse. For many marketing departments, the continuing struggle with in-house resources – from IT support to campaign managers – is making new third-party solutions ever more attractive.

One of these is MarketingEdge-X from Acxiom. The combination of its grid computing power in UK data centres with the depth of its data sets for customer data enhancement and prospecting – all for a monthly subscription with no up-front costs – is getting a lot of attention.

“One prospect said it is a no-brainer,” says marketing strategy consultant Shelagh Regester. “They’ve got a big project planned, but they need something to help them as soon as possible.”

The company has signed up for ME-X with agreed service levels, break points with low decommissioning costs and one eye on de-risking the decision for the individual involved. “Acxiom is confident that it can give them something that will lead to a long relationship, so it is using this pricing model,” says Regester.

Elsewhere, Web-based software-as-a-service is creating breakthrough solutions like MyFeeds from Postcode Anywhere. The company had already created an SOA environment at its launch nine years ago. “We were doing PAF as a service based on that, putting large commercial data sets in the cloud,” says IT director James Turner.

The number of data sets being delivered in this way has expanded, as have client needs. “The idea of MyFeeds is to do the same thing with data sources that may be on the client’s server behind their firewall as we do with data sources on our own server estate and present it all via a front-end Web service,” says Turner.

That means clients can effectively “mash up” their own data with commercial sets, for matching, screening, enhancement or anything else, without having to build a dedicated data mart. The fourth generation of Postcode Anywhere’s platform makes building the ETL routines straightforward.
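
The mash-up pattern Turner describes looks something like the sketch below: a record held behind the client’s firewall is sent to a commercial feed exposed as a web service, and the reply is merged back in. The endpoint URL, parameters and response fields are invented for illustration – this is not Postcode Anywhere’s actual MyFeeds API.

```python
# Hypothetical sketch of enriching local records via a commercial web feed.
# The endpoint, payload and response format are assumptions, not a real API.

import json
import urllib.request

FEED_URL = "https://example.com/feeds/address-enhance"  # placeholder endpoint

def enhance(record):
    """Send one local record to the (hypothetical) feed and merge the reply."""
    payload = json.dumps({"postcode": record["postcode"]}).encode()
    req = urllib.request.Request(
        FEED_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        enrichment = json.load(resp)  # e.g. {"region": "...", "urban": true}
    return {**record, **enrichment}  # local fields plus commercial fields

local_customers = [{"id": 1, "postcode": "WR1 1AA"}]
# enhanced = [enhance(r) for r in local_customers]  # needs a live endpoint
```

The appeal is that the client’s data never has to be bulk-loaded into a third-party mart: it is enriched record by record, with the commercial data staying on the provider’s estate.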

While not technically a cloud – Turner is wary of migrating his company into environments that can sometimes lay claim to the IP rights of anything they hold – MyFeeds is a prime example of the latest generation of data solutions. You can see them and work with them, you just can’t touch them.