Use Case: Data Anonymisation

A Mass Data Use Case

Cable & Wireless leverages the cloud to drastically reduce the cost of data analytics using DigitalRoute’s MediationZone

Background

Cloud computing can deliver disruptive changes to the field of business intelligence. Leveraging open-source technologies such as Hadoop and renting space on servers can provide powerful data mining functionality at a fraction of the cost of traditional approaches. To better understand this emerging field, Cable & Wireless (C&W), in partnership with DigitalRoute, created a proof of concept to identify the benefits of cloud computing for service providers in the telecommunications market.

It was known at the outset that the cloud offered access to processing power at an unprecedented scale and at radically different cost levels. For telecommunications companies, this represents an attractive opportunity to reduce costs and simultaneously gain much better insight into what customers are doing on their networks, an urgent need considering the large volumes of data now being generated by new mobile devices.

However, the cloud's obvious advantages come with issues of their own. For one, with so much usage detail available, security and privacy have become concerns. There is thus a fine balance between offering relevant packages and breaching someone’s trust.


The MediationZone Solution

To enable C&W to use the Amazon cloud, MediationZone tokenized all user-sensitive data, replacing phone numbers with tokens. It also filtered, formatted and compressed the data, reducing its original size by a factor of about 58. This was done by removing all fields that were not relevant and applying compression (gzip in this case), which on average achieved a 90% reduction.
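
As an illustration of the filter, format and compress step, here is a minimal Python sketch. It is not MediationZone's implementation: the field names in KEEP_FIELDS and the CSV output format are assumptions made for the example, since the case study does not describe the actual record layout.

```python
import csv
import gzip
import io

# Hypothetical field names; the case study does not list the real record layout.
KEEP_FIELDS = ["caller", "callee", "start_time", "duration_s"]


def filter_and_compress(records, keep_fields=KEEP_FIELDS):
    """Drop fields not in keep_fields, format the rest as CSV, then gzip it."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=keep_fields,
        extrasaction="ignore",  # silently drop any field not listed above
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return gzip.compress(buf.getvalue().encode("utf-8"))


def decompress_rows(blob):
    """Reverse the compression and parse the CSV back into dictionaries."""
    text = gzip.decompress(blob).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))
```

If the 90% figure refers to gzip alone, that step accounts for a factor of 10 and field removal for the remaining factor of roughly 5.8 (58 / 10). The case study does not give this breakdown, so treat it as a reading of the numbers rather than a reported result.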

The tokenization process generated a key/value repository, which is maintained locally in a locked-down environment. The process can be reversed only with this information. Tokenization makes data far less sensitive from a privacy perspective without limiting the possible benefits for analytics.
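
The idea of a reversible key/value repository can be sketched in a few lines of Python. The TokenVault class below is hypothetical and kept in memory for brevity; the case study does not say how MediationZone generates or stores its tokens, and a real repository would live in secured, persistent storage.

```python
import secrets


class TokenVault:
    """In-memory stand-in for the locally held key/value repository."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value):
        """Return a random token for value, reusing any token already issued."""
        token = self._token_by_value.get(value)
        if token is None:
            token = secrets.token_hex(8)
            # Regenerate in the unlikely event of a collision.
            while token in self._value_by_token:
                token = secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return token

    def detokenize(self, token):
        """Look up the original value; only possible with access to the vault."""
        return self._value_by_token[token]
```

Because the same phone number always maps to the same token, per-customer analysis still works on the tokenized data, while the random tokens reveal nothing about the numbers without the vault.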

A native MediationZone Amazon connector was used to upload the data directly to Amazon S3 storage. From there, C&W used Amazon’s Elastic MapReduce, which implements Hadoop, to carry out the data analysis. One of the beauties of AWS is that it comes with Hadoop pre-configured, so starting a job, even with hundreds of servers, involves just a few clicks.
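
To give a feel for the kind of analysis a MapReduce job performs, the following Python sketch runs the map and reduce phases locally on tokenized records. It is only an illustration: the field names caller_token and duration_s are assumptions, the job C&W actually ran is not described in the case study, and a real Hadoop job would distribute these phases across many servers.

```python
from collections import defaultdict


def map_record(record):
    """Map phase: emit one (token, call duration) pair per usage record."""
    yield record["caller_token"], int(record["duration_s"])


def reduce_values(key, values):
    """Reduce phase: total the durations collected for one token."""
    return key, sum(values)


def run_job(records):
    """Group mapped pairs by key, then reduce each group (single process)."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_record(record):
            grouped[key].append(value)
    return dict(reduce_values(key, values) for key, values in grouped.items())
```

The output is keyed by token, so the analysis never needs to see a real phone number.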

After data processing was completed, MediationZone collected the results from Amazon and de-tokenized the customer information to enable customer-specific marketing campaigns.


The Result

After various test runs, uploads and other steps, the total bill for hardware, including the cost of data transfer, storage and processing, stood at $18! In comparison, service providers can easily spend $25,000 or more on just a smallish Business Intelligence server once hardware, disk, software licenses and running resources have been taken into account. And of course most hardware sits idle the majority of the time, which means wasted energy.

So what does the DigitalRoute solution mean for businesses? The first and most important conclusion is that the availability of cost-efficient technologies can enable even the smallest company to gain access to large-scale data mining and intelligence functionalities.

For telecommunications companies, which typically have regulatory demands related to storing and retrieving large amounts of data, moving to ‘NoSQL’ technologies (as these are referred to) can offer a very compelling, cost-effective alternative.

Tokenization of data allows these new technologies to be used more freely. For example, companies in highly regulated industries, such as telecoms, finance and insurance, can use tokenized data to see trends, find big spenders and analyze behaviors in a way that they could not without tokenization. Should they at any point need to reverse the process, the data can be de-tokenized by accessing the local, secure key/value pairs.

DigitalRoute prides itself on knowing data and knowing how to help you make use of it.