Companies with usage-based billing models often underestimate how much revenue they are leaking because of dirty data. So how can companies turn off the tap on revenue leakage?
In this episode, host Behdad Banian is joined again by Jonas Wallenius, who shares another tale from his Library of Pain. They look at the case of a well-established company with a serious case of revenue leakage and explore how it eventually gained revenue assurance.
The last time I was here, we spoke about growth inhibitors. But we also have profit killers, hurting your bottom line, one of them being revenue leakage. We also have transformation and optimization blockers, which are more about saving time and being more efficient.
So we had a client who was offering wifi connectivity to travelers. It could be on boats, aircraft, trains, buses, and others. It was a complex setup: they had satellite connectivity, multiple countries, multiple vendors, generations of hardware, and other elements.
They had to track the customer usage, both to bill the customer and to pay these partners. And this was not a small company either: they had around one billion USD in annual revenue.
Usage data typically came from log files from wifi terminals, satellite operators, telco operators, and vehicles. They had around 25 different log files and thousands of sources to collect from. Some of this data had to go through multiple networks and firewalls, and some was stored along the way. Some of the collection was automated, and some was done manually. Sometimes network configurations changed and what was coming through just stopped.
They had some shell scripts that they ran periodically to fetch this data and store it in the database.
And of course, they had an Excel Guy who took all of these logs and converted them into CSV files, and inserted them into multiple databases. There were even some cases where people mailed the usage data to the Excel Guy.
In the name of redundancy, this company had three parallel usage data feeds, which never produced the same usage data on any given day.
Unfortunately, no. This was just the usage data collection problem. After that, they had the bill run. At the end of the month, someone, let’s say Fiona from Finance, had to produce an invoice. Fiona would send an email asking an SQL guy to run his scripts and produce aggregate usage data for billing.
The SQL guy ran scripts across multiple databases, and he had to fiddle with his scripts since the usage data came from different sources. So you never ran the exact same scripts two months in a row.
To complicate this further, they had custom deals with different terms for almost every customer.
So the SQL Guy would create Excel sheets for every customer, zip them all up, and mail them back to Fiona from Finance, who would put them in a folder to be picked up by the billing system.
We started with the basics: connect and collect, getting the usage data under control. Fortunately, most of what they had was possible to automate. We could automatically connect to these sources, fetch the data, and make sure sketchy connections didn’t affect data quality.
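One common way to keep a sketchy connection from leaving holes in the data is to retry the fetch with an increasing delay and fail loudly if the source stays unreachable. The episode does not say how this was actually done, so the following is only a minimal sketch of that idea. The function name `collect_with_retry`, its parameters, and the `fetch` callable are all hypothetical.

```python
import time


def collect_with_retry(fetch, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out.

    fetch is a hypothetical zero-argument callable that returns the raw
    usage data from one source and raises ConnectionError or TimeoutError
    when the link is down. The delay doubles after every failed attempt.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except (ConnectionError, TimeoutError) as exc:
            last_error = exc
            # Wait before the next try, but not after the final one.
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # Surface the failure instead of silently skipping the source.
    raise RuntimeError(
        f"source unreachable after {attempts} attempts"
    ) from last_error
```

Raising at the end, rather than returning an empty result, means a source that stops delivering is noticed on the day it happens and not at the bill run.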
Then we turned to the three data feeds, which were now better, with automated data collection, but were not perfect. So we sat down with them to figure out how they chose which feed to use on a given day, which turned out to be a combination of fairly simple rules and some heuristics.
We built an automated data quality validator that showed which data feed was the best. And soon we could get the data quality to acceptable levels.
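To give a feel for what "fairly simple rules" can look like in code, here is a small sketch that scores each feed for a given day and picks the highest. The two rules used here (share of expected sources that reported, share of records that passed validation) are assumptions made for illustration, not the client's actual rules, and the names `FeedSnapshot`, `quality_score`, and `best_feed` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FeedSnapshot:
    """One feed's figures for a single day (hypothetical structure)."""
    name: str
    sources_expected: int
    sources_reporting: int
    records_total: int
    records_invalid: int


def quality_score(feed: FeedSnapshot) -> float:
    """Return a score between 0 and 1; higher means a better feed."""
    # A feed with nothing expected or nothing delivered cannot be trusted.
    if feed.sources_expected == 0 or feed.records_total == 0:
        return 0.0
    coverage = feed.sources_reporting / feed.sources_expected
    validity = 1 - feed.records_invalid / feed.records_total
    return coverage * validity


def best_feed(feeds):
    """Pick the feed with the highest score for the day."""
    return max(feeds, key=quality_score)
```

Multiplying the two ratios means a feed has to do well on both counts: full coverage with many broken records scores low, and so does clean data from only a handful of sources.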
Then we went on to our usual work. By correlating, validating, and cleaning the data, we reduced the 25 different sources into one single format, and we made sure that broken or corrupt records would be stored for manual reconciliation.
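The pattern described here, many input formats mapped to one record shape with bad records set aside instead of dropped, can be sketched roughly as follows. Only two made-up source formats are shown. The field names (`mac`, `ts`, `terminal_id`, `kb`, `time`), the parser functions, and the output layout are all assumptions for illustration and do not reflect the client's real log formats.

```python
from datetime import datetime, timezone


def parse_terminal(raw):
    """Hypothetical wifi terminal log: bytes plus an epoch timestamp."""
    return {
        "device_id": raw["mac"].lower(),
        "bytes": int(raw["bytes"]),
        "timestamp": datetime.fromtimestamp(int(raw["ts"]), tz=timezone.utc),
    }


def parse_satellite(raw):
    """Hypothetical satellite operator log: kilobytes plus an ISO timestamp."""
    return {
        "device_id": raw["terminal_id"].lower(),
        "bytes": int(raw["kb"]) * 1024,
        "timestamp": datetime.fromisoformat(raw["time"]),
    }


PARSERS = {"terminal": parse_terminal, "satellite": parse_satellite}


def normalize(records):
    """Split (source_type, raw) pairs into clean and quarantined lists."""
    clean, quarantined = [], []
    for source_type, raw in records:
        try:
            record = PARSERS[source_type](raw)
            if record["bytes"] < 0:
                raise ValueError("negative usage")
            record["source"] = source_type
            clean.append(record)
        except (KeyError, ValueError, TypeError) as exc:
            # Keep the original record and the reason for manual reconciliation.
            quarantined.append(
                {"source": source_type, "raw": raw, "reason": repr(exc)}
            )
    return clean, quarantined
```

Keeping the untouched raw record next to the reason it failed lets someone fix and replay it later, so usage is not silently lost.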
And then finally, this high-quality data was pushed into their billing system automatically.