Oh, how times have changed during my 20-plus years in the insurance industry. Data wasn’t a word we used much back in the 80s and 90s, unless of course you worked in those arcane and mysterious IT data centres.
Even amidst the computerisation of the insurance industry in the 80s, many policy documents and related files were still paper-based. In those days the data being captured came from the company’s own staff and was keyed into a mainframe terminal. The most sophisticated insurers received a bordereaux file from their brokers, transferred through some form of electronic data interchange or courier process – and more often than not that meant a floppy disk in the post.
Today, insurance companies have more data than they can realistically handle, and the sources are numerous: in-house systems, aggregators, fraud bureaus, government agencies, telematics boxes and, of course, social media sites. This much data presents a significant headache for insurance companies: where do you store it all, how do you access it and understand what it can tell you, and how do you do so quickly? Most important of all, how do you profit from it?
What are the questions around “big data” that insurers today are grappling with?
Storing big data
First of all, let’s define the size of the problem. We know that many insurers and insurance brokers produce anywhere between 300,000 and 700,000 quotes per day for the UK aggregator sites, and that telematics boxes can generate hundreds of data items per second. Deciding how much of this data to store, and where to store it cheaply, has become a serious issue for insurers. Cost is the primary challenge, but any storage also has to meet regulatory compliance needs, so you can’t just store this data anywhere.
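To put a rough number on the scale, here is a minimal back-of-envelope sketch in Python. Every figure in it (quote size, fleet size, reading rate and reading size) is an illustrative assumption, not a figure from any particular insurer:

```python
# Back-of-envelope data volume sketch; all figures below are illustrative assumptions.

QUOTES_PER_DAY = 500_000        # middle of the 300k-700k aggregator quote range
BYTES_PER_QUOTE = 5_000         # assumed size of one quote record (~5 KB)

TELEMATICS_VEHICLES = 100_000   # assumed size of a telematics book
READINGS_PER_SECOND = 100       # "hundreds of data items per second" per box
BYTES_PER_READING = 50          # assumed size of one telematics reading

SECONDS_PER_DAY = 24 * 60 * 60

quote_gb_per_day = QUOTES_PER_DAY * BYTES_PER_QUOTE / 1e9
telematics_gb_per_day = (
    TELEMATICS_VEHICLES * READINGS_PER_SECOND * BYTES_PER_READING * SECONDS_PER_DAY / 1e9
)

print(f"Aggregator quotes: ~{quote_gb_per_day:.1f} GB per day")
print(f"Telematics:        ~{telematics_gb_per_day:,.0f} GB per day")
```

Even under these assumptions the telematics stream runs to tens of terabytes a day and dwarfs the quote data, which is why where, and how cheaply, you store it matters so much.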
Accessing and analysing big data
Storing all this data effectively and cheaply is only worthwhile if business users can access it quickly. Only then can they run the analyses and reports they need. How do you do that when you have terabytes of data and many millions of records? Traditional data warehouse systems and Excel (the typical analyst’s tool of choice) struggle with these volumes, the speeds required and the growth in unstructured data. Insurers therefore need to look at new approaches, such as Hadoop.
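As a rough illustration of what that can look like in practice, here is a minimal PySpark sketch that reads a hypothetical day of aggregator quotes from Hadoop storage and summarises volumes and conversion by product. The HDFS path and the column names are assumptions made for the example:

```python
# Minimal PySpark sketch: summarise one day of aggregator quotes held in HDFS.
# The path and the column names (product, quote_premium, converted) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("quote-summary").getOrCreate()

# Read the day's quote records straight from Hadoop's file system.
quotes = spark.read.parquet("hdfs:///data/aggregator/quotes/2015-06-01/")

# Aggregate millions of rows across the cluster rather than in a spreadsheet.
summary = (
    quotes.groupBy("product")
          .agg(
              F.count("*").alias("quotes"),
              F.avg("quote_premium").alias("avg_premium"),
              F.avg(F.col("converted").cast("double")).alias("conversion_rate"),
          )
)

summary.show()
spark.stop()
```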
Visualising big data
For end users, one of the biggest problems with big data is simply being able to see what data fields and structures exist. A few insurers have invested the time and effort to build good data dictionaries that help users understand the data and its associated metadata. For everyone else, clearly labelled tables, fields and data values are essential if you want fast insights.
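A dictionary entry does not need to be elaborate to be useful. As a rough sketch, with the field name, codes and notes invented purely for illustration, one entry might record little more than a label, a source, a type and the permitted values:

```python
# Sketch of a single data dictionary entry; the field name, codes and notes
# are invented purely for illustration.
data_dictionary = {
    "plcy_sts_cd": {
        "label": "Policy status code",
        "source_system": "policy administration system",
        "type": "char(2)",
        "allowed_values": {"LV": "Live", "LP": "Lapsed", "CX": "Cancelled"},
        "notes": "Derived field; refreshed nightly.",
    }
}

print(data_dictionary["plcy_sts_cd"]["label"])
```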
However, knowing the questions is only half the problem. We also need to look at what insurers can do to address them. The first thing to know is that the answers are here today and are available to every insurance company.
The data can now be stored quickly and cheaply in Hadoop and, with tools like SAS Visual Analytics, accessed and analysed by anyone with a mouse and a web browser. The ability to review and analyse vast quantities of data received from insurance aggregators in real time makes that insight immediately actionable. If you wanted to improve your quote-to-policy conversion, you could reduce your price by a percentage point; or you could increase your price if, for example, you have reached your quota for young drivers. With technologies like SAS Event Stream Processing, this is a reality.
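To make the logic concrete, here is a plain-Python sketch of the kind of pricing rules such an engine would apply to each incoming quote event. It is not SAS Event Stream Processing code, and the target conversion rate, quota and adjustment sizes are invented for the example:

```python
# Plain-Python sketch of streaming price-adjustment rules. This is not SAS Event
# Stream Processing code; thresholds, quota and adjustment sizes are invented.

TARGET_CONVERSION = 0.10      # assumed target quote-to-policy conversion rate
YOUNG_DRIVER_QUOTA = 5_000    # assumed cap on young-driver policies this month

def adjust_premium(quote, current_conversion_rate, young_driver_policies):
    """Return the premium to offer for one incoming quote event."""
    premium = quote["base_premium"]

    # Rule 1: conversion running below target -> shave a percentage point off.
    if current_conversion_rate < TARGET_CONVERSION:
        premium *= 0.99

    # Rule 2: young-driver quota already reached -> price up instead.
    if quote["driver_age"] < 25 and young_driver_policies >= YOUNG_DRIVER_QUOTA:
        premium *= 1.05

    return round(premium, 2)

# Example: one quote flowing through the rules.
print(adjust_premium({"base_premium": 480.0, "driver_age": 22},
                     current_conversion_rate=0.08,
                     young_driver_policies=5_000))
```

In a real deployment these rules would be defined inside the streaming engine itself and applied to every quote as it arrives, rather than in a batch job after the fact.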
In the world of data, times have changed a great deal, but users need not fear the terms “big data” or “big data analytics”. In those immortal words, “we have the technology…”
If you want to know more about how to properly exploit big data using Hadoop, follow the eight-point checklist set out in this report.