Re-thinking data expiration: Real time, right time and shelf-lives


There is a buzz about ‘real-time’ analytics and its likely impact on decision-making, customer relationships and profitability. It is, to be fair, new and exciting, and therefore undeniably attractive. Part of the excitement may be because everyone can see the potential, and the cynicism has not yet set in.

The projects so far include, for example, supporting demand planning in retail. They look interesting and are showing good early results on improving stock management and customer experience. There have been few (public) failures among these early projects. This is perhaps because very few projects have gone beyond proof of concept, and the difficulty often lies in scaling up, or possibly because nobody is admitting to problems. Discussion still focuses on how to reap benefits from investments and potential investments.

From ‘real time’ to ‘right time’

There is, however, a sense that some commentators and practitioners are starting to take a long, hard look at the hype. They are not so much dismissing the potential of real-time analytics as asking whether real time is always necessary, or even helpful. Many of the insights gained today come from data stored at source and analyzed later. This is partly because there is more of this type of data, but also because the delay means that quality can be assured, and suitable data preparation carried out.

This means that data can be used when most appropriate. The question is not so much whether real-time data analysis is possible, but whether real time is the same as ‘right time’ for that data and that insight. Data analysis needs to be done in a way that supports decision-making. Right-time analytics therefore provides insights when they are needed, and to the right person. Sometimes this is in real time, but often it is not, or it requires a combination of real-time and stored data to generate useful insights, for example into the implications of customer behavior over time.



The answer to speeding up data analysis, and getting insights more promptly, may therefore not always be to adopt real-time analytics. Instead, it may be to focus on improving data preparation and management capabilities. This is often the longest and most difficult part (the time-limiting step) of the analytical process. In other words, an improved data management platform can often allow an organisation to improve its analytics capabilities, because it enables better use of the existing data.

Missing the point?

But, and this is a big ‘but’, I think this might be missing a fundamental point. My point is that data, like food, has a use-by date. The longer it hangs about in your data store, the less up-to-date it is, and therefore, potentially, the less useful. In healthcare, for example, information about trends in someone’s health needs to be recent. Data from a year ago will not be helpful, because the person’s condition may have deteriorated or improved considerably since then. But data covering the previous month or six months, showing a trend, is potentially extremely useful.
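To make the use-by idea concrete, here is a minimal sketch in Python of how a shelf life might be attached to different kinds of data. Everything in it is an assumption for illustration only: the record layout, the names `SHELF_LIFE`, `is_fresh` and `fresh_records`, and the shelf-life values themselves (six months for health readings, echoing the example above, and an arbitrary one hour for stock levels).

```python
from datetime import datetime, timedelta

# Hypothetical shelf lives per kind of data; the values are illustrative only.
SHELF_LIFE = {
    "health_reading": timedelta(days=180),  # roughly six months
    "stock_level": timedelta(hours=1),      # arbitrary figure for the sketch
}


def is_fresh(kind, recorded_at, now):
    """Return True if a record of this kind is still within its shelf life."""
    return now - recorded_at <= SHELF_LIFE[kind]


def fresh_records(records, now):
    """Keep only the records that have not passed their use-by date."""
    return [r for r in records if is_fresh(r["kind"], r["recorded_at"], now)]


# Made-up sample records, fixed in time so the result is reproducible.
now = datetime(2024, 6, 30, 12, 0)
records = [
    {"kind": "health_reading", "recorded_at": datetime(2024, 5, 30), "value": 72},
    {"kind": "health_reading", "recorded_at": datetime(2023, 6, 30), "value": 80},
    {"kind": "stock_level", "recorded_at": datetime(2024, 6, 30, 11, 30), "value": 14},
    {"kind": "stock_level", "recorded_at": datetime(2024, 6, 29, 12, 0), "value": 20},
]

usable = fresh_records(records, now)
```

With these sample records, only the month-old health reading and the half-hour-old stock level survive the filter. The year-old reading and yesterday's stock count have passed their use-by dates, however carefully they were prepared.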

In practice, this means that spending time trying to find the ‘perfect’ use case for data, real-time or not, may be counterproductive. Borrowing from Nike, a useful slogan might be ‘just do it’: in other words, start experimenting with insights from data, and recognize that failure is good, because it allows you to move on to other options. Messing about with models, trying to perfect them, is actually the enemy of getting insights rapidly from data before it reaches its sell-by date. Perfect is often the enemy of ‘good enough’, and nowhere is this truer than with data and modelling.

Value from immediacy

It is possible, therefore, that the real importance of real-time data lies in its immediacy. The point is not that you need to make decisions in milliseconds, but that the data is as fresh and new as possible. It gives you the very latest insight into customer behaviour, stock systems, health or transaction history. Whether your decision-making is better served by analysing it immediately, in real time, or by cleaning it up and using it soon after, is irrelevant. What matters is that the data is up-to-date, and has not reached its sell-by date.


Just like your carton of milk, you need to use data before it reaches its ‘use-by’ date. Being able to access real-time data supports this process. It is as simple as that.


About Author

Muhammad Asif Abbasi

Principal Business Solutions Manager

Asif has spent over 15 years in the industry, with a focus on Big Data Analytics, Data Warehousing and Hadoop across multiple industries including Telco, Manufacturing, Finance & Utilities. He works with customers to help them get the best value out of their Hadoop investments and helps them use SAS to mitigate change management within the enterprise during Hadoop adoption. Asif is a Hortonworks Certified Hadoop Developer & Administrator, Oracle Certified Master, Sun Certified Enterprise Architect, Teradata Certified Master and SAS Certified Base Programmer.
