As consumers, the quality of our day is all too often governed by the outcome of computed events. My recent online shopping experience was a great example of how computed events can conspire to make (or break) a relaxing day. We had ordered grocery delivery with a new service provider. Our existing provider
(Otherwise known as Truncate – Load – Analyze – Repeat!) After you’ve prepared data for analysis and then analyzed it, how do you complete this process again? And again? And again? Most analytical applications are built to truncate the prior data, load new data, analyze it and repeat
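The cycle the post names can be sketched in a few lines. Below is a minimal sketch of one pass, assuming a hypothetical SQLite staging table called sales_staging and a CSV extract; the table, columns and query are illustrative, not the post's actual implementation.

```python
import csv
import sqlite3

def truncate_load_analyze(conn, csv_path):
    """One pass of the Truncate - Load - Analyze - Repeat cycle."""
    cur = conn.cursor()
    # Truncate: discard the previous load.
    cur.execute("DELETE FROM sales_staging")
    # Load: bring in the fresh extract.
    with open(csv_path, newline="") as f:
        rows = [(r["region"], float(r["amount"])) for r in csv.DictReader(f)]
    cur.executemany("INSERT INTO sales_staging (region, amount) VALUES (?, ?)", rows)
    conn.commit()
    # Analyze: rerun the same aggregate against the fresh data.
    cur.execute("SELECT region, SUM(amount) FROM sales_staging GROUP BY region")
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_staging (region TEXT, amount REAL)")
# Repeat: call truncate_load_analyze(conn, path) for each new extract.
```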

The adoption of data analytics in organisations is widespread these days. With lower costs of ownership and increased ease of deployment, there are realistically no barriers for any organisation wishing to extract more value from its data. This of course presents a challenge, because the rate of data analytics adoption

In my last blog I detailed the four primary steps within the analytical lifecycle. The first and most time-consuming step is data preparation. Many consider the term “Big Data” overhyped, and certainly overused. But there is no doubt that the explosion of new data is turning the insurance business
The other day, I was looking at an enterprise architecture diagram, and it actually showed a connection between the marketing database, the Hadoop server and the data warehouse. My response can be summed up in two ways. First, I was amazed! Second, I was very interested in how this customer uses
I've been in many bands over the years, from rock to jazz to orchestra, and each brings with it a different maturity, skill level, attitude, and challenge. Rock is arguably the easiest (and the most fun!) to play, as it involves the fewest members, the lowest skill level, a goodly amount of drama, and the
One thing that always puzzled me when starting out with data quality management was just how difficult it was to obtain management buy-in. I've spoken before on this blog of the times I've witnessed considerable financial losses attributed to poor data quality met with a shrug of management's shoulders in terms
.@philsimon looks under the hood of 'analytics.'
The data lake is a great place to take a swim, but is the water clean? My colleague, Matthew Magne, compared big data to the Fire Swamp from The Princess Bride, and it can seem that foreboding. The questions we need to ask are: How was the data transformed and
One of the common traps I see data quality analysts falling into is measuring data quality in a uniform way across the entire data landscape. For example, you may have a transactional dataset that has hundreds of records with missing values or badly formatted entries. In contrast, you may have
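To make the contrast concrete, here is a minimal sketch of landscape-aware measurement; the datasets, field and thresholds are hypothetical. The same completeness measure is computed everywhere, but the acceptance bar is chosen per dataset rather than uniformly.

```python
def completeness(records, field):
    """Fraction of records with a non-empty value for `field`."""
    if not records:
        return 1.0
    return sum(1 for r in records if r.get(field)) / len(records)

# Hypothetical datasets: a noisy, high-volume transaction feed and a
# small customer master that must be near-perfect.
transactions = [{"postcode": "SW1A 1AA"}, {"postcode": ""}, {"postcode": "EC2R 8AH"}]
customer_master = [{"postcode": "SW1A 1AA"}, {"postcode": "EC2R 8AH"}]

# Same measure, different tolerance per dataset - not one uniform bar.
thresholds = {"transactions": 0.60, "customer_master": 1.00}
for name, data in [("transactions", transactions), ("customer_master", customer_master)]:
    score = completeness(data, "postcode")
    print(f"{name}: {score:.2%} complete -> "
          f"{'OK' if score >= thresholds[name] else 'FAIL'}")
```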
In The Princess Bride, one of my favorite movies, our hero Westley – in an attempt to save his love, Buttercup – has to navigate the Fire Swamp. There, Westley and Buttercup encounter fire spouts, quicksand and the dreaded rodents of unusual size (R.O.U.S.s). Each time he has a response to the
Financial institutions are mired in large pools of historic data across multiple lines of business and systems. However, much of the recent data is being produced externally and is isolated from decision-making and operational banking processes. The limitations of existing banking systems combined with inward-looking and confined data practices
Small data is akin to algebra; big data is like calculus.
In the movie Big, a 12-year-old boy, after being embarrassed in front of an older girl he was trying to impress by being told he was too short for a carnival ride, puts a coin into an antique arcade fortune teller machine called Zoltar Speaks, makes a wish to be big,
If you are looking for a way to fund your data quality objectives, consider looking in the closets of the organization. For example, look for issues that cost the company money that could have been avoided by better availability of data, better quality of the data or reliability of the

Data Management has been the foundational building block supporting major business analytics initiatives from day one. Not only is it highly relevant, it is absolutely critical to the success of all business analytics projects. Emerging big data platforms such as Hadoop and in-memory databases are disrupting traditional data architecture in
In this blog series, I am exploring whether it’s wise to crowdsource data improvement, and whether the power of the crowd can enable organizations to incorporate better enterprise data quality practices. In Part 1, I provided a high-level definition of crowdsourcing and explained that while it can be applied to a wide range of projects
.@philsimon on the reliability of social numbers.
Once in a while, people run into an issue with the data that doesn't really need to be fixed right away to ensure the success of a specific project. So, the data issues are put into production and forgotten. Everyone always says, “We will go back and correct this later.” But that
Regulatory compliance is a principal driver for data quality and data governance initiatives in many organisations right now, particularly in the banking sector. It is interesting to observe how many financial institutions immediately demand longer timeframes to help get their 'house in order' in preparation for each directive. To the
There are companies that have no data quality initiative and truly believe that they have no data problem. In effect, they say that if it does not interfere with day-to-day business, then there is no data quality problem. From what I have seen in my consulting experience, it usually
Over my last two posts, I suggested that our expectations for data quality morph over the duration of business processes, and that it is only at the point that the process has completed that we can demand that all statically applied data quality rules be observed. However, over the duration of the
One of the significant problems data quality leaders face is changing people's perception of data quality. For example, one common misconception is that data quality represents just another data processing activity. If you have a data warehouse, you will almost certainly have some form of data processing in the form
In my last post, I pointed out that we data quality practitioners want to apply data quality assertions to data instances to validate data in process, but the dynamic nature of data must be contrasted with our assumptions about how quality measures are applied to static records. In practice, the
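One way to picture the contrast between in-process data and statically applied rules: a completeness rule such as “shipping address must be populated” is only meaningful once a record has left its in-flight states. A minimal sketch, with hypothetical statuses and fields:

```python
# Process-aware validation: the rule is asserted only once a record
# reaches a terminal state, instead of statically against every record.
TERMINAL_STATES = {"completed", "shipped"}

def violates_rule(order):
    """Flag a missing shipping address only for finished orders."""
    if order["status"] not in TERMINAL_STATES:
        return False  # rule does not yet apply to in-flight records
    return not order.get("shipping_address")

orders = [
    {"id": 1, "status": "draft", "shipping_address": ""},        # fine, for now
    {"id": 2, "status": "shipped", "shipping_address": ""},      # violation
    {"id": 3, "status": "completed", "shipping_address": "10 Main St"},
]
print([o["id"] for o in orders if violates_rule(o)])  # -> [2]
```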

Utilizing big data analytics is currently one of the most promising strategies for businesses to gain competitive advantage and ensure future growth. But as we saw with “small data analytics,” the success of “big data analytics” relies heavily on the quality of its source data. In fact, when combining “small” and “big” data
@philsimon on the need to recognize DQ differences.
It’s common at the start of a new year to create a long list of resolutions that we hope to achieve. The reality, of course, is by February those resolutions will likely be a distant memory. The key to making any resolution stick is to start small. Create one small
After working in the data quality industry for a number of years, I have realized that most practitioners tend to have a rather rigid perception of assertions about data quality. Either a data set conforms to the set of data quality criteria and is deemed to be acceptable
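That rigid view treats quality as a boolean. A graded alternative is easy to sketch; the criteria, weights and cutoff below are hypothetical, not a prescribed method.

```python
# A weighted quality score in [0, 1], contrasted with a simple pass/fail.
def graded_quality(records, criteria, weights):
    """Weighted average of per-criterion pass rates."""
    total = sum(weights.values())
    score = 0.0
    for name, check in criteria.items():
        pass_rate = sum(1 for r in records if check(r)) / len(records)
        score += weights[name] * pass_rate
    return score / total

records = [{"email": "a@b.com", "age": 34}, {"email": "", "age": -1}]
criteria = {
    "email_present": lambda r: bool(r["email"]),
    "age_plausible": lambda r: 0 <= r["age"] <= 120,
}
weights = {"email_present": 2.0, "age_plausible": 1.0}

score = graded_quality(records, criteria, weights)
print(f"{score:.0%}")                          # graded view: 50%
print("accept" if score >= 0.9 else "review")  # the binary call, made last
```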
This isn't Kansas anymore. Oz has become a sprawling, smart metropolis filled with sensor data. How do we make sense of, clean, govern and glean value from this big data so we can get Dorothy home? The answer is SAS Data Management. With the latest portfolio updates, customers will be