Most people have logged on to a social media site, maybe to look up an old friend, acquaintance or family member. Some people play games, or post funny pictures or other information they want to share with everyone. Do you ever ask yourself what happens with this information? What if your business wanted to purchase this information and
Tag: data quality
In 2014, big data was on everyone’s mind. So in 2015, I expected to see data quality initiatives make a major shift toward big data. But I was surprised by a completely new requirement for data quality, which proves that the world is not all about big data – not
Sometimes when trying to fuzzy match names you want to fuzzy match just a portion of the name: for example, Family Name and/or Given Name. A common mistake that people make is to feed in the Family Name and Given Name columns separately into the Match Codes node instead of
Confusion is one of the big challenges companies experience when defining the data governance function – particularly among the technical community. I recently came across a profile on LinkedIn for a senior data governance practitioner at an insurance firm. His profile typified this challenge. He cited his duties as: Responsible for the collection
To prepare for the data challenges of 2015 and beyond, health care fraud, waste and abuse investigative units (government-funded and commercial insurance plans alike) need a data management infrastructure that provides access to data across programs, products and channels. This goes well beyond sorting and filtering small sets of
Data quality issues don’t go away just because you have more data. Big data is sometimes considered exempt from the requirement to be integrated, cleansed and standardized. Unfortunately, chances are that the more data you have, the worse its quality will become.
.@philsimon on bridging the IT-business divide once and for all.
As a youngster in the 70s and 80s, Star Trek inspired my imagination and fostered a great love for science, technology and reading. (See the embedded Star Trek infographic for some interesting factoids – did you know that there were 28 deaths among crew members wearing red shirts?) Captain Kirk and the
Data integration, on any project, can be very complex – and it requires a tremendous amount of detail. The person I would pick for my data integration team would have the following skills and characteristics: Has an enterprise perspective of data integration, data quality and extraction, transformation and load (ETL): Understands
Integrating big data into existing data management processes and programs has become something of a siren call for organizations on the odyssey to become 21st century data-driven enterprises. To help save some lost time, this post offers a few tips for successful big data integration.
There is a time and a place for everything, but the time and place for data quality (DQ) in data integration (DI) efforts always seems to be something no one is quite sure about. I have previously blogged about the dangers of waiting until the middle of DI to consider, or become forced
“Garbage in, garbage out” is more than a catchphrase – it’s the unfortunate reality in many analytics initiatives. For most analytical applications, the biggest problem lies not in the predictive modeling, but in gathering and preparing data for analysis. When the analytics seem to be underperforming, the problem almost invariably
Bigger doesn’t always mean better. And that’s often the case with big data. Your data quality (DQ) problem – no denial, please – often only grows when you get bigger data sets. Having more unstructured data adds another level of complexity. The need for data quality on Hadoop is shown by user
.@philsimon on whether companies should apply some radical tactics to DG.
If your organization is large enough, it probably has multiple data-related initiatives going on at any given time. Perhaps a new data warehouse is planned, an ERP upgrade is imminent or a data quality project is underway. Whatever the initiative, it may raise questions around data governance – closely followed by discussions about the
In recent years, we practitioners in the data management world have been pretty quick to conflate “data governance” with “data quality” and “metadata.” Many tools marketed under "data governance" have emerged – yet when you inspect their capabilities, you see that in many ways these tools largely encompass data validation and data standardization. Unfortunately, we
After doing some recent research with IDC®, I got to thinking again about the reasons that organizations of all sizes in all industries are so slow at adopting analytics as part of their ‘business as usual’ operations. While I have no hard statistics on who is and who isn’t adopting
As consumers, the quality of our day is all too often governed by the outcome of computed events. My recent online shopping experience was a great example of how computed events can conspire to make (or break) a relaxing event. We had ordered grocery delivery with a new service provider. Our existing provider
(Otherwise known as Truncate – Load – Analyze – Repeat!) After you’ve prepared data for analysis and then analyzed it, how do you complete this process again? And again? And again? Most analytical applications are created to truncate the prior data, load new data for analysis, analyze it and repeat
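The truncate, load, analyze, repeat cycle described above can be sketched in a few lines. This is only a minimal illustration, not the approach from the post: it assumes an in-memory SQLite database, and the `staging` table, its `region` and `amount` columns, and the `refresh_and_analyze` function are all hypothetical names made up for the example.

```python
import sqlite3


def refresh_and_analyze(conn, new_rows):
    """Run one truncate-load-analyze cycle on a hypothetical staging table."""
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS staging (region TEXT, amount REAL)")

    # Truncate: discard the prior data (SQLite has no TRUNCATE, so DELETE is used).
    cur.execute("DELETE FROM staging")

    # Load: insert the new batch of rows.
    cur.executemany("INSERT INTO staging (region, amount) VALUES (?, ?)", new_rows)
    conn.commit()

    # Analyze: a stand-in analysis, here a total per region.
    cur.execute(
        "SELECT region, SUM(amount) FROM staging GROUP BY region ORDER BY region"
    )
    return dict(cur.fetchall())


# Repeat: each call replaces the previous batch before analyzing the new one.
conn = sqlite3.connect(":memory:")
first = refresh_and_analyze(conn, [("east", 10.0), ("west", 5.0), ("east", 2.5)])
second = refresh_and_analyze(conn, [("west", 7.0)])
```

Note that the second run knows nothing about the first batch: because the table is emptied on every cycle, each analysis reflects only the most recently loaded data.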
The adoption of data analytics in organisations is widespread these days. Due to the lower costs of ownership and increased ease of deployment, there are realistically no barriers for any organisation wishing to get more from their data. This of course presents a challenge because the rate of data analytics adoption
In my last blog I detailed the four primary steps within the analytical lifecycle. The first and most time-consuming step is data preparation. Many consider the term “Big Data” overhyped, and certainly overused. But there is no doubt that the explosion of new data is turning the insurance business
The other day, I was looking at an enterprise architecture diagram, and it actually showed a connection between the marketing database, the Hadoop server and the data warehouse. My response can be summed up in two ways. First, I was amazed! Second, I was very interested in how this customer uses
I've been in many bands over the years - from rock to jazz to orchestra - and each brings with it a different maturity, skill level, attitude, and challenge. Rock is arguably the easiest (and the most fun!) to play, as it involves the fewest members, lowest skill level, a goodly amount of drama, and the
One thing that always puzzled me when starting out with data quality management was just how difficult it was to obtain management buy-in. I've spoken before on this blog of the times I've witnessed considerable financial losses attributed to poor quality met with a shrug of management shoulders in terms
The data lake is a great place to take a swim, but is the water clean? My colleague, Matthew Magne, compared big data to the Fire Swamp from The Princess Bride, and it can seem that foreboding. The questions we need to ask are: How was the data transformed and