Author

Jim Harris
Blogger-in-Chief at Obsessive-Compulsive Data Quality (OCDQ)

Jim Harris is a recognized data quality thought leader with 25 years of enterprise data management industry experience. An independent consultant, speaker, and freelance writer, he is the Blogger-in-Chief at Obsessive-Compulsive Data Quality, an independent blog offering a vendor-neutral perspective on data quality and its related disciplines, including data governance, master data management, and business intelligence.

Data science and decision science

Data science, as Deepinder Dhingra recently blogged, “is essentially an intersection of math and technology skills.” Individuals with these skills have been labeled data scientists and organizations are competing to hire them. “But what organizations need,” Dhingra explained, “are individuals who, in addition to math and technology, can bring in

The data that supported the decision

Data-driven journalism has driven some of my recent posts. I blogged about turning anecdote into data and how being data-driven means being question-driven. The latter noted the similarity between interviewing people and interviewing data. In this post I want to examine interviewing people about data, especially the data used by people to drive

Being data-driven means being question-driven

At the Journalism Interactive 2014 conference, Derek Willis spoke about interviewing data, his advice for becoming a data-driven journalist. “The bulk of the skills involved in interviewing people and interviewing data are actually pretty similar,” Willis explained. “We want to get to know it a little bit. We want to figure

The antimatters of MDM (part 5)

In physics, antimatter has the same mass as matter but the opposite charge. Collisions between matter and antimatter annihilate both, releasing energy available to do work. In this blog series, I will use antimatter as a metaphor for a factor

Data quality and Paleolithic Rhythm

Early in his terrific book What Technology Wants, Kevin Kelly discusses the concept of Paleolithic Rhythm, which describes the short bursts of intense effort followed by long periods of rest employed by the hunter-gatherer tribes of early humans during the Paleolithic Era. Paleolithic Rhythm is also an apt analogy for how many

Data quality in medias res

The planning and execution of enterprise information initiatives are not easy. Building the business case involves identifying, documenting, verifying and refining a set of requirements that represent the various perspectives of business and technical stakeholders throughout the organization. Many such initiatives begin with the very

A double take on sampling

My previous post made the point that the question is not whether you should use samples, but how good the sample you are using is. The comments on that post raised two different, and valid, perspectives about sampling. These viewpoints reflected two different use cases for data,

Survey says sampling still sensible

In my previous post, I discussed sampling error (i.e., when a randomly chosen sample doesn’t reflect the underlying population, aka margin of error) and sampling bias (i.e., when the sample isn’t randomly chosen at all), both of which big data advocates often claim can, and should, be overcome by using all the data. In this

What we find in found data

In his recent Financial Times article, Tim Harford explained that the big data that interests many companies is what we might call found data – the digital exhaust from our web searches, our status updates on social networks, our credit card purchases and our mobile devices pinging the nearest cellular or WiFi network.

The dark side of the mood

As an unabashed lover of data, I am thrilled to be living and working in our increasingly data-constructed world. One new type of data analysis eliciting strong emotional reactions these days is the sentiment analysis of the directly digitized feedback from customers provided via their online reviews, emails, voicemails, text messages and social networking
