Cracking the code for a successful conversion: establishing security

What kind of security do we need for this conversion?  In fact, where are the security people? 

Including security personnel upfront in any conversion project can sure save some time and heartache later.  It is important to include security for the following:

  1. Source system access – You must be able to profile the source data, check for quality issues, and attach any ETL or conversion programs to the source system.
  2. New platform (target) security for the data – Databases need the right security groups to be set up.  Also, consider the directory security on the server itself.
  3. User interface security – Who are the people that will require access to this application? In the project plan there is probably a task that refers to end user setup and security.  Consider adding to that deliverable a list of business users who will use this application.  Revisit this list as implementation gets closer and closer. Read More »

Achieving persistent data governance, pt. 1: link your teams

During client conversations I often hear stories about past efforts to launch data governance that never reached critical mass and ended up being resized and marginalized. I find such outcomes fascinating... in the same vein as a car crash that causes you to tap on the brakes as you drive by.

When I hear stories of these failed projects, I always ask myself similar questions: How did this happen? What mistakes were made? How can I avoid this in my program? In this blog series I want to share a few thoughts on achieving persistent data governance and recommendations for avoiding a roadside emergency while on your governance journey. Read More »

Video tutorial: 5 ways to instantly improve your data profiling performance

Data profiling is essential. So why do so many data quality teams fail to get the most out of this crucial technique? In my short video, you’ll discover the answers to unlocking the full potential of your data profiling efforts.

By broadening and deepening your knowledge of data profiling with new approaches to methodology and deployment, you'll realize numerous benefits such as:

  • Greater business impact
  • Faster decisions
  • Simpler profiling workflow

The answer lies with five simple techniques. Read More »

A foxier way to search

What are all of the companies in San Francisco trying to make the Internet of Things happen? Google it if you like, but you're only likely to get a simple list of companies, no doubt in an SEO-friendly order.

What if you could see those companies in a more comprehensive way? Better yet, what if you could filter and sort by Alexa Rank, headcount, company status (i.e., public vs. private), total revenue and other forms of structured data? And what if you could see how closely those companies' offerings compete with each other in a very visual and interactive way? Read More »

Can data change an already made up mind?

Nowadays we hear a lot about how important it is that we are data-driven in our decision-making. We also hear a lot of criticism aimed at those who are driven more by intuition than data. Like most things in life, however, there’s a big difference between theory and practice.

It’s easy to say that we will go where data drives us, but what happens if data is driving us to a destination that we’re uncomfortable with? What happens when data calls into question some of our long-standing beliefs?

We like to think that we are all natural data scientists who are ready, willing and able to be swayed by evidence presented by new data. And in a big data world we certainly do not suffer from a dearth of new data.

However, whether or not we want to admit it (especially to others), our minds are often already made up before we look at data. And big data makes a very good yes-man, amplifying our natural tendency to only search out data that supports our viewpoints so that we find further evidence for what we already believe.

This is known as confirmation bias, which, as Chip and Dan Heath, co-authors of Decisive: How to Make Better Choices in Life and Work, explained, “leads us to hunt for information that flatters our existing beliefs.” They cited a recent meta-analysis of more than 91 psychological studies involving over 8,000 participants that concluded we are twice as likely to favor confirming information over disconfirming information. Read More »

The value of reference data governance

In my last post, I shared some thoughts about challenges associated with the lack of management for reference data, such as reinterpretation of semantics and the inconsistencies that crop up when multiple copies are used. All of the challenges I mentioned are indications of a need for improving the enterprisewide governance of reference data.

The first steps in establishing governance involve assessing the current state and putting a management program in place. That program should include a framework for documenting the values and meanings of reference data management in a way that can be aligned with development of policies for governing use and sharing of those reference domains.

In turn, one can envision the benefits that can be derived through policy-driven reference data management, such as: Read More »

On pronouns, online dating and data laziness

Working from home confers significant benefits. Two of my favorites are a two-second commute and the ability to take afternoon naps without offending judgmental coworkers. Among the drawbacks, though: I'm not going to randomly meet someone at the office.

Like many single professionals, I have dabbled in the world of online dating, a $2-billion annual industry. It's somehow become less creepy over the last five years to tell your friends that you and your significant other met online. Ask ten people and you'll receive ten different responses. In the end, online dating is a mixed bag. Know in advance that your mileage may vary, but you didn't come here for dating advice now, did you?

The data side of dating

For those unfamiliar with the process, when users and customers sign up on dating sites, they have to provide at least some basic information. Examples include age, location, gender and the like. Of course, there's no way for most sites to verify your identity. That is, you can claim to be younger, better looking and thinner than you actually are. Most sites will happily take your money and data with only a valid user name and credit card. Read More »

Bring the noise, boost the signal

Many people, myself included, occasionally complain about how noisy big data has made our world. While it is true that big data does broadcast more signal, not just more noise, we are not always able to tell the difference.

Sometimes what sounds like meaningless background static is actually a big insight. Other times noisy big data just seems to make it more difficult for us to know good information when we hear it.

However, we might gain a greater appreciation for the noise buzzing within big data if we paused for some quiet contemplation about the critical role that noise plays in helping us hear what we want to hear. Read More »

The hidden challenges of reference data

I sometimes refer to reference data as a “celebrity orphan” within an organization because reference data sets are touched by many business processes and applications, yet remain largely unowned and unmanaged. Few organizations have truly formal methods for managing reference data or assigning authority over it. This poses a conundrum: a widely used conceptual data asset is generally left to reimplementation whenever someone decides there is a need for a copy.

That introduces a number of different challenges for ensuring quality and consistent use of reference data, including:  Read More »

Data quality in the real world

If you work in data quality long enough you’ll meet detractors of data quality software. The viewpoint in this camp is that poor quality data should be driven out at the time of design, not retrospectively detected and fixed. They perceive data quality tools as a costly overhead, something that is superfluous in a well-designed information landscape.

In a perfect world, perfectly designed systems would have defect prevention built in. All of the various design authorities would share the same vision for absolute quality management and developers would follow a strict "data quality rule book" and never create code that allowed defects to emerge. Read More »
