Data quality on Hadoop: The easy way

Bigger doesn’t always mean better, and that’s often the case with big data. Your data quality (DQ) problem – no denial, please – often only grows as your data sets get bigger.

Having more unstructured data adds another level of complexity. The need for data quality on Hadoop shows in user feedback from the latest TDWI Best Practices Report, "Hadoop for the Enterprise": 55% of respondents plan to integrate DQ in the next three years. So how do you take care of big data quality?

Well, there is good news. You don’t have to learn Java and implement complex MapReduce code to fix quality issues in the data within a Hadoop cluster. SAS Data Loader for Hadoop comes with data quality directives that help business users detect and repair data problems quickly and easily.

[Figure: SAS Data Loader for Hadoop, including the data quality directives]


Can big data be governed?

Yes. For those keeping score at home, this is my second post in a row starting with a one-word answer to its questioning title. In this case, it’s a question that’s asked a lot and for good reason since big data raises big questions for all data-related disciplines.

Drafting an operating model for data governance

The data governance “industry” thrives on a curious dichotomy. On the one hand, some service providers insist to clients that they need a data governance program, that they must create a data governance council and that they should immediately staff a collection of roles ranging from data governance council member to data steward. This has led to a promulgation of the ubiquitous stack of PowerPoint slides that grace each data governance manager’s desk. But the creation of an org chart does not guarantee that the quality and integrity of data will automatically improve.

Data governance: The human resources analogy

Explaining data governance to a business community is difficult. Even more so when you need to convince business folks that they are pivotal to data governance success.

Data governance demands not just business attention but business commitment. Policies and processes are not just tick boxes on a corporate charter; they are business-as-usual functions that everyone must sign up for.

I find that data governance is such an alien term for most business leaders that an analogy helps to create some form of comparison to bridge the gap.

Data modeling for data policy management

Operationalizing data governance means putting processes and tools in place for defining, enforcing and reporting on compliance with data quality and validation standards. There is a life cycle associated with a data policy, which is typically motivated by an externally mandated business policy or expectation, such as regulatory compliance.
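
As a rough illustration of the define, enforce and report steps mentioned above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `DataPolicyRule` class, the `compliance_report` function, the rule names and the sample records are hypothetical and do not come from any particular governance tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class DataPolicyRule:
    """A hypothetical data policy rule: a name plus a per-record validation check."""
    name: str
    check: Callable[[dict], bool]


def compliance_report(records: Iterable[dict], rules: List[DataPolicyRule]) -> Dict[str, float]:
    """Return, for each rule, the fraction of records that pass its check."""
    rows = list(records)
    report = {}
    for rule in rules:
        passed = sum(1 for row in rows if rule.check(row))
        # With no records there is nothing to violate, so report full compliance.
        report[rule.name] = passed / len(rows) if rows else 1.0
    return report


# Define: two illustrative validation standards (assumed, not from the source).
rules = [
    DataPolicyRule("customer_id_present", lambda r: bool(r.get("customer_id"))),
    DataPolicyRule(
        "country_is_two_letters",
        lambda r: isinstance(r.get("country"), str) and len(r["country"]) == 2,
    ),
]

# Enforce and report: run the rules over sample records.
records = [
    {"customer_id": "C1", "country": "US"},
    {"customer_id": "", "country": "USA"},
]
report = compliance_report(records, rules)
for rule_name, rate in report.items():
    print(f"{rule_name}: {rate:.0%} compliant")
```

In practice the rules would be derived from the governing business policy, and the report would feed whatever compliance tracking the organization already uses.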


Common problems and pitfalls when starting a data governance project

Guess what? Data governance can be considered a bottleneck and a bothersome activity at some organizations. So let’s discuss how NOT TO BE the BOTTLENECK. Defining what the data governance initiative will entail is very important here.


What Zappos' bold moves teach us about data governance

If there's a non-casino flagship company here in Las Vegas, it's clearly Zappos, known for legendary customer service. Some have even called the way in which Zappos treats its customers insane. (Think 9-hour calls and free pizza deliveries. I'm not joking.)

Are data governance and MDM still inseparable?

Yes. But since this post needs to be more than a one-word answer to its title, allow me to elaborate.

Data governance (DG) enters into the discussion of all enterprise information initiatives. Whether or not DG should be the opening salvo of these discussions is akin to asking whether the chicken or the egg came first. However, any initiative believing its manifest destiny is to expand across the organization and pervade every nook and cranny of the enterprise eventually needs DG. Master data management (MDM) is no exception.

Struggling with data governance alignment? Look to history.

If your organization is large enough, it probably has multiple data-related initiatives going on at any given time. Perhaps a new data warehouse is planned, an ERP upgrade is imminent or a data quality project is underway.

Whatever the initiative, it may raise questions around data governance – closely followed by discussions about the need to "align" with the business. Aligning data governance to business value is where many initiatives falter, because it is not always easy to demonstrate tangible value.