Consumption and usability

In my last post, I noted two key issues that arise when we want to impose governance over large-scale data sets imported from outside the organization: the absence of control and the absence of semantics. Of course, we cannot just throw up our hands and say that the data is ungovernable. Rather, we have to examine what the intent of governance is in light of these constraints.

One way forward is to reframe the question. Instead of considering governance as a way of controlling the creation of data and the processes that touch it within the production cycle, consider governance as a means for managing expectations regarding consumption and usability of the data.

This is a more practical approach, especially considering that in most cases, reports and analyses driven by a big data approach are not likely to be slowed or halted as a result of questions raised about the processes used to create the source data. In addition, many big data environments may also be designed to stream data from real-time semi-structured or unstructured sources that either have no predefined metadata, or are subject to rapid changes in structure that may limit the ability to presuppose rules about formats and structure.

The orientation I am suggesting in this post covers two facets of data utilization. Consumption looks at the business scenarios in which the big data environment is used and what the expectations are from a high-level functional perspective. Usability refers to the degree to which the expected outcomes are skewed as a result of data issues and how much of that skew the users are willing to tolerate.

We can compare two different business applications. One uses numerous data sources to develop customer profiles for marketing purposes. With a large enough data set and plenty of attributes feeding the analysis, there is some tolerance for missing or incorrect values because the ultimate results are still usable. Even if some customers are classified incorrectly, most of the marketing lift can still be achieved.

On the other hand, a big data analytics system for identifying fraud in real time must be much more sensitive to missing or incorrect data. Flagging reasonable transactions as fraudulent and denying them can have a negative impact on customer satisfaction, especially if being blocked from completing a transaction causes the customer inconvenience or hardship.

In this vein, governance can be interpreted as the process of identifying the usage scenarios, engaging the data consumers, and understanding their expectations in ways that can be asserted as measures of data usability. Continuous measurement and monitoring of those assertions can alert the business users if the quality of the data dips below their expectations. In this case, even if the data stewards would not necessarily change the data, they can inform the business users about the risks of using the results in relation to potential data flaws.
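To make the idea of asserting expectations as measures of usability a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: the UsabilityAssertion class, the check_batch function, the postal_code field, and the tolerance values are my assumptions, not part of any particular tool. The same rule is measured against the same batch, but each consumer attaches a different tolerance, so the marketing scenario stays quiet while the fraud scenario raises an alert.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, List, Mapping

Record = Mapping[str, Any]


@dataclass(frozen=True)
class UsabilityAssertion:
    """A consumer expectation expressed as a measurable rule (hypothetical)."""
    name: str
    is_valid: Callable[[Record], bool]  # rule applied to each record
    max_failure_rate: float             # the consumer's tolerance, 0.0 to 1.0


def failure_rate(assertion: UsabilityAssertion, records: List[Record]) -> float:
    """Fraction of records that violate the assertion."""
    if not records:
        return 0.0
    failures = sum(1 for record in records if not assertion.is_valid(record))
    return failures / len(records)


def check_batch(assertions: Iterable[UsabilityAssertion],
                records: Iterable[Record]) -> List[str]:
    """Return one alert message per assertion whose tolerance is exceeded."""
    batch = list(records)
    alerts = []
    for assertion in assertions:
        rate = failure_rate(assertion, batch)
        if rate > assertion.max_failure_rate:
            alerts.append(
                f"{assertion.name}: {rate:.0%} of records failed "
                f"(tolerance {assertion.max_failure_rate:.0%})"
            )
    return alerts


def has_postal_code(record: Record) -> bool:
    """Assumed rule: the record carries a non-empty postal_code value."""
    return bool(record.get("postal_code"))


# Same rule, different tolerances for the two usage scenarios (values assumed).
marketing = UsabilityAssertion("marketing/postal_code", has_postal_code, 0.30)
fraud = UsabilityAssertion("fraud/postal_code", has_postal_code, 0.01)

# A made-up batch in which one record out of four is missing the value.
batch = [
    {"customer_id": 1, "postal_code": "20901"},
    {"customer_id": 2, "postal_code": "10001"},
    {"customer_id": 3, "postal_code": ""},
    {"customer_id": 4, "postal_code": "94105"},
]

for message in check_batch([marketing, fraud], batch):
    print(message)
```

Note that nothing here changes the data. The output is a message to the data consumers about the risk of using the results, which matches the steward's role described above.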

About Author

David Loshin

President, Knowledge Integrity, Inc.

David Loshin, president of Knowledge Integrity, Inc., is a recognized thought leader and expert consultant in the areas of data quality, master data management and business intelligence. David is a prolific author on data management best practices, writing for the expert channel at b-eye-network.com and producing numerous books, white papers, and web seminars. His book, Business Intelligence: The Savvy Manager’s Guide (June 2003) has been hailed as a resource allowing readers to “gain an understanding of business intelligence, business management disciplines, data warehousing and how all of the pieces work together.” His book, Master Data Management, has been endorsed by data management industry leaders, and his MDM insights can be reviewed at mdmbook.com. David is also the author of The Practitioner’s Guide to Data Quality Improvement. He can be reached at loshin@knowledge-integrity.com.
