Implications of coalescing data quality


Over my last two posts, I suggested that our expectations for data quality morph over the duration of business processes, and that it is only once a process has completed that we can demand that all statically applied data quality rules be observed. Over the duration of the process, however, there are situations in which some rules are violated, yet the dynamic nature of the data allows records to remain temporarily in a state that would ultimately be deemed invalid.

This stands in stark contrast to the position of some other practitioners, who state that there are only two points that matter in the life of a piece of data: the “moment of use” (at which the quality is presumably assessed) and the “moment of creation” (at which the data must be created correctly). In practice, there are many “moments of use” at different points in the processes that touch any piece of data, and those different uses reflect different expectations depending on the business context.

And if, as I have suggested, the quality of data is both temporal and contextual, then it may be impossible to demand that the data be “correct” at its moment of creation, since the meaning of “correct” changes over time and in relation to different uses. Rather, we have to look at the assertion of data quality over a continuum: understanding the paths through which a process may meander and, correspondingly, the impacts of the various data touch points along the way. Each touch point must carry its own “view” of quality, presuming the uses that follow, and in turn must apply the assertions that are relevant at that stage of processing. (For more on data quality in a big data world, read my white paper: "Understanding Big Data Quality for Maximum Information Usability.")
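To make this concrete, here is a minimal sketch in Python of what touch-point-specific quality assertions might look like; the touch points and rule names are hypothetical illustrations, not drawn from any particular process:

```python
# Illustrative sketch only: the touch points and rule names below are
# hypothetical examples of stage-specific quality "views"; they are not
# taken from any particular system.
RULES_BY_TOUCH_POINT = {
    "order_entry":  {"order_id_present"},
    "credit_check": {"order_id_present", "customer_id_present"},
    "fulfillment":  {"order_id_present", "customer_id_present",
                     "ship_address_valid"},
    "completion":   {"order_id_present", "customer_id_present",
                     "ship_address_valid", "totals_reconciled"},
}

def applicable_rules(touch_point: str) -> set:
    """Return the quality assertions applied at a given touch point.

    A record that violates a later-stage rule (say, a missing ship
    address at order entry) is tolerated earlier in the process and is
    only flagged once it reaches a touch point where that rule applies.
    """
    return RULES_BY_TOUCH_POINT.get(touch_point, set())
```

Note that the full static rule set is only asserted at the final touch point, mirroring the idea that complete conformance can only be demanded once the process has completed.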

By aligning data quality validation and monitoring with the dynamic nature of the underlying data, we can evolve a means for specifying the preconditions (i.e., what is acceptable prior to the touch point) and the postconditions (i.e., what quality means after the touch point). These preconditions and postconditions can then be folded into the design of the application implementing the business process and integrated directly into the code to ensure an acceptable level of quality at all subsequent touch points.
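As a hedged illustration of how that might look in application code (again using hypothetical record fields and rule functions rather than any specific framework), each processing step could be wrapped with its declared preconditions and postconditions:

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical record and rules, used only for illustration.
@dataclass
class OrderRecord:
    order_id: str
    customer_id: Optional[str] = None
    ship_address: Optional[str] = None

Rule = Callable[[OrderRecord], bool]

def has_order_id(r: OrderRecord) -> bool: return bool(r.order_id)
def has_customer(r: OrderRecord) -> bool: return bool(r.customer_id)

def touch_point(name: str, pre: list, post: list):
    """Wrap a processing step with the quality assertions that must hold
    immediately before (preconditions) and after (postconditions) it runs."""
    def decorator(step):
        def wrapped(record: OrderRecord) -> OrderRecord:
            failures = [r.__name__ for r in pre if not r(record)]
            if failures:
                raise ValueError(f"{name}: preconditions failed: {failures}")
            result = step(record)
            failures = [r.__name__ for r in post if not r(result)]
            if failures:
                raise ValueError(f"{name}: postconditions failed: {failures}")
            return result
        return wrapped
    return decorator

@touch_point("assign_customer", pre=[has_order_id],
             post=[has_order_id, has_customer])
def assign_customer(record: OrderRecord) -> OrderRecord:
    # Stand-in for a real enrichment or matching step.
    record.customer_id = "C-1001"
    return record
```

Because the assertions execute at the touch point itself, any downstream step can rely on the postconditions established by the steps that preceded it.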

 


About Author

David Loshin

President, Knowledge Integrity, Inc.

David Loshin, president of Knowledge Integrity, Inc., is a recognized thought leader and expert consultant in the areas of data quality, master data management and business intelligence. David is a prolific author on data management best practices via the expert channel at b-eye-network.com and numerous books, white papers and web seminars. His book, Business Intelligence: The Savvy Manager’s Guide (June 2003), has been hailed as a resource allowing readers to “gain an understanding of business intelligence, business management disciplines, data warehousing and how all of the pieces work together.” His book, Master Data Management, has been endorsed by data management industry leaders, and his valuable MDM insights can be reviewed at mdmbook.com. David is also the author of The Practitioner’s Guide to Data Quality Improvement. He can be reached at loshin@knowledge-integrity.com.
