Machine-to-machine data

I recently participated in an interesting recorded video web seminar with Scott Chastain from SAS about the concept of “big data quality,” in which we discussed both the sources of big data streams and what could be meant by data quality for those big data sets. One conceptual source of big data is referred to as machine-to-machine (M2M) data, which includes data sets automatically generated by devices such as sensors and meters that are then forwarded to other systems within a network.

Some examples of M2M data sources include energy meters (such as the emerging use of smart meters), pipe sensors (used in oil and gas transport), operational flight data generated by airplanes, continuous monitoring of vital signs in a hospital environment, and transportation sensors (such as fluid and pressure monitors across fleets of trucks or railroad cars). Here are some thoughts about abstractly describing the archetypical scenario, with a small illustrative sketch after the list:

  • A networked environment comprising a holistic system;
  • One or more attached devices that automatically monitor specific measure(s) associated with one or more system activities;
  • For each device, a defined period at which the measure is monitored and reported;
  • One or more target devices for collecting reported data;
  • A set of rules specifying the intended operation within a discrete set of expected behaviors; and
  • A set of actions to be taken when the system does not operate within the discrete set of expected behaviors.
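
To make the archetype a little more concrete, here is a minimal sketch in Python of how those elements might be modeled. The class names, thresholds, and device identifiers are my own illustrative assumptions, not any particular product's API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Rule:
    """An expected-behavior rule: a named check applied to a reported measure."""
    name: str
    is_expected: Callable[[float], bool]  # True when the value falls within expected behavior

@dataclass
class Device:
    """A monitoring device attached to the networked system."""
    device_id: str
    measure: str            # what is monitored (e.g., "pulse", "line_pressure")
    period_seconds: int     # how often the measure is sampled and reported
    target: str             # identifier of the collector the readings are forwarded to
    rules: List[Rule] = field(default_factory=list)

def evaluate(device: Device, value: float,
             on_deviation: Callable[[Device, Rule, float], None]) -> None:
    """Apply the device's rule set to a reported value; trigger an action on any deviation."""
    for rule in device.rules:
        if not rule.is_expected(value):
            on_deviation(device, rule, value)

# Illustrative usage: a pulse monitor reporting every 5 seconds to a central collector.
pulse_monitor = Device(
    device_id="icu-bed-12-pulse",
    measure="pulse",
    period_seconds=5,
    target="central-repository",
    rules=[Rule("pulse in expected range", lambda bpm: 40 <= bpm <= 160)],
)

evaluate(pulse_monitor, 182.0,
         on_deviation=lambda d, r, v: print(f"ALERT {d.device_id}: {r.name} violated by value {v}"))
```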

Our examples above conform to these criteria. For example, in the hospital environment, you may have multiple (yet different) monitors, such as pulse, blood pressure, blood oxygen content, rate of medication delivery, etc., connected to each patient. All monitors may be sampling measures at their prescribed rates, and each can forward the measures to a centralized repository that scans both for individual health events requiring attention and for collections of measures indicative of a systemic event (such as a localized power failure), which also need attention.
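
To make the hospital example a bit more tangible, here is a minimal sketch of such a central scan. The expected ranges, reporting period, and the 50% "devices gone silent" threshold are hypothetical values chosen only for illustration.

```python
import time
from typing import Dict, List

# Expected ranges per measure and expected reporting period; illustrative values only.
EXPECTED_RANGE = {"pulse": (40, 160), "blood_oxygen": (90, 100)}
REPORTING_PERIOD_SECONDS = 5

def scan(readings: List[Dict], now: float) -> List[str]:
    """Scan the latest readings for individual health events (out-of-range values)
    and for a systemic event (a large share of devices not reporting on schedule)."""
    alerts = []
    silent = 0
    for r in readings:
        low, high = EXPECTED_RANGE[r["measure"]]
        if not (low <= r["value"] <= high):
            alerts.append(f"individual event: {r['device_id']} {r['measure']}={r['value']}")
        if now - r["timestamp"] > 2 * REPORTING_PERIOD_SECONDS:
            silent += 1
    if readings and silent / len(readings) > 0.5:
        alerts.append("systemic event: most devices have stopped reporting (possible power failure)")
    return alerts

# Illustrative usage: two patient monitors, one out of range, one that has gone quiet.
now = time.time()
latest = [
    {"device_id": "bed-12-pulse", "measure": "pulse", "value": 182, "timestamp": now - 3},
    {"device_id": "bed-12-spo2", "measure": "blood_oxygen", "value": 97, "timestamp": now - 40},
]
for alert in scan(latest, now):
    print(alert)
```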

Fortunately, the sets of rules that describe both expected and deviant behaviors effectively describe the data expectations. That provides data practitioners with a foundation for defining measures to ensure that the generated data conforms to expectations. But does that really mean data quality? More on this in the next post.
