What is a graph model?


In my previous post I started a discussion of graph analytics in which connections and links among different types of entities can be analyzed to find patterns that lead to actionable intelligence. But before we can explore the details of the types of analyses to be performed, we must first get grounded in how connectivity networks are modeled and managed.

Graph analysis is driven by a model that can capture the links between entities, and a model of entities and connections can be elegantly described using a mathematical abstraction known as a graph. A graph is defined as a collection of vertices (each representing an entity) connected by labeled edges. Each connection in a graph can be described using three pieces of information: the two entities and the type of the edge that connects them. As a simple example, if John Smith works for International Widget Corporation, then John Smith and International Widget Corporation would each be represented as a vertex, with one edge between those vertices representing the "works for" connection.
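To make the "three pieces of information" idea concrete, here is a minimal sketch in Python. The `Edge` type and its field names are my own illustrative choices rather than part of any particular graph product; the point is only that each connection can be written down as a (source, label, target) triple, and that the set of vertices falls out of the edges.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """One connection: two entities plus the type of edge joining them."""
    source: str
    label: str
    target: str


# The example from the text, written as a single triple.
edges = [
    Edge("John Smith", "works for", "International Widget Corporation"),
]

# Every entity that appears at either end of an edge is a vertex.
vertices = {name for edge in edges for name in (edge.source, edge.target)}

for edge in edges:
    print(f"{edge.source} --[{edge.label}]--> {edge.target}")
```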

The graph model abstraction is straightforward, yet it provides broad flexibility. Adding attributes to the vertices, and to the edges that link them, adds context about what the relationships represented in the graph actually mean. For example:

  • Labeling vertices describes the types of entities in the graph, such as a “husband” or “employee.”
  • Edges can be labeled with the nature of the relationship, such as “is married to” or “works for.”
  • Edges can be directed to indicate the “flow” of the relationship, such as when one person is a child of another person.
  • Magnitudes (weights) can be added to the relationships represented by the edges.
  • Additional properties (such as durations) can be attributed to both edges and vertices.

Two entities may share more than one edge representing multiple relationships, such as when a party is a customer of a company and is also an employee of that company.
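The attributes listed above, along with the possibility of more than one edge between the same pair of entities, can be sketched with a small in-memory structure. This is only an illustration under my own assumptions: the class names (`Vertex`, `Edge`, `PropertyGraph`), the `weight` field standing in for a magnitude, and the "Jane Doe" party are all hypothetical, and real graph stores will differ in the details.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Vertex:
    name: str
    label: str                      # type of entity, e.g. "person"
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str                     # edges are directed: source -> target
    target: str
    label: str                      # nature of the relationship
    weight: float = 1.0             # magnitude of the relationship
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    def __init__(self):
        self.vertices = {}
        # Each vertex keeps a list of outgoing edges, so two vertices
        # may be connected by any number of differently labeled edges.
        self.outgoing = defaultdict(list)

    def add_vertex(self, name, label, **properties):
        self.vertices[name] = Vertex(name, label, properties)

    def add_edge(self, source, target, label, weight=1.0, **properties):
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must be added as vertices first")
        edge = Edge(source, target, label, weight, properties)
        self.outgoing[source].append(edge)
        return edge

    def edges_between(self, source, target):
        return [e for e in self.outgoing.get(source, []) if e.target == target]


graph = PropertyGraph()
graph.add_vertex("Jane Doe", "person")
graph.add_vertex("International Widget Corporation", "company")

# The same party is both an employee and a customer of the company.
graph.add_edge("Jane Doe", "International Widget Corporation",
               "works for", duration_years=5)
graph.add_edge("Jane Doe", "International Widget Corporation",
               "is customer of", weight=3.0)

labels = [e.label for e in graph.edges_between(
    "Jane Doe", "International Widget Corporation")]
print(labels)
```

Keeping edges in a list per source vertex, rather than in a dictionary keyed by the (source, target) pair, is what lets the two relationships coexist without one overwriting the other.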

This abstraction can be practically represented in different ways. Although a relational database can be used to represent a graph, there are emerging NoSQL technologies that enable both dynamic programmatic representations and persistent ones. These graph databases are designed to support the types of queries and analyses one might want to perform on a network – more on this next time.
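For readers wondering what the relational option might look like, here is one possible sketch using Python's built-in sqlite3 module and an in-memory database. The table and column names (`vertex`, `edge`, `source_id`, `target_id`) are assumptions made for illustration, not a recommended schema: one table holds the entities, a second holds the labeled, directed edges, and a join turns the rows back into connections.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vertex (
        id    INTEGER PRIMARY KEY,
        name  TEXT UNIQUE NOT NULL,
        label TEXT NOT NULL
    );
    CREATE TABLE edge (
        source_id INTEGER NOT NULL REFERENCES vertex(id),
        target_id INTEGER NOT NULL REFERENCES vertex(id),
        label     TEXT NOT NULL
    );
""")

conn.executemany(
    "INSERT INTO vertex (name, label) VALUES (?, ?)",
    [("John Smith", "employee"),
     ("International Widget Corporation", "company")],
)

# Look up both endpoints by name and record the directed edge between them.
conn.execute(
    """
    INSERT INTO edge (source_id, target_id, label)
    SELECT s.id, t.id, 'works for'
    FROM vertex AS s, vertex AS t
    WHERE s.name = ? AND t.name = ?
    """,
    ("John Smith", "International Widget Corporation"),
)

# Reassemble each connection as (source, label, target).
rows = conn.execute(
    """
    SELECT s.name, e.label, t.name
    FROM edge AS e
    JOIN vertex AS s ON s.id = e.source_id
    JOIN vertex AS t ON t.id = e.target_id
    """
).fetchall()
print(rows)
```

Note that following a chain of relationships in this layout takes one more join per hop, which is part of why purpose-built graph stores are attractive for network-style queries.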


About Author

David Loshin

President, Knowledge Integrity, Inc.

David Loshin, president of Knowledge Integrity, Inc., is a recognized thought leader and expert consultant in the areas of data quality, master data management and business intelligence. David is a prolific author regarding data management best practices, via the expert channel at b-eye-network.com and numerous books, white papers, and web seminars on a variety of data management best practices. His book, Business Intelligence: The Savvy Manager’s Guide (June 2003) has been hailed as a resource allowing readers to “gain an understanding of business intelligence, business management disciplines, data warehousing and how all of the pieces work together.” His book, Master Data Management, has been endorsed by data management industry leaders, and his valuable MDM insights can be reviewed at mdmbook.com . David is also the author of The Practitioner’s Guide to Data Quality Improvement. He can be reached at loshin@knowledge-integrity.com.
