Many people perceive big data management technologies as a “cure-all” for their analytics needs. But I would be surprised if any organization that has invested in developing a conventional data warehouse – even on a small scale – would completely rip that data warehouse out and immediately replace it with a NoSQL
In my prior two posts, I explored some of the issues associated with data integration for big data – particularly the conceptual data lake, in which source data sets are accumulated and stored, awaiting access from interested data consumers. One of the distinctive features of this approach is the transition
In my last post, I noted that the flexibility of the schema-on-read paradigm that is typical of a data lake had to be tempered with the use of a metadata repository, so that anyone wanting to use that data could figure out what was really in
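To make the schema-on-read idea concrete, here is a minimal sketch in Python; the feed name, field names, and the `catalog` structure are all hypothetical, not drawn from any particular tool. The point is that the lake stores the raw record untyped, and a schema drawn from a metadata repository is applied only when the data is read:

```python
import json
from datetime import datetime

# Hypothetical metadata repository entry describing how to interpret one raw feed.
catalog = {
    "sales_feed_v1": {
        "order_id": int,
        "amount": float,
        "order_date": lambda s: datetime.strptime(s, "%Y-%m-%d"),
    }
}

def read_with_schema(raw_line, schema_name):
    """Schema-on-read: the lake stores raw, untyped JSON; types are applied
    only at the moment a consumer reads the data."""
    schema = catalog[schema_name]
    raw = json.loads(raw_line)
    return {field: cast(raw[field]) for field, cast in schema.items()}

raw = '{"order_id": "17", "amount": "12.50", "order_date": "2015-03-01"}'
print(read_with_schema(raw, "sales_feed_v1"))
```

The same raw bytes could be read under a different catalog entry by a different consumer – which is exactly why the metadata repository matters.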
A few of our clients are exploring the use of a data lake as both a landing pad and a repository for collections of enterprise data sets. However, after probing a little bit into what they expected to do with this data lake, I found that the simple use of
Operationalizing data governance means putting processes and tools in place for defining, enforcing and reporting on compliance with data quality and validation standards. There is a life cycle associated with a data policy, which is typically motivated by an externally mandated business policy or expectation, such as regulatory compliance.
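As a minimal sketch of what “defining, enforcing and reporting” might look like in code (the policy names, fields, and thresholds here are invented for illustration, not taken from any particular governance tool):

```python
# Define: each data policy is a named predicate over a record.
policies = {
    "ssn_present": lambda r: bool(r.get("ssn")),
    "age_in_range": lambda r: 0 <= r.get("age", -1) <= 120,
}

def enforce_and_report(records):
    """Enforce each policy against every record, then report compliance per rule."""
    counts = {name: 0 for name in policies}
    for record in records:
        for name, rule in policies.items():
            if rule(record):
                counts[name] += 1
    total = max(len(records), 1)
    return {name: "{:.1f}% compliant".format(100.0 * n / total)
            for name, n in counts.items()}

print(enforce_and_report([
    {"ssn": "123-45-6789", "age": 34},   # passes both policies
    {"ssn": "", "age": 150},             # fails both policies
]))
```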
In recent years, we practitioners in the data management world have been pretty quick to conflate “data governance” with “data quality” and “metadata.” Many tools marketed under “data governance” have emerged – yet when you inspect their capabilities, you see that in many ways these tools largely encompass data validation and data standardization. Unfortunately, we
In my last two posts, I introduced some opportunities that arise from integrating event stream processing (ESP) within the nodes of a distributed network. We considered one type of deployment that includes the emergent Internet of Things (IoT) model in which there are numerous end nodes that monitor a set of sensors,
In my last post, we examined the growing importance of event stream processing to predictive and prescriptive analytics. In the example we discussed, we looked at how the event streams from point-of-sale systems at multiple retail locations are absorbed at a centralized point for analysis. Yet the beneficiaries of those
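A toy sketch of that centralized pattern (the store IDs and event shape are invented for illustration): events from many locations arrive interleaved on one stream, and the central node folds them into running aggregates.

```python
from collections import defaultdict

def absorb(stream):
    """Centralized absorption: fold interleaved point-of-sale events from many
    locations into running per-store totals."""
    totals = defaultdict(float)
    for store_id, amount in stream:
        totals[store_id] += amount
    return dict(totals)

# Events from multiple retail locations, in arrival order.
events = [("store_7", 12.50), ("store_3", 4.25), ("store_7", 8.00)]
print(absorb(events))  # {'store_7': 20.5, 'store_3': 4.25}
```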
Over the past year and a half, there has been a subtle shift in media attention from big data analytics to what is referred to as the Internet of Things, or IoT for short. The shift in focus is not intended to diminish the value of big data platforms and
Once you have assessed the types of reporting and analytics projects and activities to be done by the community of data analysts and consumers, and have assessed their business needs and requirements for performance, you can then evaluate – with confidence – how different platforms and tools can be combined to satisfy
In the last few days, I have heard the term “data lake” bandied about in various client conversations. As with all buzz-term simplifications, the concept of a “data lake” seems appealing, particularly when it is implied to mean “a framework enabling general data accessibility for enterprise information assets.” And of
As part of two of our client engagements, we have been tasked with providing guidance on an analytics environment platform strategy. More concretely, the goal is to assess the systems that currently compose the “data warehouse environment” and identify the considerations for selecting the optimal platforms to support
In my last two posts, we concluded two things. First, because of the need for broadcasting data across the internal network to enable the complete execution of a JOIN query in Hadoop, there is a potential for performance degradation for JOINs on top of files distributed using HDFS. Second, there are
In my last post, I pointed out that an uninformed approach to running queries on top of data stored in Hadoop HDFS may lead to unexpected performance degradation for reporting and analysis. The key issue had to do with JOINs in which all the records in one data set needed
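One common mitigation for that broadcast problem is the map-side (broadcast) join: replicate the small table to every node so the large table never has to cross the network. Below is a minimal single-process sketch of the idea in Python – the table names and contents are invented, and this only mimics what a distributed framework would do when it ships the small table to each node:

```python
def broadcast_join(small_table, large_table, key):
    """Map-side join: hash the small table (the piece a framework would ship
    to every node) and stream the large table past it locally, so the large
    table never crosses the network."""
    lookup = {row[key]: row for row in small_table}
    for row in large_table:
        match = lookup.get(row[key])
        if match is not None:
            yield {**match, **row}

stores = [{"store_id": 3, "region": "east"}]   # small dimension table
sales = [{"store_id": 3, "amount": 4.25},      # large fact table
         {"store_id": 9, "amount": 1.10}]
print(list(broadcast_join(stores, sales, "store_id")))
```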
Hadoop is increasingly being adopted as the go-to platform for large-scale data analytics. However, it is still not necessarily clear that Hadoop is always the optimal choice for traditional data warehousing for reporting and analysis, especially in its “out of the box” configuration. That is because Hadoop itself is not
Over my last two posts, I suggested that our expectations for data quality morph over the duration of business processes, and it is only once the process has completed that we can demand that all statically applied data quality rules be observed. However, over the duration of the
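A small sketch of that distinction (the order fields, statuses, and the rule itself are hypothetical): a completeness rule that would fail mid-process is asserted only once the record reaches its terminal state.

```python
def check_completeness(order: dict) -> bool:
    """Static rule: a shipped order must carry a tracking number."""
    return bool(order.get("tracking_number"))

def validate(order: dict) -> bool:
    # While the process is still in flight, the rule is deliberately not
    # asserted; it becomes a hard requirement only at the terminal state.
    if order.get("status") != "shipped":
        return True
    return check_completeness(order)

print(validate({"status": "picking"}))                            # True: in flight
print(validate({"status": "shipped"}))                            # False: rule now applies
print(validate({"status": "shipped", "tracking_number": "1Z99"})) # True
```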
In my last post, I pointed out that we data quality practitioners want to apply data quality assertions to data instances to validate data in process, but the dynamic nature of data must be contrasted with our assumptions about how quality measures are applied to static records. In practice, the
After working in the data quality industry for a number of years, I have realized that most practitioners tend to have a rather rigid perception of the assertions about the quality of data. Either a data set conforms to the set of data quality criteria and is deemed to be acceptable
With our recent client engagements in which the organization is implementing one or more master data management (MDM) projects, I have been advocating that a task to design a demonstration application be added to the early part of the project plan. Many early MDM implementers seem to have taken the
In the last post we looked at the use case for master data in which the consuming application expected a single unique representative record for each unique entity. This is valuable for batch access situations, like SQL queries, where aggregates are associated with one and only one entity record.
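A minimal sketch of that expectation (the survivorship rule here – most recently updated record wins – is just an illustrative choice, as are the field names): collapse duplicates to one representative per entity so a downstream aggregate attributes each value to exactly one record.

```python
def golden_records(records, entity_key="customer_id", recency_key="updated"):
    """Collapse duplicates to one representative record per entity; here the
    most recently updated record survives (a deliberately simple rule)."""
    best = {}
    for record in records:
        key = record[entity_key]
        if key not in best or record[recency_key] > best[key][recency_key]:
            best[key] = record
    return list(best.values())

records = [
    {"customer_id": 1, "updated": "2015-01-10", "city": "Boston"},
    {"customer_id": 1, "updated": "2015-03-02", "city": "Cambridge"},
    {"customer_id": 2, "updated": "2015-02-14", "city": "Dayton"},
]
# Downstream aggregates now associate with one and only one record per entity.
print(golden_records(records))
```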
Last time I suggested that there are some typical use cases for master data, and this week we will examine the desire for accessibility to a presumed “golden” record that represents “the single source of truth” for a specific entity. I put both of those terms in quotes because I
I have probably touched on this topic many times before: accessing the data that has been loaded into a master data environment. In recent weeks, some client experiences have really highlighted something that is increasingly apparent (and should be obvious) for master data management: the need to demonstrate that it
A few weeks back I noted that one of the objectives of an inventory process for reference data was data harmonization, which meant determining when two reference sets refer to the same conceptual domain and harmonizing the contents into a conformed standard domain. Conceptually it sounds relatively straightforward, but as
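As a tiny illustration of the mechanics (the source systems and code values are invented): two reference sets that encode the same conceptual domain differently are mapped onto one conformed standard.

```python
# Two source systems encode the same conceptual domain (gender) differently:
# "HR" uses {"M", "F", "U"}, while "CRM" uses {"1", "2", "0"}.
# Harmonization maps each (source, value) pair onto one conformed standard domain.
conformed = {
    ("HR", "M"): "MALE", ("HR", "F"): "FEMALE", ("HR", "U"): "UNKNOWN",
    ("CRM", "1"): "MALE", ("CRM", "2"): "FEMALE", ("CRM", "0"): "UNKNOWN",
}

def harmonize(source: str, value: str) -> str:
    """Translate a source-specific code into the conformed standard value."""
    return conformed[(source, value)]

# The same real-world value, arriving under different encodings, now conforms.
print(harmonize("HR", "F"), harmonize("CRM", "2"))  # FEMALE FEMALE
```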
In my last set of posts I started to look at some of the challenges associated with enterprise management of reference data domains, especially as the scope of use for the same conceptual reference domains expands across databases, systems, and functional areas within the organization. Recognizing the value of capturing
David Loshin defines reference data and sets up a working definition for his next set of posts.
A few years ago, I was presenting a morning course on master data management in which I shared some thoughts about the barriers to success in transitioning the use of a developed master data management index and repository into production systems. During the coffee break, an attendee mentioned
In the past few weeks I have presented training sessions on data governance, master data management, data quality and analytics at three different venues. At each of these events, during one of the breaks, several people in my course noted that the technical concepts of implementing programs
In my last post I introduced the term “behavior architecture,” and this time I would like to explore what that concept means. One approach is to start with the basics: given a business process with a set of decision points and a number of participants, the behavior architecture is the
Instituting an analytics program in which actionable insight is delivered to business consumers will be successful only if those consumers are aware of what they need to do to improve their processes and reap the benefits. As we have explored over the past few posts, success in the use of
The data quality and data governance community has a somewhat disconcerting habit of appending the word “quality” to every phrase that has the word “data” in it. So it is no surprise that the growing use of the phrase “big data” has been duly followed by claims of