The last thing you want from a data warehouse consolidation project is yet another siloed data asset that must be populated and managed to meet the requirements of downstream users.
To really benefit from a consolidation project, your newly created consolidated warehouse should replace the data marts and warehouses that are used to populate it, and those systems should be scheduled for retirement. That said, it is worth suggesting some success criteria to specifically address the types of issues I raised in my previous post. Some of those success criteria include:
- Inclusiveness of the target data model: An inclusive data model must accommodate the union of the data attributes that exist across the data warehouses targeted for consolidation.
- Elimination of duplicates: Eliminating duplicate records reduces the storage footprint and improves coherence for downstream applications.
- Consistency in deduplication: It is extremely important to ensure consistency when attempting to merge a pair of records that is perceived to represent the same real-world entity.
- Data validation and standardization: One of the failures associated with the proliferation of data warehouses is the inconsistent application of data validation and standardization rules to the data delivered to each warehouse. Any consolidation effort must ensure that data quality rules are applied consistently.
- Synchronization: Issues with currency and timeliness of data delivery to different data warehouses should be resolved as a result of the consolidation.
- Semantic consistency: This is a very common issue that is frequently ignored, and it often creates inconsistencies that can impact all downstream users. Similarly named data elements in different data models are often presumed to mean the same thing, but subtle differences in definition may lead to more obscure inconsistencies when data sets are merged. The consolidation effort must employ semantic metadata and insist on strict semantic consistency as part of the consolidation process.
It is valuable to clearly define the expectations for data warehouse consolidation using criteria such as these, and then make sure that the right processes and tools are in place to comply with the defined success criteria.
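
To make a few of these criteria concrete, here is a minimal Python sketch of three of them working together: an inclusive target model built as the union of source attributes, one shared set of standardization rules, and a consistent rule for merging duplicates. Everything in it is an assumption made for illustration, including the two source extracts, the `customer_id` match key, the `updated` column, and the "most recent non-null value wins" survivorship rule. A real consolidation would likely need fuzzier matching and richer rules.

```python
# Hypothetical extracts from two source warehouses; names and values are illustrative only.
SALES_DW = [
    {"customer_id": "C1", "name": " Ann Lee ", "email": "ANN@EXAMPLE.COM", "updated": "2024-01-10"},
    {"customer_id": "C2", "name": "Bob Roy", "email": "bob@example.com", "updated": "2024-02-01"},
]
MARKETING_DW = [
    {"customer_id": "C1", "name": "Ann Lee", "email": "ann@example.com",
     "segment": "retail", "updated": "2024-03-05"},
]


def target_columns(*sources):
    """Inclusive model: the union of attribute names across all sources."""
    return sorted({col for rows in sources for row in rows for col in row})


def standardize(row, columns):
    """Apply the same rules to every row, whichever source it came from."""
    out = {col: row.get(col) for col in columns}  # attributes a source lacks become None
    for col, value in out.items():
        if isinstance(value, str):
            out[col] = value.strip()
    if out.get("email"):
        out["email"] = out["email"].lower()
    return out


def merge(a, b):
    """Assumed survivorship rule: take the more recently updated non-null value.

    Dates are ISO-8601 strings, so plain string comparison orders them correctly.
    On a tie, the record that was seen first wins.
    """
    newer, older = (a, b) if a["updated"] >= b["updated"] else (b, a)
    return {col: newer[col] if newer[col] is not None else older[col] for col in newer}


def consolidate(*sources):
    """Standardize every row, then merge rows that share the same customer_id."""
    columns = target_columns(*sources)
    merged = {}
    for rows in sources:
        for row in rows:
            clean = standardize(row, columns)
            key = clean["customer_id"]
            merged[key] = merge(merged[key], clean) if key in merged else clean
    return [merged[key] for key in sorted(merged)]


consolidated = consolidate(SALES_DW, MARKETING_DW)
for record in consolidated:
    print(record)
```

In this sketch the `segment` attribute, which exists only in the hypothetical marketing extract, survives into the consolidated record for `C1`, while `C2` simply carries an empty value for it. Because standardization runs before matching and the merge rule depends only on the records themselves, loading the sources in the opposite order produces the same result for this sample data.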