Data federation and data virtualization - still with us?


Data federation is software that allows an organization to create a "virtual database" from multiple sources of like information. For example, customer data may be stored in multiple applications across the enterprise. This software allows us to "cherry pick" the best parts of the customer record from each data source, integrate them and present the result to the business user. Much like a database view, it creates a layer of metadata that hides the complexity of connecting to and querying those application systems. Some software even allows you to physicalize the "virtual database," meaning it is stored locally on disk.
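To make the "database view" comparison concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not how any particular federation product works. The two attached in-memory databases (`crm`, `billing`), their tables and columns, and the names `customer_virtual` and `customer_local` are all hypothetical stand-ins for real application systems.

```python
import sqlite3


def build_federated_view() -> sqlite3.Connection:
    """Build a toy 'virtual database' over two stand-in source systems."""
    # One connection plays the role of the federation layer.
    conn = sqlite3.connect(":memory:")
    # Two attached in-memory databases stand in for separate applications
    # (hypothetical names: crm and billing).
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")

    conn.execute(
        "CREATE TABLE crm.customer (cust_id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.execute(
        "CREATE TABLE billing.account (cust_id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
    )
    conn.executemany(
        "INSERT INTO crm.customer VALUES (?, ?, ?)",
        [(1, "Ann Lee", "ann@example.com"), (2, "Bob Ray", "bob@example.com")],
    )
    conn.executemany(
        "INSERT INTO billing.account VALUES (?, ?, ?)",
        [(1, "A. Lee", "12 Oak St"), (2, "Robert Ray", "9 Elm Ave")],
    )

    # The view is only metadata: it "cherry picks" name and email from the
    # CRM and the address from billing. SQLite requires a TEMP view when
    # the view spans attached databases.
    conn.execute(
        """
        CREATE TEMP VIEW customer_virtual AS
        SELECT c.cust_id, c.name, c.email, b.address
        FROM crm.customer AS c
        JOIN billing.account AS b ON b.cust_id = c.cust_id
        """
    )
    return conn


def physicalize(conn: sqlite3.Connection) -> None:
    """Copy the current result of the view into a real, stored table."""
    conn.execute("CREATE TABLE customer_local AS SELECT * FROM customer_virtual")


conn = build_federated_view()
virtual_rows = conn.execute(
    "SELECT cust_id, name, email, address FROM customer_virtual ORDER BY cust_id"
).fetchall()
physicalize(conn)
local_count = conn.execute("SELECT COUNT(*) FROM customer_local").fetchone()[0]
```

The business user queries `customer_virtual` without knowing which system each column came from. Note that the join only works here because both toy sources share a clean `cust_id`.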

Data federation makes sense to me when we want to create something very fast. It also works well for prototyping, but when the data is of bad quality or does not lend itself to integration, physical data stores will be required. None of the articles I have read mention data quality in the same sentence as data federation or virtual databases. So beware of how it gets used in your organization. This software is not a silver bullet, BUT if you have good quality data that resides in multiple application systems, then use the software to create a "virtual single source" of customer or product data.
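One way to find out whether the data "lends itself to integration" is to profile the join key in each source before building anything virtual on top of it. The following is a rough sketch in plain Python, not a substitute for a profiling tool. The function `profile_key`, the key name `cust_id` and the sample rows are all made up for illustration.

```python
from collections import Counter


def profile_key(rows_a, rows_b, key):
    """Report basic quality problems on a shared key across two sources."""
    keys_a = [row.get(key) for row in rows_a]
    keys_b = [row.get(key) for row in rows_b]

    def duplicates(keys):
        # Key values that appear more than once within a single source.
        counts = Counter(k for k in keys if k is not None)
        return sorted(k for k, n in counts.items() if n > 1)

    set_a = {k for k in keys_a if k is not None}
    set_b = {k for k in keys_b if k is not None}
    return {
        "missing_key_a": sum(k is None for k in keys_a),
        "missing_key_b": sum(k is None for k in keys_b),
        "duplicates_a": duplicates(keys_a),
        "duplicates_b": duplicates(keys_b),
        "only_in_a": sorted(set_a - set_b),
        "only_in_b": sorted(set_b - set_a),
    }


# Hypothetical extracts from two application systems.
crm_rows = [{"cust_id": 1}, {"cust_id": 2}, {"cust_id": 2}, {"cust_id": None}]
billing_rows = [{"cust_id": 1}, {"cust_id": 3}]

report = profile_key(crm_rows, billing_rows, "cust_id")
```

If the report shows many missing, duplicated or unmatched keys, a virtual layer will simply expose those problems to the business user, and a physical store with cleansing is the safer choice.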

Creating integrated historical data, as in a data warehouse, with this software may not be so easy. Some questions I would ask are:

  1. Where does the historical data come from? Source systems? I can’t very well make this up on the fly.
  2. If the historical data was in the source systems to begin with – why are you building a data warehouse?
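The first question can be illustrated with a small sketch, again using Python's built-in sqlite3 module. A view only ever shows what the source holds right now, so history exists only if someone stores it physically. The table and view names (`source_customer`, `customer_virtual`, `customer_history`), the `take_snapshot` helper and the dates are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for an application system that keeps only the current value.
conn.execute("CREATE TABLE source_customer (cust_id INTEGER PRIMARY KEY, city TEXT)")
conn.execute("INSERT INTO source_customer VALUES (1, 'Denver')")
# The virtual layer: metadata only, no stored rows.
conn.execute(
    "CREATE VIEW customer_virtual AS SELECT cust_id, city FROM source_customer"
)
# A physical table is needed to keep what the view showed in the past.
conn.execute(
    "CREATE TABLE customer_history (snapshot_date TEXT, cust_id INTEGER, city TEXT)"
)


def take_snapshot(conn, snapshot_date):
    """Persist the view's current result together with a snapshot date."""
    conn.execute(
        "INSERT INTO customer_history SELECT ?, cust_id, city FROM customer_virtual",
        (snapshot_date,),
    )


take_snapshot(conn, "2024-01-01")
# The source system overwrites the old value in place.
conn.execute("UPDATE source_customer SET city = 'Austin' WHERE cust_id = 1")
take_snapshot(conn, "2024-02-01")

current = conn.execute("SELECT city FROM customer_virtual").fetchall()
history = conn.execute(
    "SELECT snapshot_date, city FROM customer_history ORDER BY snapshot_date"
).fetchall()
```

After the update, the view has no memory of the earlier city. Only the physical `customer_history` table still holds both values, which is the point of the questions above.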

About Author

Joyce Norris-Montanari

President of DBTech Solutions, Inc

Joyce Norris-Montanari, CBIP-CDMP, is president of DBTech Solutions, Inc. Joyce advises clients on all aspects of architectural integration, business intelligence and data management. Joyce advises clients about technology, including tools like ETL, profiling, database, quality and metadata. Joyce speaks frequently at data warehouse conferences and is a contributor to several trade publications. She co-authored Data Warehousing and E-Business (Wiley & Sons) with William H. Inmon and others. Joyce has managed and implemented data integrations, data warehouses and operational data stores in industries like education, pharmaceutical, restaurants, telecommunications, government, health care, financial, oil and gas, insurance, research and development and retail. She can be reached at

1 Comment

  1. I think you should take a look at recent Data Virtualization best practices. Nobody in the DV world is saying anymore that you should access all data in real time or that complex data quality transformations should be done on the fly from source data. But current DV tools can integrate all these steps and have much to offer in complex and hard BI scenarios. The recent book Data Virtualization for Business Intelligence Systems by Rick van der Lans describes a lot of patterns where DV tools complement existing BI infrastructure to increase agility.
