Data federation is software that allows an organization to create a "virtual database" from multiple sources of similar information. For example, customer data may be stored in multiple applications across the enterprise. This software allows us to "cherry pick" the BEST parts of the customer data from each source, integrate them, and present the result to the business user. Much like a database view, it creates a layer of metadata that hides the complexity of connecting to and querying those application systems. Some software even allows you to physicalize the "virtual database," meaning it is stored locally on disk.
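To make the database view analogy concrete, here is a minimal sketch in Python using SQLite. It is not how any particular federation product works: the two attached in-memory databases simply stand in for two application systems, and every name in it (crm, billing, customer_federated, customer_materialized, fetch_customers) is made up for illustration.

```python
import sqlite3

# Two hypothetical application systems, simulated as attached in-memory databases.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS billing")

conn.execute(
    "CREATE TABLE crm.customer (customer_id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)
conn.execute(
    "CREATE TABLE billing.account "
    "(customer_id INTEGER PRIMARY KEY, name TEXT, postal_address TEXT)"
)

conn.executemany(
    "INSERT INTO crm.customer VALUES (?, ?, ?)",
    [(1, "Ada Lovelace", "ada@example.com"), (2, "Grace Hopper", "grace@example.com")],
)
conn.executemany(
    "INSERT INTO billing.account VALUES (?, ?, ?)",
    [(1, "A. Lovelace", "12 Analytical Way"), (2, "G. Hopper", "7 Compiler Court")],
)

# The "virtual database": a view that cherry-picks name and email from the CRM
# and the postal address from billing. Nothing is copied; the view is metadata.
# SQLite only lets TEMP views reference tables in other attached databases.
conn.execute(
    """
    CREATE TEMP VIEW customer_federated AS
    SELECT c.customer_id, c.name, c.email, b.postal_address
    FROM crm.customer AS c
    JOIN billing.account AS b ON b.customer_id = c.customer_id
    """
)


def fetch_customers(connection):
    """Query the federated view as if it were a single customer table."""
    return connection.execute(
        "SELECT customer_id, name, email, postal_address "
        "FROM customer_federated ORDER BY customer_id"
    ).fetchall()


rows = fetch_customers(conn)

# "Physicalizing" the virtual database: copy the view's current result into a table.
conn.execute("CREATE TABLE customer_materialized AS SELECT * FROM customer_federated")
materialized_count = conn.execute(
    "SELECT COUNT(*) FROM customer_materialized"
).fetchone()[0]
```

The business user only ever queries customer_federated and never needs to know that two systems sit behind it. The last statement shows the physicalized variant, which trades freshness for a local copy.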
Data federation makes sense to me when we want to create something very fast. It also works well for prototyping, but when the data is of poor quality or does not lend itself to integration, physical data stores will be required. None of the articles I have read talk about data quality in the same sentence as data federation or virtual databases, so beware of how it gets used in your organization. This software is not a silver bullet, BUT if you have good quality data that resides in multiple application systems, then use the software to create a "virtual single source" of customer or product data.
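One rough way to judge whether the data "lends itself to integration" is to profile the sources before federating them. The sketch below is only an illustration under simple assumptions: each source is a list of dictionaries, both share a key, and the function name profile_overlap and the sample records are hypothetical.

```python
def profile_overlap(source_a, source_b, key, field):
    """Compare two lists of dict records on a shared key and one shared field."""
    a_by_key = {record[key]: record for record in source_a}
    b_by_key = {record[key]: record for record in source_b}
    shared = a_by_key.keys() & b_by_key.keys()
    return {
        # Keys present in only one system cannot be joined in a federated view.
        "only_in_a": sorted(a_by_key.keys() - b_by_key.keys()),
        "only_in_b": sorted(b_by_key.keys() - a_by_key.keys()),
        # Keys present in both systems whose values for `field` disagree.
        "conflicts": sorted(
            k for k in shared if a_by_key[k].get(field) != b_by_key[k].get(field)
        ),
    }


# Hypothetical sample records from two application systems.
crm = [
    {"id": 1, "email": "ada@example.com"},
    {"id": 2, "email": "grace@example.com"},
    {"id": 3, "email": None},
]
billing = [
    {"id": 1, "email": "ada@example.com"},
    {"id": 2, "email": "g.hopper@example.com"},
    {"id": 4, "email": "alan@example.com"},
]

report = profile_overlap(crm, billing, key="id", field="email")
```

If the lists of unmatched keys and conflicts are long, a federated view will simply pass those problems through to the business user, which is the case where a physical, cleansed data store earns its keep.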
Creating integrated historical data, as a data warehouse does, may not be so easy with this software. Some questions I would ask are:
- Where does the historical data come from? Source systems? I can’t very well make this up on the fly.
- If the historical data was in the source systems to begin with – why are you building a data warehouse?