Data integration methods

There are many ways to do data integration, including:

  1. Extract, transform and load (ETL) – moves and transforms data (with some redundancy) from a source to a target. While ETL can be implemented in something close to real time, it usually runs at intervals (every 15 minutes, 30 minutes, 1 hour, 4 hours, 8 hours or possibly just once a day). Interval-based ETL is used to integrate multiple sources into operational data stores or a data warehouse.
  2. Logical data integration – requires software that will connect to multiple data stores. Rules on what attributes to get from which data stores will be required (there is nothing free in the world of data). I call this semantic metadata, and it actually says “Hey – I know how to get your data.” If you create one place to get data, and connect logically with software, the query or report will only be as fast as your slowest connection!
  3. Data integration as an event-based service – while this sounds really good, it is not so easy to implement. An event can fire when a field in a database changes (think database triggers!) and is then published for consumption by another process. If your corporation has multiple software products that perform the same function, integration will be very difficult with an event-based service. If the consumers of the data require A LOT of attributes, this may not be the way to go.
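To make the second method a little more concrete, here is a minimal Python sketch of logical data integration. Everything in it is an assumption made for illustration: the two in-memory "stores" (`CRM`, `BILLING`), the fetch functions, the simulated delays and the `ATTRIBUTE_MAP` are hypothetical stand-ins, not part of any real product. The map plays the role of the semantic metadata, recording which store supplies which attribute.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for two physical data stores, keyed by customer id.
CRM = {101: {"name": "Acme Corp", "segment": "retail"}}
BILLING = {101: {"balance": 250.0, "currency": "USD"}}


def fetch_crm(customer_id):
    time.sleep(0.05)  # simulated fast connection
    return CRM[customer_id]


def fetch_billing(customer_id):
    time.sleep(0.2)  # simulated slow connection
    return BILLING[customer_id]


# Semantic metadata (assumed shape): which store supplies each attribute.
ATTRIBUTE_MAP = {
    "name": fetch_crm,
    "segment": fetch_crm,
    "balance": fetch_billing,
}


def logical_view(customer_id, attributes):
    """Build one record by querying each required store once, in parallel."""
    fetchers = {ATTRIBUTE_MAP[attr] for attr in attributes}
    with ThreadPoolExecutor() as pool:
        futures = {f: pool.submit(f, customer_id) for f in fetchers}
        rows = {f: future.result() for f, future in futures.items()}
    # Pick each requested attribute from the row returned by its store.
    return {attr: rows[ATTRIBUTE_MAP[attr]][attr] for attr in attributes}


start = time.perf_counter()
record = logical_view(101, ["name", "balance"])
elapsed = time.perf_counter() - start
print(record, f"{elapsed:.2f}s")
```

Even though both stores are queried at the same time, the elapsed time cannot drop below the 0.2-second delay of the slow store, which is the "slowest connection" point in miniature.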

Data integration is the subject of many heated discussions in most organizations. While newer technology sounds like fun, it may not make sense for every corporation. Truly, not every corporation is web-based retail – right?

So, consider these questions when talking about data integration:

  1. What does the consumer really want? What are they doing with the data?
  2. Will this be sustainable over 2 years or 5 years?
  3. Does this create more work for the consumer (versus integrating within a data store)?
  4. Who is going to make changes to this when the data sources change? Or when we add another data source?

SAS is a leader in the Gartner Magic Quadrant for data integration tools for the fifth consecutive year.

About Author

Joyce Norris-Montanari

President of DBTech Solutions, Inc

Joyce Norris-Montanari, CBIP-CDMP, is president of DBTech Solutions, Inc. Joyce advises clients on all aspects of architectural integration, business intelligence and data management, as well as on technology, including tools for ETL, profiling, databases, data quality and metadata. She speaks frequently at data warehouse conferences and is a contributor to several trade publications. She co-authored Data Warehousing and E-Business (Wiley & Sons) with William H. Inmon and others. Joyce has managed and implemented data integrations, data warehouses and operational data stores in industries like education, pharmaceutical, restaurants, telecommunications, government, health care, financial, oil and gas, insurance, research and development and retail. She can be reached at jmontanari@earthlink.net.
