I am constantly faced with decisions on whether to do calculations in reports or store the calculated value in the data warehouse. So, I thought, why not share these thoughts with you?
Pros and cons:
- If I calculate a value in the report, I could be putting undue stress and complexity on my reporting software and hardware platform. I also have to configure the report to do this process.
- If I pre-calculate and store the value in the data warehouse, the ETL (extract, transform, load) must work a bit harder, and carry the burden of data validation and profiling with our data quality toolset.
- Changing the calculation in the report is probably easier than changing the ETL.
- ETL is meant to work hard with our processes and can usually capture metadata.
- ETL can interact with data quality processes and capture any data anomalies.
How I decide:
- If the calculated value is used by any other corporate analysis or reporting tool, then I need to pre-calculate and store in the table.
- If the calculated value requires input from the business user through prompts, then I usually calculate in the report.
No matter which way you choose, you must create metadata to capture this information for future use. I like pre-calculating in the table because I can capture that information in the data model. I also like making the ETL the workhorse as it interacts with my data quality software and captures all that metadata.
What is your preference?
1 Comment
My preference is to pre-calculate and store in the table. Primarily, to reduce any ambiguity if multiple versions of the calculation are created by multiple users in various reports. It provides more of a single source of truth, especially when stored with associated business metadata.