To allow file uploads into your business intelligence environment or not to allow file uploads – what is your decision? I have a tendency to look at this issue like this:
- Someone on the business side needs this data – and probably needs to looking at existing data another way. For example, rolling up sales data to different sales regions or categories.
- The data may not "age" well for integration in the data warehouse. In other words, it may not be needed in two years, because our business changes so quickly.
- The load process requires automated data checking and validation.
Here's how to address these problems to make this a non-issue:
- Understand the needs of your business users, and change the data strategy to meet the business needs (that are always changing).
- Create a table in the staging area to load the file into, and consider this a place to validate and check the data for errors.
- Never integrate this data with subject areas like Customer or Product. But, instead create a data structure that allows you to "pull" data from Customer or Product, as well as the table you created for the uploaded file data.
- Consider profiling the data after it is loaded into the staging area, and generate any error reports. Make sure the data is OK before it is propagated any further in the data warehouse.
- Understand how and why the business users need this data. What if this data drives some important business for the company? Would you consider the upload then?