Guest blogger Khari Villela says data lakes are not a cure-all – they're just one part of a comprehensive, strategic architecture.
In the extended enterprise, data integration challenges abound. David Loshin explains.
Why they will still play a valuable role in organizational data-management and -integration efforts.
The other day I was chatting with an ETL developer and he said he always pushes queries into the database instead of dragging data across the network. I thought “Hmm, I remember talking about those topics when I was a DBA.” I'd like to share those thoughts with you now.
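To make the developer's point concrete, here is a minimal sketch rather than anything from the original conversation. It assumes a hypothetical `orders` table held in an in-memory SQLite database as a stand-in for a remote source, and the function names are made up for illustration. It compares pulling every row to the client with letting the database do the aggregation.

```python
import sqlite3

# Hypothetical in-memory table standing in for a remote source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 10.0), ("west", 5.0), ("east", 2.5), ("west", 1.5)],
)


def totals_client_side(conn):
    """Drag every row to the client, then aggregate in application code."""
    rows = conn.execute("SELECT region, amount FROM orders").fetchall()
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals, len(rows)


def totals_pushed_down(conn):
    """Let the database aggregate; only the summary rows come back."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)


# Both approaches give the same totals; they differ in rows transferred.
client_totals, client_rows = totals_client_side(conn)
pushed_totals, pushed_rows = totals_pushed_down(conn)
```

Both functions return the same totals, but the pushed-down version transfers one row per region instead of one row per order. With a real networked database and millions of rows, that difference is what the developer was getting at.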
There are many ways to do data integration. Those include: Extract, transform and load (ETL) – which moves and transforms data (with some redundancy) from a source to a target. While ETL can be implemented (somewhat) in real time, it is usually executed at intervals (15 minutes, 30 minutes, 1
While not on the same level as Rush, I do fancy myself a fan of The Who. I'm particularly fond of the band's 1973 epic, Quadrophenia. From the track "5:15": Inside outside, leave me alone / Inside outside, nowhere is home / Inside outside, where have I been? The inside-outside distinction is rather apropos
What data do you prepare for analysis? Where does that data come from in the enterprise? Hopefully, by answering these questions, we can understand what is required to supply data for an analytics process. Data preparation is the act of cleansing (or not) the data required to meet the business
The data lake is a great place to take a swim, but is the water clean? My colleague, Matthew Magne, compared big data to the Fire Swamp from The Princess Bride, and it can seem that foreboding. The questions we need to ask are: How was the data transformed and
Working out where Hadoop might fit alongside existing IT architectures, or where it might replace components of them, is a question on the mind of every organization drawn toward the promises of Hadoop. That is the main focus of this blog, along with discussions of some of the reasons they
How many projects have you worked on that forgot to test size and volume, or to conduct load balancing, in a newly converted environment? I have worked on a few of those types of projects. I know in a data warehousing effort, we always check any servers and databases, based on load,
IT folks love SQL (Structured Query Language). Once you know how to program in SQL, you can work with almost any database because it is a standard. However, SQL is NOT a standard for doing analytics. The SAS programming language pre-dates SQL and even though SAS does SQL, SQL does not