In my last post, I pointed out that an uninformed approach to running queries on top of data stored in Hadoop HDFS may lead to unexpected performance degradation for reporting and analysis. The key issue had to do with JOINs in which all the records in one data set needed
Tag: big data
Financial institutions are mired in large pools of historical data across multiple lines of business and systems. However, much of the recent data is being produced externally and is isolated from the decision-making and operational banking processes. The limitations of existing banking systems combined with inward-looking and confined data practices
Small data is akin to algebra; big data is like calculus.
In the movie Big, a 12-year-old boy, after being embarrassed in front of an older girl he was trying to impress by being told he was too short for a carnival ride, puts a coin into an antique arcade fortune teller machine called Zoltar Speaks, makes a wish to be big,
Hadoop is increasingly being adopted as the go-to platform for large-scale data analytics. However, it is still not necessarily clear that Hadoop is always the optimal choice for traditional data warehousing for reporting and analysis, especially in its “out of the box” configuration. That is because Hadoop itself is not
When asked what his movement wanted around a century ago, the iconic American labor leader Samuel Gompers famously gave a one-word answer: "More." This annoyed his opponents at the negotiating table and many in the business community. He was not demanding a specific wage increase or fighting for a distinct cause like
In this blog series, I am exploring if it’s wise to crowdsource data improvement, and if the power of the crowd can enable organizations to incorporate better enterprise data quality practices. In Part 1, I provided a high-level definition of crowdsourcing and explained that while it can be applied to a wide range of projects
Utilizing big data analytics is currently one of the most promising strategies for businesses to gain competitive advantage and ensure future growth. But as we saw with “small data analytics,” the success of “big data analytics” relies heavily on the quality of its source data. In fact, when combining “small” and “big” data
This isn't Kansas anymore. Oz has become a sprawling, smart metropolis filled with sensor data. How do we make sense of, clean, govern and glean value from this big data so we can get Dorothy home? The answer is SAS Data Management. With the latest portfolio updates, customers will be