.@philsimon asks some fundamental questions about taking the next step with #bigdata.
Tag: big data
.@philsimon on whether big data and analytics offer true guarantees.
A soccer fairy tale Imagine it's Soccer Saturday. You've got 10 kids and 10 loads of laundry – along with buried soccer jerseys – that you need to clean before the games begin. Oh, and you have two hours to do this. Fear not! You are a member of an advanced HOA
While it’s obvious that chickens hatch from eggs that were laid by other chickens, what’s less obvious is which came first – the chicken or the egg? This classic conundrum has long puzzled non-scientists and scientists alike. There are almost as many people on Team Chicken as there are on Team
.@philsimon on the specific risks to data quality posed by cloud computing.
Even the most casual observers of the IT space over the last few years are bound to have heard about Hadoop and the advantages it brings. Consider its ability to store data in virtually any format and process it in parallel. Hadoop distributors, such as Hortonworks, can also provide enterprise-level
At a recent TDWI conference, I was strolling the exhibition floor when I noticed an interesting phenomenon. A surprising percentage of the exhibiting vendors fell into one of two product categories. One group was selling cloud-based or hosted data warehousing and/or analytics services. The other group was selling data integration products. Of
I've been doing some investigation into Apache Spark, and I'm particularly intrigued by the concept of the resilient distributed dataset, or RDD. According to the Apache Spark website, an RDD is “a fault-tolerant collection of elements that can be operated on in parallel.” Two aspects of the RDD are particularly
Data quality has always been relative and variable, meaning data quality is relative to a particular business use and can vary by user. Data of sufficient quality for one business use may be insufficient for other business uses, and data considered good by one user may be considered bad by others.
As I explained in Part 1 of this series, creating a strategy for the data in an organization is not a straightforward task. Two of the most important issues you'll want to address in your data strategy are data quality and big data. Data quality There can be no data that is