Data Management
Bill Davis
MapReduce vs. Apache Spark vs. SQL: Your questions answered here and at #StrataHadoop

As the big data era continues to evolve, Hadoop remains the workhorse for distributed computing environments. MapReduce has been the dominant workload in Hadoop, but Spark, thanks to its superior in-memory performance, is seeing rapid adoption. As the Hadoop ecosystem matures, users need the flexibility to use either traditional MapReduce

Data Management
David Loshin
Big data quality with continuations

I've been investigating Apache Spark, and I'm particularly intrigued by the concept of the resilient distributed dataset, or RDD. According to the Apache Spark website, an RDD is “a fault-tolerant collection of elements that can be operated on in parallel.” Two aspects of the RDD are particularly

Customer Intelligence
David Wallace
Everything old CAN be new again!

The inaugural SPARK! Financial Services Executive Summit took an unexpected approach to collaboration to generate unconventional ideas about the future of financial services. The ideas came from enthusiastic and engaged senior executives from across the financial services industry who are all committed to improving the industry’s image in the eyes