The Data Roundtable

Data Management

Matthew Magne, August 18, 2015

What do metadata bridges, HAWQ and gender analysis have in common?

Now that another summer of 12-hour family road trips to Maine and Ohio, pricey engineering and basketball camps for the kids, and beating the heat at the beach is over, I've taken a fresh look at what people are focused on with their data – and what SAS is providing in the data management space.


Data Management

Guido Oswald, August 3, 2015

Data quality on Hadoop: The easy way

Bigger doesn’t always mean better. And that’s often the case with big data. Your data quality (DQ) problem – no denial, please – often only gets worse as your data sets get bigger. Having more unstructured data adds another level of complexity. The need for data quality on Hadoop is shown by user


Data Management

Jim Harris, April 10, 2015

Hadoop is not Beetlejuice

In the 1988 film Beetlejuice, the title character, hilariously portrayed by Michael Keaton, is a bio-exorcist (a ghost capable of scaring the living) hired by a recently deceased couple in an attempt to scare off the new owners of their house. Beetlejuice is summoned by saying his name three times. (Beetlejuice. Beetlejuice. Beetlejuice.) Nowadays


Data Management

Joyce Norris-Montanari, March 31, 2015

Hadoop and big data management: How does it fit in the enterprise?

The other day, I was looking at an enterprise architecture diagram, and it actually showed a connection between the marketing database, the Hadoop server and the data warehouse. My response can be summed up in two ways. First, I was amazed! Second, I was very interested in how this customer uses


Data Management

Matthew Magne, March 23, 2015

EMC and SAS redefine big data analytics with the data lake

Adoption of Hadoop, a low-cost open source platform used for processing and storing massive amounts of data, has exploded by almost 60 percent in the last two years alone, according to Gartner. One primary use case for Hadoop is as a data lake – a vast store of raw, minimally processed data. But, in many ways, because


Data Management

Clark Bradley, March 12, 2015

Provisioning data for advanced analytics in Hadoop

The data lake is a great place to take a swim, but is the water clean? My colleague, Matthew Magne, compared big data to the Fire Swamp from The Princess Bride, and it can seem that foreboding. The questions we need to ask are: How was the data transformed and


Data Management

David Loshin, March 10, 2015

Using Hadoop: Emerging options for improved query performance

In my last two posts, we concluded two things. First, because of the need for broadcasting data across the internal network to enable the complete execution of a JOIN query in Hadoop, there is a potential for performance degradation for JOINs on top of files distributed using HDFS. Second, there are


Data Management

David Loshin, March 3, 2015

Using Hadoop: Query optimization

In my last post, I pointed out that an uninformed approach to running queries on top of data stored in Hadoop HDFS may lead to unexpected performance degradation for reporting and analysis. The key issue had to do with JOINs in which all the records in one data set needed


Data Management

David Loshin, February 19, 2015

Using Hadoop: Impacts of data organization on access latency

Hadoop is increasingly being adopted as the go-to platform for large-scale data analytics. However, it is still not necessarily clear that Hadoop is always the optimal choice for traditional data warehousing for reporting and analysis, especially in its “out of the box” configuration. That is because Hadoop itself is not


Architecture diagram on how SAS works with Hadoop YARN

Bill Davis, November 4, 2014

What do Hadoop superheroes do now that Hallows' Eve has come and gone?

Great works of fiction are filled with dynamic duos. Sherlock Holmes and Dr. Watson. Rosencrantz and Guildenstern. And, of course, superheroes like Batman and Robin. On Thursday, Nov. 5 at 1 p.m. ET, two real-world Hadoop superheroes – Arun C. Murthy, co-founder of Hortonworks, and Paul Kent, vice president of big



Tag: hadoop