Jim Goodnight on the secrets of big data computing

Schabenberger, Guard and Goodnight (l to r) with SAS Visual Analytics in the background.

When SAS CEO Jim Goodnight talks about the development of SAS High-Performance Analytics, he always starts with the customer. After all, it was banking customer UOB in Singapore that first approached Goodnight three years ago about reducing the time it took to calculate risk factors on the bank's full portfolio.

After that conversation, Goodnight came back to SAS headquarters in Cary, NC, and started experimenting with risk calculations. The problem he was addressing for UOB involved analyzing 20,000 risk factors across thousands of possible market states.

“Looking at how many computations had to be done, the rough estimate was about 200 trillion operations,” said Goodnight. Three years ago, chips were running at 2 billion computations per second, so Goodnight knew he wasn’t going to solve the problem on a single processor.
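The arithmetic behind that conclusion can be sketched in a few lines. This is a back-of-envelope estimate only: it uses the two figures quoted above and assumes (an idealization, not something the article states) that the work divides perfectly across machines with no communication overhead. The function name `runtime_seconds` is made up for illustration.

```python
# Back-of-envelope estimate using the figures quoted in the article.
TOTAL_OPERATIONS = 200 * 10**12   # "about 200 trillion operations"
OPS_PER_SECOND = 2 * 10**9        # "2 billion computations per second"


def runtime_seconds(total_ops: int, ops_per_second: int, machines: int = 1) -> float:
    """Idealized runtime, assuming the work splits evenly across machines."""
    return total_ops / (ops_per_second * machines)


single = runtime_seconds(TOTAL_OPERATIONS, OPS_PER_SECOND)
spread = runtime_seconds(TOTAL_OPERATIONS, OPS_PER_SECOND, machines=1000)

print(f"single processor: {single:,.0f} s (~{single / 3600:.1f} hours)")
print(f"1,000 machines:   {spread:,.0f} s")
```

Under these assumptions a single processor would need roughly 100,000 seconds, or more than a day, while the same work spread evenly over 1,000 machines would take on the order of 100 seconds.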

So he gathered 1,000 computers and told each one of them to build 20 rows. “Everything we do in statistics is a row operation. That can be done by taking the row you want to operate on to all the other processors,” said Goodnight. “That’s the secret to how you do big data computing. You simply scatter it out over 1,000 machines.”
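The scatter pattern Goodnight describes might look something like the sketch below. It is not SAS code: `partition` and `process_block` are hypothetical names, the row operation is a placeholder for the real risk calculation, and local threads stand in for the separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

N_ROWS = 20_000     # risk factors, per the article
N_WORKERS = 1_000   # machines, per the article


def partition(n_rows: int, n_workers: int) -> list[range]:
    """Split row indices into contiguous, near-equal blocks, one per worker."""
    base, extra = divmod(n_rows, n_workers)
    blocks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def process_block(rows: range) -> int:
    """Hypothetical row operation; a placeholder for the real risk calculation."""
    return sum(r * r for r in rows)


blocks = partition(N_ROWS, N_WORKERS)

# Assumption: a small local thread pool stands in for the 1,000 machines.
with ThreadPoolExecutor(max_workers=8) as pool:
    partials = list(pool.map(process_block, blocks))

# Combine the per-block results into one answer.
total = sum(partials)
print(len(blocks), len(blocks[0]), total)
```

Each block holds 20 rows, matching the "20 rows per machine" in the story. Because every block is processed independently, the only coordination needed is the final step that combines the partial results.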

Goodnight told this story on stage in Las Vegas last week at Analytics 2012, where he also demonstrated SAS products along with Oliver Schabenberger, lead architect for SAS High-Performance Analytics, and Randy Guard, VP of Product Management.

“SAS has reinvented the way that we view data yet again,” said Goodnight before demonstrating some of the newest developments that R&D is working on for big data and high-performance analytics.

The UOB story is important not only because it shows the thinking behind SAS High-Performance Analytics but also because it shows how customer-driven the development effort has been.

During the presentation, it was clear that the same customer-driven philosophy continues today. Not only have the high-performance development efforts addressed problems that customers have brought to SAS, says Schabenberger, but the development has been done in such a way that customers can still work with SAS in the same ways they always have.

Instead of making customers learn new coding techniques to work with big data, SAS has re-engineered the high-performance products on the back end. “SAS Visual Analytics uses an entire rack of blades and operates on a billion records by allocating a million records to each process,” says Schabenberger. “But you interact with this large in-memory platform the same way that you would with Base SAS, and the results come back to you as if you had executed on your desktop.” And that's doing logistic regression on a billion records, in memory and in parallel.

After re-engineering most SAS procedures for the high-performance environment, Schabenberger’s team is turning their attention to the next set of customer requests, including, “What if I don’t have a billion rows or 48 blades? Can you bring this down in size?”

To that, Schabenberger says, “We went big first. But we can also scale down.” He demonstrated the scaled-down version of the product, which should be ready for release in January 2013. New features for that release will include:

  • The ability to partition data as you load it.
  • A new in-memory statistics procedure.
  • A way to “bookmark” output statements and pass them in-memory to another location.

Guard concluded the demo with real-world examples of SAS Visual Analytics analyzing billions of records on the iPad. One example appropriate for the Vegas venue included a fictitious report of customers with drill-down capabilities to view high rollers. This type of report could be used by a customer service advocate to review purchase histories and preferences for top customers.

Another example showed risk data, including a summary of all capital returns and a view of counterparty exposure via a heat map.

Ultimately, Schabenberger says, “This is not an in-memory database. It’s a very well thought-out plan to deliver analytics as quickly as possible. It doesn't just allow you to do things fast but to do things smart. And it lets you attack problems you could not do before.”

Learn more from SAS "big data" experts in this special 32-page report on high-performance analytics.

About Author

Alison Bolen

Editor of Blogs and Social Content

Alison Bolen is an editor at SAS, where she writes and edits content about analytics and emerging topics. Since starting at SAS in 1999, Alison has edited print publications, Web sites, e-newsletters, customer success stories and blogs. She has a bachelor’s degree in magazine journalism from Ohio University and a master’s degree in technical writing from North Carolina State University.
