We need Hadoop to keep our data costs down


You’ve read the research reports and seen the statistics. You’ve attended the conferences and heard the case studies. You’ve read the online articles and kept up with expert opinions. Your organization has even done a few big data sandbox projects – some successful, some not. Yet the jury is still out: Can Hadoop really keep your data costs down? Some organizations, vendors and big data experts would invariably argue “Yes!” – and would have the stories (definitely) and the data (maybe) to support their position.

But what about you and your organization? The truth is, Hadoop may be able to keep your data costs down, but how are you going to determine that? You need something more than a gut feeling and a wing and a prayer. You need something tangible. You need hard data – not research statistics or anecdotal stories about your competitors. You need numbers based on your organization’s requirements, resources and timelines.

You need the TCOD framework.

What is the TCOD framework? TCOD stands for "total cost of data," and the purpose of this framework is to help organizations estimate the total cost of a big data solution for an analytic data problem. It considers two major platforms for implementing big data analytics – the enterprise data warehouse and Hadoop – and helps you figure out which platform to use for each data problem.

In addition to the expected system costs for each platform, the TCOD framework also considers the cost of using the data over a period of time, typically five years. These usage costs include system and data administration, data integration, and the development of queries, procedural programs and analytic applications.
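To make the arithmetic concrete, here is a minimal sketch of that kind of comparison in Python. It is not WinterCorp's model or spreadsheet: the class name, the field names, the flat annual usage costs, the one-time treatment of system cost and every dollar figure are assumptions made up for illustration. Swap in your own estimates.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class PlatformCosts:
    """Hypothetical cost inputs for one platform, in dollars."""
    system: int            # system cost, simplified here to a single up-front figure
    administration: int    # system and data administration, per year
    data_integration: int  # data integration, per year
    development: int       # queries, procedural programs, analytic apps, per year

    def total_cost_of_data(self, years: int = 5) -> int:
        # Simplifying assumption: usage costs stay flat every year.
        annual_usage = (
            self.administration + self.data_integration + self.development
        )
        return self.system + years * annual_usage


def cheaper_platform(costs: Mapping[str, PlatformCosts], years: int = 5) -> str:
    """Return the name of the platform with the lowest total over the period."""
    return min(costs, key=lambda name: costs[name].total_cost_of_data(years))


# Placeholder figures only; they come from no report or real deployment.
platforms = {
    "data_warehouse": PlatformCosts(
        system=3_000_000,
        administration=200_000,
        data_integration=150_000,
        development=400_000,
    ),
    "hadoop": PlatformCosts(
        system=500_000,
        administration=300_000,
        data_integration=250_000,
        development=900_000,
    ),
}

if __name__ == "__main__":
    for name, cost in platforms.items():
        print(f"{name}: ${cost.total_cost_of_data():,} over five years")
    print("Lower estimate:", cheaper_platform(platforms))
```

With these made-up inputs, the platform with the lower system cost ends up with the higher five-year total because its yearly usage costs are larger. That says nothing about your situation; the point is only that system cost alone does not settle the question.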

The TCOD framework was developed by Richard Winter and his team at WinterCorp, a consultancy focused on large-scale data management challenges. WinterCorp introduced TCOD in a 2013 special report called "Big Data – What Does It Really Cost?" To help illustrate the TCOD framework, the report includes two big data examples: a data refinery and a financial enterprise data warehouse. For each example, it steps you through the process of determining which platform – data warehouse or Hadoop – is more cost-effective over time. The results may surprise you. Keep in mind that the results are just estimates (because a lot of assumptions have to be made), but these estimates trump anecdotal guesses any day.

To accompany the special report, WinterCorp also released a TCOD spreadsheet. It’s an extensive Excel workbook that is well documented and ready to use. So if you’re ready to roll up your sleeves and do the hard work of figuring out what big data really costs, then the TCOD framework is waiting for you.

Download WinterCorp’s TCOD report and spreadsheet here:

Special Report: Big Data – What Does It Really Cost?

Spreadsheet: Big Data – What Does It Really Cost?


About Author

Tamara Dull

Director of Emerging Technologies

I’m the Director of Emerging Technologies on the SAS Best Practices team, a thought leadership organization at SAS. While hot topics like smart homes and self-driving cars keep me giddy, my current focus is on the Internet of Things, blockchain, big data and privacy – the hype, the reality and the journey. I jumped on the technology fast track 30 years ago, starting with Digital Equipment Corporation. Yes, this was before the internet was born and the sci-fi of yesterday became the reality of today.

2 Comments

  1. And it's not just about keeping costs down. The 20x-50x economic advantage of Hadoop over traditional data warehouses means you can store 200TB of data for what it costs you to store 4TB on a traditional DWH. Just imagine the possibilities!!

  2. Pingback: Stop #9 in the Big Data Archipelago journey: the Investment Isle - Customer Analytics
