Stop #3 in the Big Data Archipelago journey: the Integration Isle

“If you build it, he will come.” – From the movie “Field of Dreams”

“Build it and they will come” is a popular quote often attributed to the movie Field of Dreams. But guess what? This quote is not from the movie; it’s actually a misquote. [See the actual quote above.] It’s fascinating how much mileage this misquote has gained over the years—in the media, at conferences, in our business meetings, and even in our social circles.

Truth be told, this quote—right or wrong—has fueled our organizations: Build the data warehouse and they will come. Build the customer data mart and they will come. Build the analytics solution, the self-service BI app, the data visualizations – and they will come. Even though we go to great lengths to expand our platforms, build the applications, and integrate the data and business processes, the reality is that they don’t always come. Cindi Howson, founder of BI Scorecard, has done the research and tells us the same story every year: BI Adoption Flat.

But now that we have big data and can build it into the mix, will they finally come?

Figure 1. The Integration Isle in the Big Data Archipelago

A Big Data Best Practice for Integration

The best practice we like to share here is: Build it on demand using the best tools for the job. With big data and its technologies, we now have more options on what, where, and how we’re going to build our integrated infrastructure. While there are exceptions to every rule, big data and traditional, relational technologies are optimized for different purposes. The goal is to use these solutions for what they were designed to do—or in other words, use the best technology for each business requirement.

A Quick Example: Five Requirements

To illustrate this best practice, let’s look at five common business requirements and identify which technology is better suited to each:

| BUSINESS REQUIREMENT | TRADITIONAL | BIG DATA |
|---|---|---|
| Discovery of unexplored business questions | ++ | ++ |
| Clean, transformed, high-quality aggregated data | ++ | + |
| Low latency, interactive reports, OLAP | ++ | + |
| High volumes of raw, highly granular, unstructured data | | ++ |
| Exploratory analysis of preliminary data | | ++ |

  • Discovery of unexplored business questions. Both technologies are suitable options. Big data technologies, like Hadoop, excel at pattern recognition, which makes “discovery” work very fast in a big data environment. Data warehouses also do discovery work with relational databases and SQL – we’ve been using them this way for years! – but this technology isn’t always optimal for comparisons across and between large and often unstructured data sets.
  • Clean, transformed, high-quality aggregated data. In terms of data quality, many data warehouses already have data quality functions built in. The data used for analytics must be meaningful to business users. In big data environments, however, there could be a reason to provision data in its “raw” or unstructured format. It’s true that data quality can happen inside Hadoop, and more and more vendors are offering solutions, but the market is still young. For now, most companies prefer to make data quality a function of their analytics environment.
  • Low latency, interactive reports, OLAP. Data warehouses have typically been known as the answer for low-latency or interactive reporting. But with new data visualization tools, big data technologies are presenting an interesting new alternative for reporting directly against big data platforms.
  • High volumes of raw, highly granular, unstructured data. Big data technologies are primed to process raw, unstructured data. They not only process it quickly, but they can also store it cheaply and make it available to a range of projects locally. Data warehouses, on the other hand, typically deal with transformed, aggregated, and structured data.
  • Exploratory analysis of preliminary data. This refers to data that might be in the midst of being processed, as in a staging area or sandbox. Or you might simply want to explore the data before loading it into another environment. Hadoop offers a standalone environment for structured and unstructured exploration, whereas with data warehouses, the data modeling, acquisition, cleansing, structuring, and loading of that data will require significantly more resources.

To quickly summarize, for some business requirements, it will make more sense to continue using your traditional data warehouse and analytical system. For others, using big data technologies may be the best and, possibly, only option. And sometimes either approach will work. Use the best solution for the job.
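For readers who like to see this kind of decision written down, here is a minimal Python sketch of the comparison table as a lookup. It is an illustration only: the numeric scores (2 for “++”, 1 for “+”, 0 for no mark) and the names RATINGS and best_fit are assumptions made for this example, not part of any product or method described in this post.

```python
# Illustrative sketch: the comparison table encoded as a simple lookup.
# Assumed scoring (not from the post): "++" = 2, "+" = 1, no mark = 0.
RATINGS = {
    "Discovery of unexplored business questions":
        {"traditional": 2, "big data": 2},
    "Clean, transformed, high-quality aggregated data":
        {"traditional": 2, "big data": 1},
    "Low latency, interactive reports, OLAP":
        {"traditional": 2, "big data": 1},
    "High volumes of raw, highly granular, unstructured data":
        {"traditional": 0, "big data": 2},
    "Exploratory analysis of preliminary data":
        {"traditional": 0, "big data": 2},
}


def best_fit(requirement):
    """Return the technology (or technologies, on a tie) rated highest."""
    scores = RATINGS[requirement]
    top = max(scores.values())
    # Sort the names so that ties come back in a predictable order.
    return sorted(name for name, score in scores.items() if score == top)


if __name__ == "__main__":
    for requirement in RATINGS:
        print(f"{requirement}: {' or '.join(best_fit(requirement))}")
```

A tie, as with the discovery requirement, returns both options, which matches the “sometimes either approach will work” case. In practice the ratings would come from your own business requirements rather than from a fixed table.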

Key Takeaways for Marketers

  • If someone tries to sell you on the idea—“build it and they will come”—run. Or correct the quote.
  • Don’t build it because you can. Build it because it fulfills a business requirement.
  • Be open to using the best technology solution for the job at hand.
  • Tell the Wizard on the island to watch his back.
  • Partner with technical folks – internally and/or externally – who know big data technologies. This journey is not intended for “weekend warriors.”
  • The volcanoes on the island are generating some fireworks. They’re pretty to look at, but don’t get too close. Stay safe and have a happy 4th of July!

Finally, if you would like to explore how marketers can gain deeper insights and more value from big data, take a look at this HBR whitepaper, Customer Intelligence Tames the Big Data Challenge. It's a collection of viewpoints from HBR contributors on that very topic, and it's worth registering to get it.

This is the 3rd post in a 10-post series, “A marketer’s journey through the Big Data Archipelago.” This series explores 10 key best practices for big data and why marketers should care. Our next stop is the Open Source Adoption Isle, where we’ll talk about taking open source seriously for big data platforms.

About Author

Tamara Dull

Director of Emerging Technologies

I’m the Director of Emerging Technologies on the SAS Best Practices team, a thought leadership organization at SAS. While hot topics like smart homes and self-driving cars keep me giddy, my current focus is on the Internet of Things, blockchain, big data and privacy – the hype, the reality and the journey. I jumped on the technology fast track 30 years ago, starting with Digital Equipment Corporation. Yes, this was before the internet was born and the sci-fi of yesterday became the reality of today.

5 Comments

  1. Great key takeaways! I like your point about using the best technology solution... This is worth remembering, and with caution, to ensure project plans and budgets are also adhered to. The map may appear good on paper but the journey may also unravel new territory. :-)

    Cheers,
    Michelle

    • Tamara Dull

      Thank you, Michelle! Even though it's easy to paint this journey as a fun Caribbean vacation, some folks may find themselves in the middle of a Lord of the Rings adventure, complete with Narwhals and underground volcanoes. Seriously, you're spot on with the upfront planning and ongoing management. "We'll just wing it" will yield more broken wings than badges of honor.

  2. Pingback: Stop #3 in the Big Data Archipelago Journey: The Integration Isle | The Cyberista Says

  3. "use the best technology for each business requirement."

    It's easier said than done. There is always going to be new tech to try. New software to upgrade to. But just because it's the latest doesn't mean it is the best for you. Too many companies spend too much money on software that doesn't actually help them do anything, simply because it's the hot new thing.

    • Tamara Dull

      Hi Pat, thank you for your insight. Big data and Hadoop, specifically, are certainly guilty of the "shiny new object" syndrome. One of my big data/technology battle cries is: Don't do "big data" just because you can or just to keep up with the Joneses. Tackle it only if you have a business problem to solve and it will increase revenues, improve the bottom line, and/or improve operational efficiencies. And yes, easier said than done. :)
