ETL and data warehouses are dead! Long live ETL and data warehouses!

A few years ago, I read an article that resonates with me to this day. (Of course, I can't find it today, but here's the gist of it.)

A new CIO (Walter) took the reins at a large multinational organization in February. (Call it Gray Matter Technologies or GMT.) He spent a few months getting the lay of the land. He met with GMT's VPs, department heads, and of course his fellow C-suite colleagues. In the process, Walter discovered that things were a mess. GMT ran on a patchwork of systems, technologies, reporting and ETL tools, and databases.


Hadoop and MDM – Who is the master?

Today, I was in a conversation about using Hadoop (a big data platform) for master data management (MDM). I still find it amazing that we keep having the discussion about which systems should feed which. Many of our friends have spent years successfully building MDM for customer, product and other domains. I'm a true believer that MDM should feed us all (operational data store, data warehouse, etc.).

So answer this: Why would you consider changing this to use Hadoop/big data as the entry point for MDM? Here are some reasons:

  1. Because big data can handle unstructured data, and we can use it for staging our data. It could be faster and cheaper for us.
  2. Our big data platforms can create structured data out of unstructured input. That could save time...right?

So, when asked whether to use Hadoop/big data for MDM – or not – the good consultant's answer is "It depends."

I believe that requirements will drive the design when gathered properly. I also believe that discovery should take place somewhere, and Hadoop/big data may be my answer.

But MDM, no matter where you put the data, still requires the following:

  • Data management disciplines (the three D’s):
    • Data quality.
    • Data integrity.
    • Data governance.
  • A framework that meets ever-changing enterprise needs.
  • Business personnel involvement in the MDM project.

That said, as we move into a new era of faster and cheaper ways of doing business, we need to be open to new technology. I like to think about how that technology fits into my vision of the current enterprise, and where it might go in the future. While it may not be the current “silver bullet,” it may very well end up being the platform that we drive home.

So, when a new project arrives at your doorstep, stop and ask yourself – Where does it fit? How will it be maintained? Is it sustainable? Does it need to be sustainable?




Modernisation is the new normal

In my last post, I talked about how to observe the impact of modernisation through a data quality lens.

I asked you to consider the quality of your legacy data and what that means for the "shiny new toy" you intend to buy in the future.

In this post, I want to talk about how you can design and plan for modernisation with some simple tweaks that won't bust your budget but will save you time (and sanity) when you go through another modernisation cycle.

Which will be sooner than you think.


Four DI modernization mistakes

This month's theme concerns how organizations can modernize their data integration (DI) efforts. As many of the posts demonstrate, the era of big data portends massive opportunity – with the following caveat: An organization is much more likely to succeed if it takes a good, hard look at its current DI efforts. Against that backdrop, here are four DI/modernization mistakes to avoid.


Modernization and data-driven culture – Part 2

Modernization is a term used to describe the necessary evolution of information technologies that organizations rely on to remain competitive in today’s constantly changing business world. New technologies – many designed to better leverage big data – challenge existing data infrastructures and business models. This forces enterprises to modernize their approach to data management and analytics.

The oft-cited secret behind modernization success is a data-driven culture. But nowadays, asking if your organization is data-driven is analogous to asking if your house has indoor plumbing. Since answering no would cause people to question how you do your business, so to speak, everyone claims their organization is data-driven.

In part 1 of this series, I explained that decision-making often reveals how data-driven a corporate culture really is – and I remarked that many organizations don’t use data to drive decisions but instead use data to justify decisions after the fact. Part 2 concludes the series by describing how organizations with a true data-driven culture make decisions.


Data ops: Better way to prepare data for analytics and IoT?

We all find change easier when it starts with something we’re familiar with. That’s why I think sports analytics examples are popular – most of us are sports fans, so we get it more easily. It’s also why automotive examples that illustrate the potential reach of the Internet of Things (IoT) attract an enduring audience. Most of us are drivers and can relate to the benefits illustrated.

At Strata Hadoop in London recently, I had the pleasure of presenting SAS’ perspectives on intelligence for the connected vehicle. It was a lively session with the audience asking questions on a wide range of potential opportunities – from increasing safety, reducing risk and predictive maintenance to achieving loyalty and retention and real-time value adds like parking availability, charging stations and connected retail options.

While there were the usual questions about data sources, integration and algorithm design, there was also significant interest in data operations (data ops) – that is, the maintenance of data sources, preparation, quality and governance.


A modernized approach to data lake management

In my last post, I started to look at the use of Hadoop in general and the data lake concept in particular as part of a plan for modernizing the data environment. There are surely benefits to the data lake, especially when it's deployed using a low-cost, scalable hardware platform. The significant issue we began to explore is this: the more prolific you become at loading data into the data lake, the greater the chance that entropy will overtake any attempt at proactive management.

Let's presume that you plan to migrate all corporate data to the data lake. And the idea of the data lake is to provide a resting place for raw data in its native format until it's needed. Now, let’s imagine what you need to know when you decide that the data truly is needed:

  • You need to know that the data exists.
  • You need to know that the data is in a file in the data lake.
  • You need to know which of the files contain the data you need.
  • You need to know the format, if any, of the data.
  • You need to know details about the storage layout.
  • You need to know what other information is in the file.
  • You need to know the security and protection characteristics for the data.
  • You might want to know who created the file and when it was added to the data lake.

In other words, you need to know a lot about that data. And here is the most confusing part: you may not even know which data is the data you want! That is part of the promise of the data lake – data is kept around until someone needs it, and it's up to the data consumer to determine what data they need, when they need it.
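
One way to picture the checklist above is as a small catalog record kept for every file loaded into the lake. The following is only a minimal sketch under stated assumptions: the names LakeFileEntry and LakeCatalog, and every field in them, are hypothetical illustrations and do not come from Hadoop or any particular data lake product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class LakeFileEntry:
    """Hypothetical catalog record describing one file in the data lake."""
    path: str                    # which file in the lake holds the data
    datasets: List[str]          # everything stored in the file, wanted or not
    data_format: Optional[str]   # e.g. "csv"; None when the format is unknown
    storage_layout: str          # details such as partitioning or compression
    security_label: str          # security and protection characteristics
    created_by: Optional[str] = None     # who created the file, if known
    added_at: Optional[datetime] = None  # when it was added to the lake


class LakeCatalog:
    """Hypothetical in-memory index that answers 'where is the data I need?'"""

    def __init__(self) -> None:
        self._entries: Dict[str, LakeFileEntry] = {}

    def register(self, entry: LakeFileEntry) -> None:
        # Record the metadata at load time, keyed by file path.
        self._entries[entry.path] = entry

    def find(self, dataset: str) -> List[LakeFileEntry]:
        # Return every registered file that contains the named data set.
        return [e for e in self._entries.values() if dataset in e.datasets]


catalog = LakeCatalog()
catalog.register(LakeFileEntry(
    path="/lake/raw/crm_export.csv",
    datasets=["customer", "contact_history"],
    data_format="csv",
    storage_layout="single file, uncompressed",
    security_label="contains personal data",
    created_by="crm_batch_job",
    added_at=datetime(2016, 6, 1),
))
matches = catalog.find("customer")
```

The design choice worth noting is that the record is written when the file is loaded, not when someone finally needs the data; a file that was never registered simply cannot be found through the catalog.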


Modernization and data-driven culture – Part 1

Modernization is a term used to describe the necessary evolution of information technologies that organizations rely on to remain competitive in today’s constantly changing business world. New technologies – many designed to better leverage big data – challenge existing data infrastructures and business models. This forces enterprises to modernize their approach to data management and analytics.

The oft-cited secret behind modernization success is a data-driven culture. But nowadays, asking if your organization is data-driven is analogous to asking if your house has indoor plumbing. Since answering no would cause people to question how you do your business, so to speak, everyone claims their organization is data-driven.

In this two-part blog series, I'll examine the characteristics of two types of data-driven cultures I've most often encountered in my consultation work.


Modernization starts at the top

I often rail against the state of most organizations' data management practices. Four things inform my opinion:



MDM and Hadoop – Part 1

How many companies are using Hadoop as part of their master data management initiative? Come on, raise your hands! Well, maybe a better question is this: How many companies are using Hadoop for enterprise data?

From what I have seen, Hadoop is coming along quite nicely. However, it may not be the current technological “silver bullet.” I continually urge my clients to define the uses for Hadoop. I ask questions like this:

  • Is this project considered an enterprise-specific report that requires publication to external customers?
    • If it is, you may want to stay with structured, protected and guaranteed data.
  • Is this project analytical in nature, and is all the data available in Hadoop?
    • Then Hadoop may be appropriate for this.
  • Should Hadoop be my store of data used for master data management?
    • That depends.
