In a tiered approach to facilitating master data integration, the most critical step is paving a path for applications to seamlessly take advantage of the capabilities master data management (MDM) is intended to provide. Those capabilities include unique identification, access to the unified views of entities, the creation of new entity records, matching and duplicate analysis, and reviewing and managing cross-entity relationships.
Since I shared the details of these usage scenarios and core functions in last month’s series, I can focus this note on the thought processes driving the determination of which specific services are needed and where those services are invoked.
As we recall, the conventional approach to MDM has been to focus on data consolidation first, followed by the development and availability of a broad array of (generally nondescript) access and update services (succinctly, “get” and “put” routines). There is a need for these types of services, but from the application’s perspective, accessing a master record and potentially updating one or more attributes is not only too granular, it also creates a significant burden for any application owner with a desire to use the master data repositories.
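To make the granularity point concrete, here is a minimal sketch in Python contrasting raw “get” and “put” routines with a coarser, business-oriented service. All names and structures here are hypothetical illustrations, not any particular MDM product’s API:

```python
from dataclasses import dataclass, field

@dataclass
class MasterRecord:
    entity_id: str
    attributes: dict = field(default_factory=dict)

class MasterDataService:
    """Hypothetical sketch; names are illustrative, not a real product's API."""

    def __init__(self):
        self._store: dict[str, MasterRecord] = {}

    # Granular "get"/"put" routines: the caller fetches the whole record,
    # must understand its structure, modifies it, and writes it back.
    def get_master_record(self, entity_id: str) -> MasterRecord:
        return self._store[entity_id]

    def put_master_record(self, record: MasterRecord) -> None:
        self._store[record.entity_id] = record

    # Coarser, business-oriented service: the application states its intent,
    # while matching, survivorship and update logic stay inside the MDM layer.
    def update_customer_contact(self, entity_id: str, email: str) -> None:
        record = self._store[entity_id]
        record.attributes["email"] = email  # survivorship rules would apply here

service = MasterDataService()
service.put_master_record(MasterRecord("cust-001", {"name": "Pat"}))
service.update_customer_contact("cust-001", email="pat@example.com")
```

With the coarser service, the application owner never has to learn the master record’s structure or the rules behind the update – which is precisely the burden the granular routines impose.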
In a previous blog, I wrote about the top ten fallacies of why data governance is perceived to be too burdensome and costly. Hopefully I dispelled the preconception that data governance is slower and less nimble than today’s informal data management practices. In this post, we’ll examine the concept of governance as an “invisible hand.”
The invisible hand of the market is a metaphor conceived by Adam Smith to describe the self-regulating behavior of the marketplace through the collective action of individuals to maximize their individual gains. In a similar way, well-designed data governance can serve as an invisible hand on data sourcing, quality and delivery. Many constituents stand to gain from better documentation and data standards. The key difference here is that transparency – i.e., the ability of all end users to understand the process and take accountability for data decisions – should actually facilitate the program’s invisibility. So the balance of transparency with invisibility should be the end result of efficient data governance.
Let’s think about good civic governance in day-to-day life – traffic lights, stop signs, snow removal and public safety. Most of us take these services for granted and assume they will happen as needed or on a regularly scheduled basis. Without commotion, the city council will debate and resolve issues put on the agenda. Policy is developed that most of us know nothing about – nor do we want to.
In 1964, when the American radio astronomers Arno Penzias and Robert Wilson were setting up a new radio telescope at AT&T Bell Labs, they decided to point it towards deep space, where they expected a silent signal that could be used to calibrate their equipment. Instead of silence, however, what they heard was a persistent noise, a seemingly meaningless background static that they initially mistook for an indication that their telescope was faulty equipment in need of repair.
For almost a year, they operated under this assumption. At one point, they pondered whether the cause of the static might be the excessive amount of pigeon poop accumulating on their telescope. But even after spending a month meticulously cleaning it, when they pointed the telescope towards deep space, once again they heard the same persistent noise. (At which point, although it is not included in the official scientific record, I like to imagine that much stronger language than “poop” was uttered.)
However, after analyzing what they initially thought was the crappiest possible data produced by a broken telescope, they challenged their own assumptions. By doing so, they discovered data of the highest possible quality. In a classic example of mistaking signal for noise, it revealed one of the greatest scientific breakthroughs of twentieth-century physics.
Arno Penzias and Robert Wilson won the 1978 Nobel Prize in Physics for discovering what’s now known as cosmic microwave background radiation. In other words, in the big data raining down from Big Sky, they managed to hear the remnants of the Big Bang. Penzias and Wilson helped the Big Bang Theory defeat its primary rival, the Steady State Theory, as the prevailing scientific model of the universe.
In my last series of posts, we looked at one of the most common issues with master data management (MDM) implementation, namely integrating existing applications with a newly-populated master data repository. We examined some common use cases for master data and speculated about the key performance dimensions relevant to those use cases, such as the volume of master data transactions, the need to maintain a respectable response time and synchronization challenges for ensuring data currency.
Our conclusion was that we could develop a services model that deployed the different types of common master data functionality, such as entity search, retrieval of an entity’s master record or the management of entity relationships. There is still a need to develop the services layers that comprise the operating model, and this means envisioning the architectural decisions to be made that can ensure that the levels of MDM performance satisfy the expectations of the community of master data consumers.
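As a rough illustration of what such a services layer might expose, here is a minimal sketch; the interface and method names are assumptions for illustration, not a prescribed design:

```python
from abc import ABC, abstractmethod

class MasterDataServices(ABC):
    """Hypothetical services layer; each method carries its own performance
    expectations (volume, load, response time) that the underlying
    architecture must satisfy."""

    @abstractmethod
    def search_entities(self, entity_type: str, criteria: dict) -> list[str]:
        """Entity search: return identifiers of candidate matching entities."""

    @abstractmethod
    def get_master_record(self, entity_type: str, entity_id: str) -> dict:
        """Retrieve the unified master record for a single entity."""

    @abstractmethod
    def get_relationships(self, entity_id: str) -> list[tuple[str, str]]:
        """Return cross-entity relationships as (related_id, relationship_type) pairs."""
```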
The reason this can be somewhat challenging, especially in a large enterprise, is that the de facto architecture of the environment may pose a variety of constraints that prevent compliance with the expected levels of performance. For example, a customer master repository may have been instantiated on a standalone server with limited network accessibility. At some point, an online application may want to directly search the customer master to identify known customers and access their behavior profiles. This may imply thousands (if not orders of magnitude more) of simultaneous users banging up against the master index and repository in a way that can’t be handled by the standalone server.
One of the biggest impediments to (and failures of) a new data governance program is the perceived level of “extras” required. Let’s enumerate some of the concerns that I hear consistently from our clients:
- Extra people will be required to staff the implementation.
- Extra budget money will be needed to fund the project.
- Extra time will slow critical projects.
- Additional documentation will be burdensome.
- Too many people need to be involved in decision making.
- Business ROI cannot be justified.
- There will be additional levels of hierarchy and bureaucracy.
- Consensus will not be achievable.
- Accountability will involve blame.
- Extra work will be required for everyone involved.
When designing a data governance program, it is critical to anticipate these concerns, as they may reflect past experience in failed governance attempts. Unless these issues are addressed, another "extra" activity will include ongoing efforts to convince participants and end users of the true benefits of a well-executed data governance program.
First let’s examine where these concerns arise. Any discussion of new processes or standards will automatically raise concerns about additional work. But let’s consider some of the main drivers for data governance:
One of the big problems with data migration projects is that, to the outside sponsor, they appear very much like a black box.
You may be told that lots of activity and hustle is taking place, but there isn’t a great deal to show for it until the final make-or-break migration execution, when the data finally moves across to the target environment.
The problem is made worse to an extent by the fact that, as data migration practitioners, we openly demand that data migration initiatives be delivered with a separate team, budget and working environment. This often cuts us off slightly from the main system implementation project.
Sponsors need to see tangible results, particularly on large-scale migrations that can take many months or even years. They need to report to their peers and seniors about the state of the migration and how it will (or won’t) be dovetailing into the target implementation.
So how can you show progress when you can’t physically migrate the data until the final go-live migration date?
One way is to change the actual strategy of your migration and opt for a more agile, iterative approach. Instead of migrating data across in one single 365-day drop, create multiple drops of, say, 90 days each, where you move tangible portions of data. This could be by region, data subject area, customer type – anything that is relevant to the migration. Another tactic I’ve used in the past is to migrate just the "backbone of the data" – the primary and foreign keys, for example. The reason for this is that if you hit problems with these, then you find out much earlier in the lifecycle.
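As a rough sketch of that "backbone" tactic, the following uses SQLite purely for illustration (all table and column names are hypothetical) to migrate only the key columns in an early drop, so that orphaned foreign keys surface long before the full-attribute migration:

```python
import sqlite3

# Hypothetical source with full attributes; the early drop ignores everything
# except the primary and foreign keys.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Acme', 'EMEA');
    INSERT INTO orders VALUES (10, 1, 99.0), (11, 2, 45.0);  -- order 11 has an orphan key
""")

# Target holds only the backbone for this drop, with key integrity enforced.
target = sqlite3.connect(":memory:")
target.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                         FOREIGN KEY (customer_id) REFERENCES customer (customer_id));
""")
target.execute("PRAGMA foreign_keys = ON")

for (customer_id,) in source.execute("SELECT customer_id FROM customer"):
    target.execute("INSERT INTO customer (customer_id) VALUES (?)", (customer_id,))

for order_id, customer_id in source.execute("SELECT order_id, customer_id FROM orders"):
    try:
        target.execute("INSERT INTO orders VALUES (?, ?)", (order_id, customer_id))
    except sqlite3.IntegrityError:
        # Found early, in drop one - not in month eleven of a 365-day migration.
        print(f"Orphan foreign key: order {order_id} -> customer {customer_id}")
```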
"Technology is neither good nor bad; nor is it neutral."
The quote above is my favorite of Kranzberg's six laws of technology. The law applies to everything from typewriters to tablets. Think of it as Moore's Law sans limits. I doubt that Kranzberg was a heavy-metal fan, but his words ring true for us data-loving headbangers.
Consider how British metal legends Iron Maiden responded after seeing a spike in illegal BitTorrent downloads in South America. Maiden took action – but not the legal kind. From a recent BoingBoing article:
Rather than send in the lawyers, Maiden sent itself in. The band has focused extensively on South American tours in recent years, one of which was filmed for the documentary "Flight 666." After all, fans can't download a concert or t-shirts. The result was massive sellouts. The São Paulo show alone grossed £1.58 million (US$2.58 million).
Note that fighting your own fans is fraught with peril for public perception. Just ask Metallica's Lars Ulrich.
For more than 15 years now, most intelligent folks have understood the futility of attempting to prevent illegal music downloads. Record-industry lawsuits may have engendered the demise of Napster, but they have done very little to curb pervasive MP3 sharing. Current estimates put the illegal-to-legal download ratio at 25:1.
It's a tough time to be an emerging musician, but that's neither here nor there. Maiden is one of the world's most iconic bands. Its mascot, Eddie, is far more recognizable than his contemporaries on many U.S. sports teams. Judging by its decision to listen to new sources of data, it is also a very intelligently run band.
In previous posts, I pondered how problem solving is evolving to become data-driven through our increasing reliance on algorithms, which some mistrust as a signal that we’re shifting from human to artificial intelligence (AI).
Would you like to play a game?
“Slowly but surely,” John MacCormick explained in his book Nine Algorithms That Changed the Future, “AI has been chipping away at the collection of thought processes that might be defined as uniquely human. For years many believed that the intuition and insight of human chess champions would beat any computer program, which must necessarily rely on a deterministic set of rules rather than intuition. Yet this apparent stumbling block for AI was convincingly eradicated in 1997, when IBM’s Deep Blue computer beat world champion Garry Kasparov.”
Another AI stumbling block was hurdled in 2011, when IBM’s Watson won Jeopardy! by defeating Brad Rutter and Ken Jennings, two of the television quiz show’s most celebrated champions. “I for one welcome our new computer overlords,” Jennings joked afterward. In his TED talk video Watson, Jeopardy and me, the obsolete know-it-all, Jennings talked about how it felt to have a computer literally beat him at his own game, and also made his case for the value of good old-fashioned human knowledge.
“Meanwhile,” MacCormick continued, “the success stories of AI were gradually creeping into the lives of ordinary people too. Automated telephone systems, servicing customers through speech recognition, became the norm. Computer-controlled opponents in video games began to exhibit human-like strategies, even including personality traits and foibles. Online services such as Amazon and Netflix began to recommend items based on automatically inferred individual preferences, often with surprisingly pleasing results.”
Over the past few posts we've looked at developing an integration strategy to enable the rapid alignment of candidate business processes with the services provided by a master data environment. As part of a preparatory step, it is valuable, at the very least, to understand the implementation requirements to meet the business needs. The next level of preparation would be to have design templates for each of the necessary services associated with the underlying system components to ensure the levels of service and performance can be met.
For example, presume an online process that needs to perform an identity search each time a customer attempts to create a new account. The performance criteria are defined in terms of data volume (how much data needs to be shared for the identity searches), simultaneous load (how many search requests at the same time) and response time (how fast must the system provide an answer). Having templates for implementation that can be tuned to meet the desired performance will speed integration with the MDM environment.
You can anticipate those implementation details by evaluating the potential development needs, designing the templates for implementation and even implementing the various services on platforms that can meet a reasonable combination of requirements. Provide a matrix that maps each of the master data service usage scenarios to the desired performance characteristics. An example is shown in Table 1.
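As an illustrative sketch of the shape such a matrix might take (with entirely hypothetical figures – this is not the article's Table 1), the mapping could be captured as:

```python
# Hypothetical scenario-to-performance matrix; every figure below is a
# placeholder to show the structure, not a recommended target.
service_performance_matrix = {
    "identity_search": {
        "data_volume": "full customer index, ~10M records",
        "simultaneous_load": "2,000 concurrent requests",
        "response_time": "< 500 ms",
    },
    "master_record_retrieval": {
        "data_volume": "single record per call",
        "simultaneous_load": "5,000 concurrent requests",
        "response_time": "< 200 ms",
    },
    "relationship_management": {
        "data_volume": "entity neighborhood, tens of records",
        "simultaneous_load": "200 concurrent requests",
        "response_time": "< 1 s",
    },
}
```

Each row can then drive the choice of implementation template and platform sizing for the corresponding service.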