Ye Olde information overload

“There’s no such thing as information overload – there is only filter failure.”  ~ Internet scholar Clay Shirky

Information overload is not just a recent phenomenon; it entered human experience in the middle of the 15th century with Gutenberg and his printing press, and we’ve been devising ways to cope ever since.  Today, more books are printed in a month than could be read in a lifetime.

And that’s just books – every day we create approximately 3 exabytes of data (that’s 3 million terabytes for those of you keeping score using last year’s counting system).  Every second, 3 million emails are sent, 50,000 Tweets are tweeted, and 2 hours of cat videos are uploaded.  Every.  Single.  Second.  (In the time it took to read this far, you just missed 3,000 cat videos.)  If you’ve only got 100 unread emails, you’re still an amateur.

So how do we cope?  We do what nature does – we filter.


Good habits for big data

“Begin with the end in mind” - Habit #2 from Stephen Covey’s ‘The 7 Habits of Highly Effective People’.

The Edge Foundation is based on this premise: “To arrive at the edge of the world's knowledge, seek out the most complex and sophisticated minds, put them in a room together, and have them ask each other the questions they are asking themselves.”  Each year they pose to their illustrious contributors their annual “Edge Question”, after which John Brockman, editor and publisher of Edge, gathers the various responses and publishes them in book form.  The question for 2014 was “What scientific idea is ready for retirement?”, with the replies recently published as “This Idea Must Die”.

The nomination from Gary Marcus, cognitive scientist at NYU, for an idea whose time has come was “big data” (Already? We hardly got to know you, big data).  Marcus’ argument wasn’t that big data has become unnecessary, but that it has quickly become a case of putting the cart before the horse.  Data has its place, but that place is AFTER you have formulated a hypothesis or theory about a problem you are trying to address.  With a theory in place, you then devise an experiment to test that hypothesis.  The most important property of the data at this stage is that it be relevant to the problem / experiment at hand, and if so, then the more the merrier.  But if not, well, as Marcus puts it, “Big data should not be the first port of call; it should be where we go once we know what we’re looking for”.

This ‘starting with the end in mind’ approach serves to highlight the key factor that will drive the effectiveness of your data scientists and business analysts – the right data management tools.  You want your data scientists and analysts spending the bulk of their time and effort on developing and working out the details in those hypotheses, theories and models, not in data collection and preparation.  Analytics uses data in a format quite different from that found in the EDW, where storage efficiency is paramount.  The right data management tools can cut the proportion of time spent on data prep in half or more from the 80% typically seen, freeing your valuable analytic resources to focus on solving the important business problems.

I have my own related anecdotal evidence on this subject, drawn from my experiences as a conference chair – a story I call “The Big Honkin’ Data Cube”.

As a conference chair, one of your primary concerns is keeping to the schedule.  If each speaker has 30 minutes, and you’ve allotted the last five minutes for Q&A, you would expect them to be getting around to their ‘summary / key takeaways’ slide by around the 22-minute mark, which means the crucial “So What” moment should occur about 16-18 minutes into the presentation.

There is, however, one category of presentation that puts me into panic mode at least once a conference, the presentation where the user or vendor talks about how they implemented their data warehouse.  I keep waiting, and waiting, and waiting for the punchline, when with just four minutes left for Q&A they announce that they are finished and ready for questions.

As I look out over the audience, I see two different sets of facial expressions.  Half of the audience, the IT segment, has that look of satisfaction - they got their roadmap and tips & tricks and lessons learned and it was time well spent.  The other set of faces, the business users, has the look of, “What – did something just happen?  What did we miss?”

It took me the longest time to figure this phenomenon out.  The conclusion I came to was that the primary focus of the IT-oriented presenter was on the construction and implementation of the data cube itself, and for them the job was complete at that point - any related business case for the EDW was taken for granted and left unstated.  The business users, however, were left at the altar, waiting in vain for the presenter to connect the dots to the business issue(s) this data monster was built to address.  Their expectation for the “So What” moment ran more along the lines of: What is this data warehouse being used for, and who are the intended users?  Marketing, HR, quality, R&D, forecasting and planning?

The two audience segments approach the business arena from opposite ends, with one working from the data towards the business problem, and the other from the business problem backwards to the data.  I find myself, along with Stephen Covey, in that latter group, who are typically the ones to initiate the Q&A with a question about the primary business case(s) attached to the EDW implementation.

I don’t want to leave the conversation here, though, as this conclusion entirely contradicts my contention last week in “Big Variety” that the value in big data lies in the connections, correlations, networking, explorations and insights that can be gleaned from both its variety and its bigness.  Last week’s assertion was that big data / Big Variety / the EDW is most definitely the place to start to discover your Unknown-Unknowns.

In the end I think there is value in approaching that Big Honkin’ Data Cube from both directions:  Getting the right data to tell the story / address the business problem, but also in interrogating that data for the many interesting stories hidden within.   When it comes to big data, good habits can start at either end.


Big Variety: The real value in Big Data

Forget about Big Volume, for my money the real value in Big Data comes from its variety.  Why? Because just as there is “Value in the Network” when it comes to your business ecosystem, your data can be "networked" for value in much the same way.

Before we dive into the business implications of Big Variety, consider this case from the natural sciences – the discovery, development and eventual acceptance of plate tectonics.  First proposed as the theory of Continental Drift by Alfred Wegener in 1912, it was not fully accepted until the 1960s, on the strength of overwhelming data-driven evidence acquired across a wide variety of fields:

  • Geography – As early as 1596 the Dutch cartographer Abraham Ortelius noted the remarkable fit of the South American and African continents, and even suggested that “the Americas were torn away from Europe and Africa . . . by earthquakes and floods”.
  • Geology – In direct support of his theory, Wegener remarked that the locations of several unusual geologic structures could be found on matching coastlines of South America and Africa.
  • Paleontology – Snider-Pellegrini noted that the locations of certain fossil plants and animals on present-day, widely separated continents form definite patterns (shown by the bands of colors on the above map).  Wegener further highlighted that the discovery of fossils of tropical plants (in the form of coal deposits) in Antarctica led to the conclusion that this frozen land must previously have been situated closer to the equator.  Other mismatches of geology and climate included the occurrence of glacial deposits in present-day arid Africa.
  • Bathymetry – Not only did post-WWII sonar reveal the extent of the Mid-Atlantic Ridge and the global mid-ocean ridge system circling the planet beneath the ocean surface, but detailed analysis of the data indicated a narrow gorge precisely along the center crest of the mountain range where the plates were separating.
  • Oceanography – Seafloor mapping using magnetic instruments revealed a pattern of alternating magnetic striping on either side of the mid-ocean ridge.
  • Seismology – Earthquake-recording instruments enabled scientists to learn that earthquakes tend to be concentrated along the oceanic trenches and spreading ridges.

It took putting all this data from these disparate fields together to develop a plausible mechanism for what came to be known as plate tectonics, and thus finally vindicate Wegener.  Big Variety preceded Big Volume by half a century.

Getting back to the application of Big Variety to the business environment, let’s start with what’s commonly called the networking problem.  Connecting two people requires one connection.  Connecting three people requires three connections, four people requires six, five requires ten, twelve requires 66, and 100 requires 4,950.  As you would expect, there is a formula for this, the Triangular Number formula: N(N-1)/2.
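A quick sketch of that growth in Python (the function name is mine, purely for illustration):

```python
def connections(n):
    """Pairwise connections among n people: the triangular number n(n-1)/2."""
    return n * (n - 1) // 2

for n in (2, 3, 4, 5, 12, 100):
    print(n, connections(n))
```

Run it and you get exactly the sequence above: 1, 3, 6, 10, 66 and 4,950 connections.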

One more formula: N!/(K!(N-K)!).  Combinations.  How many different combinations of three can I make from a set of five?  That would be ten, where N=5 and K=3 in the above formula.  From a set of ten, the number of combinations of three jumps to 120.  Combinations of four, instead of just three, from that same set of ten is nearly double that: 210.  And if I have a set of 100 to choose from instead of just ten, then my combinations of four becomes an incredible 3,921,225!  (whew, that got out of hand fast …)
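Python’s standard library will happily confirm those counts (`math.comb` implements the factorial formula for us):

```python
from math import comb  # comb(n, k) = n! / (k! * (n - k)!)

print(comb(5, 3))    # combinations of three from a set of five: 10
print(comb(10, 3))   # 120
print(comb(10, 4))   # 210
print(comb(100, 4))  # 3,921,225
```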

Is this a problem?  Or an opportunity?

Sorry – rhetorical question – IT’S AN OPPORTUNITY!  It’s an opportunity for insight, unparalleled insight into your business.

Let me walk you through a generalized example of exactly how easily and quickly SAS Visual Analytics allows you to draw insights from your Big Variety.  The place to start would be with a Correlation Matrix (above), a pairwise evaluation of attributes which displays the degree of correlation between measures as a series of colored rectangles. The color of each rectangle indicates the strength of the correlation, with the dark blue in this example representing strong correlation.

This simple matrix, where N=6 and K=2, has 15 pairwise combinations.  A typical functional or departmental analysis of your business – say for customer service, production or logistics – might have 20 elements for analysis, yielding 190 pairwise combinations, covering perhaps a decade’s worth of data and millions or even hundreds of millions of rows, all of which SAS Visual Analytics would process in just a few seconds.
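The matrix itself is straightforward to compute.  Here’s a rough open-source analogue in Python with NumPy – synthetic data with one injected relationship, purely for illustration of the pairwise computation that SAS Visual Analytics performs interactively:

```python
import numpy as np

rng = np.random.default_rng(0)
# Six hypothetical measures as columns, e.g. failure rate, time since
# last maintenance, temperature, load, shift length, operator tenure.
data = rng.normal(size=(10_000, 6))
# Inject a strong relationship between the last two measures so that
# one cell of the matrix lights up "dark blue".
data[:, 5] = 0.9 * data[:, 4] + rng.normal(scale=0.3, size=10_000)

corr = np.corrcoef(data, rowvar=False)  # 6x6 pairwise correlation matrix
print(corr.shape)                       # 15 distinct pairs off the diagonal
print(round(float(corr[4, 5]), 2))      # the strongly correlated pair
```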

You then might grab the two pairwise elements highlighted in that dark blue box in the bottom right – say, perhaps, ‘failure rates’ and ‘time since last maintenance’ – throw in ‘facility’, ‘equipment type’ and ‘year’, and take a look at what a bubble plot might reveal, where the axes, color and size all represent different attributes, and which can be set in motion over time.

Lastly, once you’ve narrowed down your suspect list to perhaps three categories of high-failure equipment, a box plot might be useful to compare typical failure rates, with the box representing the 25th to the 75th percentiles, and the whisker bars showing the extreme outliers.
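Those box boundaries are just percentiles, which are cheap to compute.  A Python sketch with synthetic failure-rate data (the numbers and the 1.5-IQR outlier convention are illustrative, not anything prescribed by the tool):

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical monthly failure rates (%) for one equipment category,
# drawn from a right-skewed distribution.
failure_rates = rng.gamma(shape=2.0, scale=1.5, size=500)

q1, median, q3 = np.percentile(failure_rates, [25, 50, 75])
print(round(float(q1), 2), round(float(median), 2), round(float(q3), 2))

# One common convention flags points beyond 1.5 IQRs above the box
# as the "extreme outliers" that show up past the whiskers.
iqr = q3 - q1
outliers = failure_rates[failure_rates > q3 + 1.5 * iqr]
print(len(outliers))
```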

If the data is available, that entire process would take only a couple of minutes.  And these are just three illustrative examples of hundreds of exploratory techniques and analytical tools available in SAS Visual Analytics.  Rinse and repeat for thousands of possible insights for action across your organization:  Marketing, Sales, Customer Service, Quality, Distribution, Human Resources, Innovation, Procurement, and of course, Operations.

The value from data integration and networking multiplies exponentially with each additional data source.  Big Variety is the future of Big Data.


Relationship status - Connected; Analytics for Agency

“When it comes to the Internet of Things, the future clearly belongs to the Things”. I made this brash statement in a previous post (“Cloud encounters of the Fifth Kind”) referring to machine-to-machine (M2M) being the fastest growing component of non-human traffic on the Web. I say “brash” because that sweeping generalization overlooked one other factor – the human factor.

For those on the technology / data / IT / manufacturing / device side of the IoT, the conversation is in fact typically about the THINGS.  My colleagues from the services, financial, media, telecom, retail and health care sectors, however, largely couldn’t care less about such THINGS - what’s important to them is the Customer.  From their perspective it’s not primarily about smart devices or connected cars – it’s the connected customer / consumer that matters.

What does the connected consumer want out of being connected?  A.T. Kearney identifies four motivations in this study, “Connected Consumers are Not Created Equal”:

  • Interpersonal connection (e.g. Facebook, texting)
  • Self-expression (e.g. this blog, Tweets)
  • Exploration (e.g. curiosity, education, researching purchase options)
  • Convenience (e.g. email, weather updates, ecommerce)

It’s primarily through that last one, convenience, that the internet of devices and things intersects with both the internet of the human consumer and with the business models for providing value over the IoT.  Breaking convenience down a bit further, based on this infographic from AdWeek (“Why the 'Internet of Things' Hasn't Really Caught On Yet”), we find these functions to be in the highest demand:

  • Remote access (e.g. warm up my car, my house, my coffee)
  • Predictive analytics (You will run out of cereal in 13.7 hours – better pick some up on the way home)
  • ‘Push’ notifications (e.g. Don’t forget tonight’s game, and don’t forget to pick up the kids first)
  • Data aggregation and analysis (No need to turn on the evening news - here are your top five stories for the day)
  • Personalized recommendations (e.g. Put that shirt right back where you found it - this one goes much better with what you’ve got in your closet.)

Take these five functions to the next level and you get “agency” – your device attaining real smartness by learning your preferences and behaviors from past history and then taking action on your behalf without your direct involvement.

Getting to agency will require some powerful analytics operating behind the scenes, such as:

  • Machine learning: The algorithms and automation behind the artificial intelligence that drives analytic models which learn from data in an iterative fashion and are then used to produce reliable, repeatable decisions.
  • Text mining and sentiment analysis: A combination of statistical modeling and rule-based natural language processing techniques that reveal patterns and detailed reactions, and extract sentiment from a variety of text-based sources.
  • Event stream processing, decision management and the adaptive customer experience: After you’ve learned the behaviors and done the analysis, there is a host of analytic tools available to execute on agency.

The potential benefits from agency are limitless, but agency is also very, very scary.  From that same AdWeek infographic, the number one fear consumers have regarding the IoT is privacy and security.  You can’t get effective agency without these devices, and the ominous, anonymous servers behind them, amassing a huge database on you, and analyzing that data often to the point where the device seems to know more about you than you do yourself.

Trust.  Getting to agency requires trust.  Do you remember the first time you entered your credit card number online?  Or fed a cash deposit into an ATM?  I’ve been a victim of identity theft, and will therefore always jealously guard my Social Security number, often to the point of turning down otherwise beneficial transactions (it was a limited breach, but it did take about eight months and the services of a lawyer to sort out).

Agency is where this is inevitably headed, though.  My recommendation is that when your devices are smart enough for agency, you engage the connected customer in tiers.  Don’t make agency an all-or-nothing feature of your offering.  Let the more cautious, less tech-savvy customer opt in to agency at a relatively nonthreatening level.  Next, provide a middle tier where you make lots of recommendations and decisions on behalf of the consumer but give them plenty of opportunity to examine, concur, approve or override.  These two lower levels of agency will require that you provide the customer with a great deal of transparency – transparency into the process as well as the outcome.  Lastly, as trust is built up, agency can be given freer rein in a more automated tier.

If you are interested in keeping abreast of developments in this arena, I recommend the “Center for the Connected Consumer” website at George Washington University.  And a good place to get an introduction to the importance of agency in the long-term direction of the IoT is this short video by co-director Donna Hoffman – “Marketing on the Internet of Things”.

When you get to the level of agency, your customer is essentially in a relationship with their smart devices, and all good relationships are built on trust.  So whether you are designing smart devices or writing smart apps, keep that connected customer in mind, as well as the trust you are going to have to bake into the system before your smart device has earned the right to say to that customer: "You are NOT leaving the house dressed like that."


Charles (Dickens and Darwin) and continuous improvement

"You show me a successful complex system, and I will show you a system that has evolved through trial and error."  ~ Tim Harford

TED Talk link:



Karl Marx died thinking that the first communist revolution would occur in Great Britain, driven by the long hours and unsafe / unhealthy conditions in the factories, and the rampant urban squalor and poverty so memorably illustrated by Charles Dickens.  Pre-industrial, agrarian, peasant Russia would never have even made his list of potential candidates.

With hindsight and a more robust economic theory to guide us, it seems pretty clear now that pre-War England was economically complex well past the point where anyone could have seriously entertained installing a centrally controlled economy, and probably had been so for more than a century – ever since Robert Walpole, Great Britain’s first Prime Minister, single-handedly invented the modern state financial system.

On the other hand, his megalomaniacal psychopathology aside, it’s easy to see how Stalin, with no history of having lived or worked within a developed, industrial economy, could have imagined it entirely possible to centrally control his newly / barely industrialized, still largely agrarian post-War economy – hence his succession of failed five-year plans.

The same can likely be said for the large, modern corporation – that it too is largely past the point of no return when it comes to centralized control.  I made a point in this previous post (“Metrics for the Subconscious Organization”) that your business “functions day-after-day, minute-by-minute, without your active control or even your conscious knowledge – this is an organization that has long since learned what to do and pretty much runs itself.”

How does a large commercial organization manage to coordinate itself so well?  The market gets by with a single mechanism, price, whereas the larger society within which markets operate has a more complex set of values (e.g. safety, health, education, civil rights) and thus requires a broader toolkit, including laws, regulations and policy.  In this sense, a large commercial business is more like a society than a market, coordinating itself with a wide array of policies, incentives, metrics, strategic objectives, values, mission statements, stories and leadership.

That’s all well and good for day-to-day operations, staying on an even keel, maintaining stasis.  But what about when you want to bring about change to your organization?  In this post (“Changing corporate culture is like losing weight”), I addressed the big hurdle encountered when attempting to make big changes – “The feedback loop, the thermostat that exists in every organization to maintain normalcy and stasis against a changing environment.  You’re trying to enact change against organizational processes that have evolved to specifically minimize change.”

That post discussed making big changes, where, against the tide of homeostasis you push hard and go long and hope that the initial result lands somewhere close to your goal.  But what about small changes and continuous improvement?  How can you hope to get incremental change to stick when all your basic organizational processes are programmed to resist and expel the invading virus of change?

Through the process of evolution.

Evolution through natural selection has two required elements.  The first is variation, a diverse assortment of characteristics and processes that can be acted upon by selective pressure.  That’s your INNOVATION, which I leave to you (and why innovation is so important for all organizations – whether or not you see yourself specifically as a product innovator, you still need internal process innovation regardless, or you will go extinct).  The second element is environmental pressure, something to do the selecting among the variations/innovations, something that rewards fitness.

These environmental factors are nothing more than my list of levers for losing cultural weight, but employed now in the service of continuous improvement:

  • Organizational structure and design
  • Rewards, incentives, recognition and performance management / metrics
  • Tools, resources, systems, data and processes
  • Hiring / selection / training / orientation
  • Leadership / stories / heroes / values / communication

In order to support continuous improvement, the idea would be to internally develop not just a one-time set of levers to be utilized against a single, big strategic objective, but to establish something like an Office of Environmental Pressure, with the objective of identifying the targets, levers and incentives across the organization that you want subject to continuous improvement.

For just this once I won’t excoriate you for navel gazing and a lack of external data and benchmarks, because the data you need to support continuous improvement is already in-house.  You are of course going to want to run some analytics against all that data – explore it, visually, to see where the insights and connections and correlations are.  Likely targets might be:

  • What functions or processes are ripe for improvement? (an activity-based approach wouldn’t hurt here)
  • What factors are correlated with Quality (or time-to-market, or cycle time, or service level, etc …)?
  • Instead of measuring the same thing three different ways, what’s the single best metric?
  • Which levers are best associated with which behaviors?
  • Which incentives / rewards are most effective?
  • As with your homeostatic Subconscious Metrics, which metrics are the ones everyone (by function / role) should be monitoring for Continuous Improvement?

As a nation’s leadership and policy come from the top, so too do corporate strategy and vision.  But just as Stalin’s Central Party Committee could not will tractors into existence against the reality of the market, neither can the office of the CEO micro-manage the continuous improvement of their organization.  If instead they approach the challenge by directing the evolution of the business via selective pressure in the desired direction, progress can be made.

The most important point Tim Harford makes in his TED talk (above) is about the success of a complex system.  Just as most mutations are detrimental to an organism, a heavy-handed, top-down approach to change is more likely to cause damage than improvement.  The law of unintended consequences: fix one problem only to have that fix create three more.  Incent one group to improve their performance and they’ll do it at the cost of overall organizational efficiency and fitness.  But by allowing the organization to steadily evolve under pressure, it can work out for itself the myriad of interconnected kinks and links among processes and functions, and emerge holistically more fit to compete and perform than before.


Creating value on the IoT – It ain’t about you

The Internet of Things is going to be driven by innovative business models as much as by innovative technology.  In order to ground the following discussion, I found it helpful to create this visual depiction of the IoT that defines and distinguishes the key elements that enter into these business models.  My simplified definition includes these six elements:

  1. The network backbone
  2. A server
  3. Smart devices, which I define as configurable, IP addressable devices permitting two-way communication
  4. Sensors, which although IP addressable, are not significantly configurable and allow for only one-way traffic back to the server
  5. The data generated by these elements, which travels over the network
  6. Third party / cloud connections to the network; in other words - everything else


Business models involve both value creation and value extraction, and it is important to at least recognize that there will inevitably exist a category of “rent-seeking” business models that create no reciprocal value.  These are largely the infrastructure components whose real value is primarily defined by ‘capacity’ – network hubs / platforms, network pipes and switches, and the Last Mile – all of which seek to extract value from the IoT by virtue of their position as chokepoints.  While these may initially pass as viable business models, I expect most to eventually succumb to market and regulatory forces.

Having gotten that unpleasantness out of the way, let’s turn our attention to the business models that create value via their “Things”.  The fundamental case that kicks everything off is of course that of providing and owning the server, a device, and the data generated between them; a straightforward, one-to-one relationship.  After that, everything else flows through the Third Party / Cloud component:

  • How do I add value to the device, or to the server?
  • How do I add value to the data (i.e. Analytics)?
  • Can I connect additional devices that add value to the existing device / server?
  • How can this data add value to some third party business process?

That’s pretty much all there is to the IoT.  Piece of cake, right?  There’s a lot more detail to be explored beneath each of these aspects, of course, but this simple framework should at least provide you with a starting point for brainstorming where you might want to play in the future of the IoT.

One obvious consideration is your ability to access the data and devices.  Can you get access to the data, and at what cost?  Can you get access to a configurable device, and if so, can you voluntarily reconfigure it?  The flip side of this is security - if you are a device / server / data owner, can you protect your data and your smart devices from involuntary reconfiguration (i.e. hacking)?

Beyond that, the salient fact that should jump out at you is that there are vastly more ways to add value via the network / cloud / third parties / connections / additional devices than through the direct device-to-server connection.  I flirted with this point in a previous post, “The Value is in the Network”, and I would reinforce that the devices are not the endgame, the IoT is not the endgame, even the customers are not the endgame - the Ecosystem is the endgame.

My emphasis in that previous post was on monitoring the network and enhancing your data management / integration / exchange capabilities across that network.  The IoT raises the bar from simply monitoring your network to actively managing your ecosystem: cultivating partners whose devices, servers and data can add value to your own, and vice-versa.  On the IoT, the sum is greater than the parts.  If in your business model 1+1+1 only equals 3, you are quickly going to find yourself pushed aside by an ecosystem where the sum comes to 4 or 5.

For better or for worse, the smartphone is becoming our remote control for life.  But it’s just a remote.  The value is in the content, and the content is coming from all corners.  If you are an IoT player, it isn’t even 'remotely' about you anymore. But it is about you AND your friends.  Successful IoT business models will come down to playing well with others.  Rather than hunkering down behind your IoT firewall, get out there and make friends, starting with making it easy for potential friends to play with you.


Diagnosis: Your data is not “normal”

“Let’s assume a normal distribution …”  Ugh!  That was your first mistake.  Why do we make this assumption?  It can’t be because we want to be able to mentally compute standard deviations, because we can’t, and don’t do it that way in practice.  No, we assume a normal distribution to simplify our decision-making process – with it we can pretend to ignore the outliers and extremes, we can pretend that nothing significant happens very far from the mean.

Big mistake.

There are well over a hundred different statistical distributions other than “normal” available to characterize your data.  Let’s look at a few of those other major categories that describe much of the physical, biological, economic, social and psychological data that we may encounter as part of our business decision and management process.

The big one when it comes to business impact is what is commonly known as the “fat tail” (or sometimes, “long tail”).  These are Nassim Taleb’s “Black Swans”.  In the real world, unlikely events don’t necessarily tail off quickly to a near-zero probability, but remain significant even in the extreme, and as Taleb points out, become not just likely over the longer term, but practically inevitable.  It is these fat tail events that leave us scratching our heads when our 95%-confident plans go awry.
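You can see the practical difference in a few lines of Python.  Here a Student’s t distribution with two degrees of freedom stands in as a generic fat-tailed process – the specific distribution and threshold are my illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
thin = rng.standard_normal(n)        # the textbook "normal" assumption
fat = rng.standard_t(df=2, size=n)   # a classic fat-tailed distribution

# How often does each process produce an extreme event (|x| > 6)?
print(int((np.abs(thin) > 6).sum()))  # essentially never under normality
print(int((np.abs(fat) > 6).sum()))   # thousands of times with fat tails
```

Same center, same "typical" behavior, wildly different exposure to the extremes – which is exactly where the 95%-confident plan falls apart.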

Next up are the bounded, or skewed, distributions.  Some things are more likely to happen in one direction than in the other.  Unlike with a normal distribution, the mode, median and mean of a skewed distribution are three different values.  ZERO represents a common left-hand bound, where variables cannot take on negative values.  Many production and quality issues have this bounded characteristic, where oversize is less common than undersize because you can always remove material but you can’t put it back on (additive manufacturing excepted).  Too large a part will sometimes simply not fit into the tool / jig, but you can grind a piece down to nothing if you’re not paying attention (I have a story about that best saved for another post).

Discrete or step-wise functions might describe a number of our business processes.  We make a lot of yes/no, binary, all-or-nothing decisions in business, where the outcome becomes either A or B and not a lot in between.  In these cases, it becomes important to have a good handle on the limited range over which an assumption of normality still holds.


Poisson distributions.  These describe counts of events in a fixed time interval, such as the frequency of customers walking in the door, calls coming into the call center, or trucks arriving at the loading dock.  Understanding this behavior is critical to efficient resource allocation; otherwise you may either overstaff, influenced by the infrequent peaks, or understaff without the requisite flexibility to bring additional resources to bear when needed.


Power laws.  Would you think that the population of stars in the galaxy follows a normal distribution, with a roughly average-sized star being the most common?  Not even close.  Small brown and white dwarfs are thousands of times more common than Sun-sized stars, which are in turn tens of thousands of times more common than blue and red giants like Rigel and Betelgeuse.  Thank goodness things like earthquakes and tornadoes follow this same pattern, known as a “power law”.

Much of the natural world is governed by power laws, which look nothing at all like a normal distribution.  Smaller events are orders of magnitude more likely to occur than medium-sized events, which in turn are orders of magnitude more likely than large ones.  Plotted on linear axes, a power law looks like a hockey stick, but it is typically displayed on logarithmic scales, which convert the hockey stick into a straight line (left). Don’t let the linearity fool you, though – that vertical scale is growing by a factor of ten with each tick mark.
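You can see those orders of magnitude directly by simulating from a Pareto distribution (the shape parameter below is illustrative, roughly the classic 80/20 rule) and counting events per decade of size:

```python
import math
import random
from collections import Counter

random.seed(1)
alpha = 1.16  # illustrative Pareto shape parameter
samples = [random.paretovariate(alpha) for _ in range(100_000)]

# Count events by order of magnitude: [1,10), [10,100), [100,1000), ...
decades = Counter(int(math.log10(x)) for x in samples)
for d in sorted(decades):
    print(f"size 10^{d} to 10^{d+1}: {decades[d]:>6} events")
# Each successive decade holds roughly an order of magnitude fewer
# events - the straight line on a log-log plot.
```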

That’s financial data over there to the right – can you tell without the axis labels if that’s monthly, hourly or per-minute price data?  Or, it could just as easily be your network traffic, again measured by the second or by the day.  This type of pattern is known as fractal, with the key property of self-similarity: it looks the same no matter what scale it is observed at.  Fractals conform to power laws, and therefore there are statistical approaches for dealing with them.

One piece of good news is that when it comes to forecasting, you don’t have to worry about normality – many forecasting techniques make no assumption of it. Knowing how to handle outliers, however, is crucial to forecast accuracy.  In some cases they can be thrown out as true aberrations or bad data, but in other cases they really do represent the normal flow of business, and you ignore them at your peril.  In forecasting, outliers often represent discrete events such as holidays or extreme weather, which can be isolated from the underlying pattern to improve the baseline forecast, then deliberately reintroduced when appropriate.
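A toy example of that isolate-then-reintroduce approach, with hypothetical weekly sales and a single promotional spike:

```python
from statistics import mean, median

# Weekly unit sales with one promotional spike (hypothetical numbers)
sales = [102, 98, 101, 97, 350, 103, 99, 100]

baseline_with_outlier = mean(sales)   # the spike drags the average way up
baseline_robust = median(sales)       # reflects the underlying weekly rate

# Isolate the event, forecast the baseline, then add the promo lift back
# only for weeks when a promotion is actually planned
promo_lift = 350 - baseline_robust
forecast_normal_week = baseline_robust
forecast_promo_week = baseline_robust + promo_lift
```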

What we’ve just discussed above is called data characterization, and it is standard operating procedure for your data analysts and scientists.  Analytics is a discipline. One of the first things your data experts will do is run statistics on the data to characterize it – tell us something about its underlying properties and behavior – and analyze the outliers, all part of that discipline and culture of analytics.
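A first pass at characterization can be sketched in a few lines of Python; the skewness measure and outlier rule of thumb below are simple illustrative choices, not the only ones your analysts would use:

```python
from statistics import mean, median, stdev

def characterize(data):
    """First-pass data characterization: center, spread, skew, outliers."""
    m, med, s = mean(data), median(data), stdev(data)
    skew = 3 * (m - med) / s  # Pearson's second skewness coefficient
    # Simple rule of thumb: flag points far from the median
    outliers = [x for x in data if abs(x - med) > 2 * s]
    return {"mean": m, "median": med, "stdev": s,
            "skew": round(skew, 2), "outliers": outliers}

profile = characterize([5, 7, 6, 8, 7, 6, 9, 42])
# A skew well away from zero, or any flagged outliers, is a cue
# not to assume normality.
print(profile)
```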

Economists like to assume the “rational economic man” – it permits them to sound as if they know what they are talking about.  Likewise, assuming a “rational consumer” (customer data is going to comprise a huge chunk of your Big Data) who behaves in a “normal” fashion is pushing things beyond the breaking point.  While plenty of data sets are normal (there are no humans even twice the average height, let alone ten times it), don’t assume normality in your data or your business processes where it’s not warranted.

Soon enough we’ll probably drop the “big” from Big Data and just get on with it, but still, your future is going to have a LOT of data in it, and properly characterizing that data using descriptive analytics in order to effectively extract its latent value and insights will keep your Big Data exercise from turning into Big Trouble.

Post a Comment

Transformations – Personal and organizational

A new year, and with it comes reflection and resolutions.  While few resolutions are actually kept, change comes anyhow.

I was reminded recently of a conversation I once had with a high school classmate who I had hardly seen since graduation.  We were discussing a third person, and my friend’s comment to me was: “I didn’t know him very well.  And for that matter, I can’t say I know you very well now, either.”  She was of course making the point that, with time, we all change.

And thank goodness is all I can say.  Not only am I not the person she once knew when we were both 18 and on our way to college (that Leo had, shall we say, some developmental opportunities ahead of him), by my reckoning I am currently working on Leo version 7.0, counting from my first, stable, young adult personality at age 15, and am still a work in progress.

My first four versions came in fairly quick succession between the ages of 15 and 28, followed later by longer, more stable periods.  If I had to summarize my experience of these transformations, it would be:

  • A series of relatively impactful events and environmental changes occur (A, B, C, D, E, F, …)
  • Followed by a specific trigger event “X”.
  • The trigger event highlights certain previous life events and gives them significance. While Trigger event X might spotlight events A, B and C, a different Trigger Y would perhaps have selected events D, E and F as being the important precursors.
  • The transformation is not a single moment, but encompasses a period of time on either side of the Trigger, and is often not apparent until some time has passed for reflection and assessment.
  • The transformation is a response to environmental stress, and enhances your physical, psychological and financial competencies for survival in light of that stress.
  • The transformation requires facing fears and taking risks.
  • In retrospect, the transformation looks like a typical story/plot outline, starring you as the protagonist.

Over a period of several months I continually revised my assessment of my transformations. It took me a while to settle on not just seven, but these particular seven, relegating some previous Triggers to mere events while recognizing other events as being the true Triggers and accordingly shifting the time periods in question.

The criterion I settled on for defining a transformation was:  Would I now (or my previous personalities) be willing to go back to being that person?  For example, while I would have little consternation going back to the Leo I was four years ago, that is not the case for the self I was twelve years ago – too much has been learned since then to voluntarily give it back, no matter the price I may have paid for it.  (Not all transformations can be considered positive, but I’m going to leave retrograde motion out of this discussion).

While I might like to be able to claim that I reinvented myself six times over, that would not only be stretching the truth, but more like misremembering and misrepresenting the past.  While I did get better over time at re-engineering each new version, none of the seven Triggers or transformations were deliberate on my part, but merely reactions to changes in myself and my environment.  Life was forcing my hand, not the other way around.

This leads to my first proposal:   We all need to more proactively manage our lives and transformations, and to that end, a life or career coach or mentor is probably not a bad idea.  Someone objective, someone with a broader perspective on the world than we might have, someone to occasionally shake us out of our comfort zone, but as part of a proactive plan instead of a reactionary Trigger.  Considering the increasing pace of technological and cultural change, this is more necessary today than ever.

I had a coach early in my career, but I think her contribution was more in the direction of stability than transformation, which as a new parent was probably exactly what I and the new family needed at the time.  However, I do wish now that I had continued to work with her – there was no need for that fifth transformation to have waited 14 years to commence.

My second proposal is that, following this model, organizations are probably in a better position to proactively trigger transformations than are individuals.  Organizations are much better suited to develop and compartmentalize the capability to objectively analyze themselves, and then provide the impetus to change.  If not internally, this capability can also be readily acquired externally via change management consultants.

An entirely reasonable organizational approach to change would be to replicate the individual process by deliberately creating the preparatory, foundational precursor events A, B and C (the ‘rising action’), then instigating a Trigger (the ‘crisis’), followed by events D, E and F (the ‘denouement’) which completes the story of the transformation and becomes the new context in which the organization understands itself and its mission.

Two factors are primarily responsible for the lack of both organizational and personal transformation.  The first is the lack of a vision, the lack of the transformative storyline / myth / context that I proposed above.  In an organization this is the job of the CEO; as for an individual – this is why the use of a career/life coach or mentor can be so beneficial.

The second factor is fear and risk.  For an individual the risk is typically emotional or financial.  For an organization not in financial straits, the analog to the individual’s psychological risk would be the lack of a well-defined strategy.  You know you need to be on the opposite river bank, and that the only bridge is weak and deteriorating and won’t be there much longer, but you hesitate because the other side is unknown territory.

One approach some organizations take is to spin off their fearless, agile component and let it lead the way without the baggage of the larger organization.  Another approach is to hire a CEO or other talent with experience on the other side.  Or, you could scout the new territory, often with the help of outside consultants who have experience in that terrain, or utilize insights gleaned from your current business intelligence database.

Lastly there is the approach I discussed some time ago (“Having a strategy versus being strategic”) of simply making the commitment, crossing that river first and allowing your strategy to develop over time once you’re there and can make refinements based on real data rather than speculation.  As I admitted in that previous post, I am not necessarily comfortable with the idea of strategy as simply the sum of my tactics, but sometimes that approach may be just what’s called for.  If your future is on the other side, whether that be the love of your life and future spouse, or because technology is making your industry / market / business model rapidly obsolete, sometimes you just need to face your fears and make the leap.  On a personal level this is similar to the behaviorist approach of inverting the "Beliefs ---> Attitudes ---> Behaviors" model, and simply changing your behavior and letting your beliefs and attitudes catch up later.

Regardless of how you get there, personally or organizationally, eventually you ARE going to end up on the other side of that river, with many more rivers to cross in your future after that.  The question is:  Will you cross unwillingly and unexpectedly because the bridge is burning or the ground you’re standing on has given way, or will your transformation be a more deliberate affair, part of a purposeful journey or quest rather than a flight of necessity?

Post a Comment

Getting started with Supply Chain Segmentation

“All unsuccessful segmented supply chains are alike; each successful supply chain is successful in its own way.” ― Leo Tolstoy Sadovy

Segmentation is the new big thing in supply chain management, or at least it’s an old big thing made new again.  It was the keynote topic at last month’s IE Group Supply Chain Summit in Chicago, and is typically addressed by at least a couple of speakers at every supply chain conference I’ve seen lately.

The complexity of customer expectations and service levels, your product portfolio, the global supply chain, varied distribution channels, coupled with the internet and social media, makes moving from an undifferentiated to a segmented supply chain almost an imperative, even though doing so adds a layer of complexity that many manufacturing companies are not ready for.  To read the recent literature on the topic, when you start trying to combine segmentation based on your products with segmentation based on your customers, it goes from merely complicated to overly complex in a heartbeat.

Here’s a short list of just a few of the various segmentation strategies and permutations to consider:

  • Product-driven segmentation:
    • Large volume, long production runs, standardized operations
    • Limited editions, fluctuating demand
    • Made-to-order, low volume, short runs, high margin (high cost-to-serve?)
  • A volume / variability 2x2 matrix
    • High volume commodities
    • High volume seasonal or promotional items
    • Low volume, predictable
    • Low volume specialty or custom orders
  • A typical three-segment retail-oriented model:
    • Regular replenishment
    • Seasonal, but predictable demand (swimwear, lawn fertilizer)
    • Volatile, one-off demand (fashion, new products, promotions)
  • Customer-based segmentation – many ways to do this:
    • Standard, higher quality, or premium service / customization
    • By channel
    • By lead-time service level (build-to-stock, configure-to-order, build-to-order)
    • By customer size, volume or value
    • Other customer characteristics, such as vendor managed inventory, level of data and forecast/POS integration / collaboration, SLA penalties or geography
  • Risk-oriented segmentation, based on political, environmental or economic risk/disruption factors, and on product lifecycle stage considerations

I am a practical sort, concerned primarily with execution.  I want to make Pareto’s Law work for me and go after the low-hanging 80% that only requires 20% of the effort, and I want that first demonstrable success.  Lastly, I would be well advised to dust off the old adage – keep it simple, stupid – and that list of possible segmentation models above looks anything but simple.

The conference keynote case study mentioned above concerned a multinational alcoholic beverage company that was trying to balance the production needs of large volume, stable, established brands with the flexibility needed in a surprisingly innovative market that sees several hundred new products introduced every year.  Their big breakthrough was to move from a one-plant/one-brand, one-line/one-product practice (largely inherited via multiple acquisitions over the years) to an agile approach where each line in each plant could handle any combination of product, bottle, label or packaging.  For example, before the changeover, there were some labels that had to be spun on clockwise, and other labels counterclockwise, which just by itself cut the number of available production lines in half.

With that in mind, and based on the success stories and key takeaways I’ve seen presented or in print, I think I’d approach my first supply chain segmentation project in the following manner:

  1. Get a good understanding of my cost-to-serve.
  2. Employ analytic forecasting.
  3. Take a product-oriented approach to the supply chain segmentation.
  4. Deal with my customer segmentation opportunities via inventory and service policy.

Breaking these down a bit further:

  1. Cost-to-serve. Before I do anything, I want accurate product, process, customer and channel costs on which to base my decisions, informed by a cost and profitability management solution that gives me cost output I can trust.
  2. Analytic forecasting. Because it all starts with the forecast. It can only get worse from there. Start higher in order to finish higher.
  3. Product-oriented approach. Yes, it’s inside-out thinking, but it seems to be where all the successful segmentation projects started from. It’s easier to understand and control than either working back from the customer or trying to bite off the entire holistic supply chain in one go.
  4. I’m still going to have to deal with customer and channel differences. What if a high-value customer wants a low-value product? We all know how that story ends – Lola gets what Lola wants. I need to accommodate my premium customers through some post-production combination of inventory policy, customer service/care, and order allocation/commitment process.

I can, however, imagine several scenarios where I might have to start from the customer and work backwards, such as having the federal government as a customer (where mil-spec products might necessitate a holistic supply chain approach all the way back to the farthest tier-n supplier), or when you have significantly different classes of customers who buy through distinctly separate channels. But for all practical purposes, you aren’t going to get one specific segmentation scheme that meets all of your operational priorities and all of your high-priority customer needs (and mitigates all your major supply chain risks).

One final bit of advice from the experts can be summed up as:  One physical supply chain with multiple virtual segmented supply chains running against it.  These virtual supply chains are distinguished by policy, not by brick-and-mortar – inventory, sourcing, production, fulfillment, logistics and service policies.  Because it’s easier to change policy than to change concrete and steel.

As nearly every supply chain expert stresses, one size does not fit all.  You need to select a segmentation strategy that’s right for your business.  But please do select just one appropriate strategy, not some unworkable hybrid. Unsuccessful supply chains are alike in that they tend to be more complex than they have to be.

Post a Comment

Big Silos: The dark side of Big Data

The bigness of your data is likely not its most important characteristic. In fact, it probably doesn’t even rank among the Top 3 most important data issues you have to deal with.  Data quality, the integration of data silos, and handling and extracting value from unstructured data are still the most fertile fields for making your data work for you.  [And if I were to list a fourth data management priority it would be, as I described in this previous post (“External data: Radar for your business”), the integration of external data sources into your business decision support process]

Data Quality:  The bigger the data, the bigger the garbage-in problem, which scales linearly with data volume.  Before you can extract value from the bigness of the data, you need to address the quality of the data itself.  If you haven’t been employing robust, scalable data quality tools, now would be the time.

Have we gotten any better at data quality? My personal, one-sample survey would indicate that we have not.  My last name, Sadovy, is relatively unusual; although it’s only six letters, I’ve seen it misspelled over two dozen different ways in my life, and I thought I’d seen them all by my mid-40s.  But once my three children became college-aged and started receiving daily credit card offers in the mail, several new ways to misspell my name came to light, a credit to the creativity of today’s automated processing systems.  Even being a Smith/Smythe or Jones/Joens doesn’t leave you immune to a misplaced bit or byte.

Without a focus on data quality, big data just gives you that many more customer names to get wrong.
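Fuzzy matching is one of the standard data quality techniques for exactly this name problem; here’s a minimal sketch using Python’s standard library (the names, misspellings and similarity cutoff are illustrative):

```python
import difflib

# Canonical customer names already in the master record
master = ["Sadovy", "Smith", "Smythe", "Jones"]

# Incoming records with the kinds of misspellings automated systems produce
incoming = ["Sadovey", "Sadowy", "Smtih", "Joens"]

def best_match(name, candidates, cutoff=0.75):
    """Return the closest canonical name, or None if nothing is close enough."""
    matches = difflib.get_close_matches(name, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None

resolved = {name: best_match(name, master) for name in incoming}
print(resolved)
```

Production-grade data quality tools add phonetic matching, address standardization and survivorship rules on top of this, but the core idea is the same: reconcile the variants to one trusted record.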

Data Integration:  If you’ve got a data silo problem, and who doesn’t, then all big data contributes to the process is to make those silos bigger.  Which makes the eventual data integration exercise that much more of a challenge.

Enterprise big data comes at you from a dizzying array of directions – from mainframes and ERP systems, from transactional and BI databases, from sensors and social media, from customers and suppliers. To make matters worse, each of these various sources and applications has its own, sometimes proprietary, data model.

And we’re still not finished with the complexities of this issue yet, because enterprise data has one more endearing quality that makes integration difficult – it’s decentralized and distributed. Extracting value from its bigness by creating one humungous centralized, homogeneous data warehouse is simply out of the question.  If Sartre had been a philosopher of data science he might have said, “Integration precedes value extraction”.

Unstructured Data:  Depending on what study you prefer, it’s claimed that 70 to 90 percent of all data generated is unstructured.  This unstructured bigness doesn’t readily fit into predefined columns, rows, data entry or relational database fields.  Customer feedback, emails, contracts, Web documents, blogs, Twitter feeds, warranty claims, surveys, research studies, client notes, competitive intelligence, often in different languages and dialects … the list goes on. Who has the time to read all this, let alone find an efficient way to extract the latent value from it?

Unstructured data may be both big and bad, but again, with the right tools, it’s not unmanageable. Text mining, sentiment analysis, contextual analysis – there are automated machine learning and natural language processing techniques available today to deal with the volume and ferret out the insights.
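As a flavor of how automated that first pass can be, here’s a toy lexicon-based sentiment scorer over hypothetical customer feedback – real text mining tools are far more sophisticated, but the principle is the same:

```python
import re
from collections import Counter

# Unstructured customer feedback (hypothetical)
feedback = [
    "Shipping was slow and the packaging arrived damaged.",
    "Great product, fast delivery, very happy!",
    "Damaged box again - slow to resolve the warranty claim.",
]

# Tiny illustrative sentiment lexicons
POSITIVE = {"great", "fast", "happy", "excellent"}
NEGATIVE = {"slow", "damaged", "broken", "late"}

def score(text):
    """Crude lexicon-based sentiment: +1 per positive word, -1 per negative."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)

scores = [score(t) for t in feedback]
# Term frequencies surface recurring themes ("damaged", "slow") automatically
terms = Counter(w for t in feedback for w in re.findall(r"[a-z']+", t.lower()))
```

Even this crude pass flags two unhappy customers and a recurring packaging theme without a human reading a word.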

‘Big Data’ is of course a relative term, but when I think ‘big data’ one of the following three data categories seems to be in play:

  • High transaction volumes: Millions of customers, billions of transactions (e.g. ATMs or POS), or tens of thousands of SKUs crossed with other attributes such as retail locations, cost and/or service levels.
  • Temporally dense: Sensor data, audio.
  • Spatially dense: Video, satellite imagery.

The business issue becomes – what do you want to do with all this data? And the place to start is not with the data, or with its bigness, but with the business problems you want to solve, the business insights you want to gain, and the business decisions you want to support.  Starting from there and working backwards to the data means running squarely into the issues of data quality, data integration and unstructured text analytics.  It’s only after you get a handle on this trio of capabilities that you can begin to effectively tap the big data spigots.

Extracting tangible value and insights from high-quality, integrated data, no matter its volume, velocity or variety, is where the payoff lies. Getting to this payoff in an environment where your data is growing exponentially in all dimensions requires an investment in robust data management tools. The consumers of this data, the business users, don’t know or care about its bigness – they just want the right data applicable to their particular business problem, and they want to be able to trust that data. Trust, access and insights – it’s got “quality” and “integration” and “analytics” written all over it.

Post a Comment