Analytics, OR, data science and machine learning: what's in a name?

data science and machine learning feud but shouldn't

Casa di Giulietta balcony, where Juliet supposedly stood while Romeo declared his love

Analytics, statistics, operations research, data science and machine learning - with which term do you prefer to associate? Are you from the House of Capulet or Montague, or do you even care? Shakespeare's Juliet derides excess identification with names in the famous play, Romeo and Juliet.

"What's in a name? That which we call a rose
By any other name would smell as sweet."

Romeo was from the house of Montague, Juliet from the house of Capulet, and this distinction meant that their families were sworn enemies. The play is a tragedy because by the end the two lovers are dead as a result of this long-running feud. Statistics, data science and machine learning are but a few of the "houses" that feud today over names, and while to my knowledge no deaths have resulted from this debate, the competing camps have nearly come to blows.

"Operations Research? Management Science? Analytics? What’s in a brand name? How has the emerging field of Analytics impacted the Operations Research Profession? Is Analytics part of OR or the other way around? Is it good, bad, relevant, a nuisance or an opportunity for the OR profession? Is OR just Prescriptive or is it something more? In this panel discussion, we will explore these topics in a session with some of the leading thinkers in both OR and Analytics. Be sure to attend to have your questions answered on these highly complementary and valuable fields." This was the abstract for a panel at the INFORMS Annual Meeting last year that included two past presidents of INFORMS and other long-time members. To many long-time members of INFORMS this abstract was provocative indeed, because not all of them embrace the term "analytics" to describe what they do.

Interestingly, the American Statistical Association (ASA) is host to a very similar debate. Last fall the ASA released a statement on the Role of Statistics in Data Science. There’s a whole wing of data science practitioners who are downright hostile to statisticians. This camp makes assertions like "sampling is obsolete," due to computational advances for processing big data. Popular blogger Vincent Granville has even said "Data Science without statistics is possible, even desirable," extending the obsolescence to statisticians themselves. INFORMS member and self-described statistical data scientist Randy Bartlett, who is both a Certified Analytics Professional (the CAP certification offered by INFORMS) and an Accredited Professional Statistician (the PSTAT certification from ASA), has written about this statistics denial in an excellent series of blog posts he publishes from his LinkedIn page. In the face of such direct attacks on their profession it is no wonder the ASA felt a need to take a stance.

Many statisticians assert "Aren't We Data Science?," as Marie Davidian (professor of statistics at NC State University) did in 2013 in an article published during her tenure as president of the ASA. More recently David Donoho (professor of statistics at Stanford University) made a similar but more complex argument in a long-form piece, "50 Years of Data Science," which he released last fall (after a presentation on it at the Tukey Centennial workshop). Donoho is equally dismayed at much of the current data science movement. As he puts it, "The statistics profession is caught at a confusing moment: the activities which preoccupied it over centuries are now in the limelight, but those activities are claimed to be bright shiny new, and carried out by (although not actually invented by) upstarts and strangers." Donoho points out the harm in the huge oversight of the contributions of statistics while also exhorting academic statisticians to expand beyond a narrow focus on theoretical statistics and "fancy-seeming methods." He proposes a definition of data science, based on people who are "learning from data," drawing upon a remarkably prescient article John Tukey published more than 50 years ago, "The future of data analysis," in The Annals of Mathematical Statistics. In making his case, Donoho also points to subsequent essays by John Chambers (of Bell Labs and co-developer of the S language), William Cleveland (also of Bell Labs and arguably the one who coined the term data science in 2001), and Leo Breiman (of the University of California at Berkeley). These gentlemen together argue for addressing a wider portion of the lifecycle of analysis, such as the preparation of data as well as its presentation, and for the importance of prediction (and not just inference). I heartily recommend reading Donoho's excellent analysis.

While I haven’t seen a similar assault on operations research, there are those within the OR/MS community who see terms like analytics as a threat to the survival of OR. Several years ago, when INFORMS began exploring whether to embrace the term analytics, researchers surveyed the INFORMS membership and published their results in an article in Interfaces entitled "INFORMS and the Analytics Movement: The View of the Membership." Member views of the relationship between OR and analytics were roughly divided into three camps: OR is a subset of analytics, analytics is a subset of OR, and analytics is the intersection of OR and analytics. While the numbers may have shifted, I doubt they have yet converged into a clear definition or consensus among members. There will always be naysayers - 6% of the membership surveyed at the time thought there was no relationship between OR and analytics, and for that matter there are statisticians who see no association with data science or analytics at all. Today INFORMS embraces the term analytics, describing itself as "the largest society in the world for professionals in the field of operations research (O.R.), management science, and analytics." By adding the word analytics, instead of replacing operations research, INFORMS has shown that this is not an either/or question - it values its roots while acknowledging a present that includes those who describe their work as "analytics."

Trying to wrangle a clear definition of analytics, statistics, data science, machine learning, and even operations research can be as messy as cleaning up a typical data set! These are distinct disciplines, related but not the same. Increasingly analytics is used synonymously with data science, which is derived in large part from statistics, which also is a foundation for machine learning, which relies upon optimization techniques, which I’d argue are part of analytics. These days I observe increased cross-fertilization, like my operations research peers clamoring to attend machine learning conferences and machine learning talks that are standing-room-only at the Allied Social Sciences Annual Meeting (where the economists gather).

We can parse terminology all day, but instead we should invest our energy in the opportunity at hand and drive towards increased adoption. Some of these buzzy terms get people’s attention and provide an incredible opportunity to use mathematically-based methods to make a significant impact. Last year my friend Jack Levis accepted my invitation to give a keynote at an analytics conference SAS hosted. He spoke about ORION, the OR project he leads at UPS that Tom Davenport has said is "arguably the world's largest operations research project in the world." While few of the conference attendees likely understood in great detail the operations research methods employed, all were amazed at the impact his team has had on saving miles, time, and money for UPS, happily tweeting their excitement during his talk. No doubt this impact is why Jack's team won the 2016 INFORMS Edelman Competition, which some call "the Super Bowl of OR."

The important advances in research presented at conferences like the Joint Statistical Meetings and the INFORMS Annual Meeting pave the way for progress that enables success in practice at places like UPS. We need academics and other researchers to continue to invest in advancing the unique approaches of their disciplines. Most INFORMS members would call what Jack's team does operations research. Most attendees at the conference where Jack spoke probably thought of it as analytics. Does it really matter what we call it, if people value what was done, want to share the story, and pave the path for adoption through their enthusiasm? If it leads to the expansion of OR, I don’t care if this application of operations research is referred to popularly as analytics, because after all, in the words of the bard, "a rose by any other name would smell as sweet." Each of the historic disciplines has a chance to have not only a bigger slice of the pie but a bigger slice of a bigger pie if analytics is embraced at large. I understand the value and pride in disciplines like operations research and statistics, and their contributions to data science and machine learning, which are the knowledge base analytics draws upon. We don't have to throw out older terms to embrace new ones. The houses of Capulet and Montague can celebrate their unique heritage but put an end to their feuds. This is not an either/or proposition, but I do believe it is our opportunity to squander. Instead of feuding, let us learn more, put that learning into practice, and make the world a better, smarter place.

 

Image credit: photo by Adam W // attribution by creative commons

About Author

Polly Mitchell-Guthrie

R&D Project and Program Management

Polly Mitchell-Guthrie leads the Advanced Analytics Customer Liaison Group in R&D, connecting with customers to improve SAS products. At SAS for 14 years, Polly has held a variety of roles in finance and alliances, and the Global Academic Program. She has a BA and MBA from the University of North Carolina at Chapel Hill.

6 Comments

  1. Daymond Ling on

    Polly, I completely agree with you that we should invest our energy in the opportunity at hand and strive for adoption by delivering value. This is all that matters at the end of the day.

    While historians examine and argue over the size, shape and pattern of our foot-steps in the un-named (analytics? no, can't call it that, statistics? nope... argh) sands of time, we should be boldly charting new paths to amazing places and enjoy the fantastic scenery and discoveries along the way. What does it matter what our foot-steps are called if we get to where we need to be? How would it matter if our foot-steps are correctly pigeon-holed into some arbitrary taxonomy but we are standing still and stagnating?

    When you go to a hardware store and buy a drill, you don't really want the drill, you want the hole that it can make. Similarly, organizations don't want statisticians, mathematicians, O.R. practitioners, or data scientists, they just want their problems solved. Our raison d'etre is to show them how it can be solved. We need to do whatever it takes for that problem, be it simple statistics or big data or optimization or time series or machine learning or any combination thereof. Just do it! Get it done! Enlarge the pie through success. While people argue what to call our markings, I'm going to celebrate our success with Champagne in my hands and a Big Smile on my face!

    What's in a name, truly, not much. Now let the flame war begin!

    • Polly Mitchell-Guthrie on

      Daymond, thank you for your insightful comments, as always. I love the analogy about what I want out of a drill. Great illustration! The business just cares that we get there, not the path we take. It's up to us to sort through the various paths, which is its own challenge. Perhaps that is worthy of another post!

  2. Russell Greenberg on

    I always considered "MS/OR" a hybrid that shares the underlying philosophy of the scientific method and as such is malleable and can be extended to any methodologies and problem sets.
    My training, in the mid-70s, consisted of non-statistical methods such as linear programming and statistical methods such as queueing theory and Markov chains. The curriculum would be comfortable with glomming these and other methods in the field. I'm more comfortable now with the designation of "data scientist," which emphasizes this underlying philosophy.

    • Polly Mitchell-Guthrie on

      Thanks for the reminder of the term "management science," Russell. There were too many terms to cover them all, but I know plenty of INFORMS members who lament the gradual disappearance of that term, because they feel it does the best job of all of illustrating to a non-technical audience that there are ways to apply "science" to business. I remember one member commenting that no one understands when he says he does OR but described this way they get it.

      • Daymond Ling on

        So true! When I say management science or operations research, people don't understand the term; same with statistics to a lesser extent. I have resorted to describing what it does and people instantly appreciate its usefulness.

        Nowadays I walk around saying Data Scientist, Big Data, Machine Learning and Deep Learning. I'm sure there is no greater comprehension of what these really mean from the general public, however, the blank look and vacant headnod now is mixed with an aura of respect for scientific magic and mystique.

        We should re-visit this topic in another ten years to see what the buzz and jargon will be. In the meantime, let me get back to solving interesting problems. Wake me up when I need to re-label myself.

  3. Sheryl Feinstein on

    The term "analytics" is widely used now, and there are now many self-proclaimed analytics professionals. Those who are mathematically challenged might be in awe of these experts, because they don't have the background to differentiate these folks from those who have a solid math/OR education.
    This is bad news for the mathematicians. In the past, they were members of an elite academic group. Now we have analytics experts, many of whom struggled through algebra in high school. These folks may not have ever heard of operations research, and often feel challenged when an OR person or mathematician offers to help solve a problem.
