Statistics: The language of science

A recent KDnuggets poll caught my attention. It asked respondents to complete a sentence as follows:

“With the trend towards Big Data and Data-driven Machine Learning methods

  • Statistics will become less important
  • Statistics importance will not change
  • Statistics will become more important, as the foundation of Data Science
  • Not sure”

“Statistics will become more important” was the clear winner. While this poll is not a scientific survey, it’s still interesting to see which questions people will take the time to weigh in on. The word “statistics” means different things in different contexts, and it's unfortunate that it has been viewed so negatively for so long by so many. This is perhaps why we’ve seen other terms come in (and out) of fashion: data mining, data science, predictive analytics, etc., all rebranding what are essentially statistical concepts. It’s ironic that many people put statistics in a box as though it’s forever one thing (it only deals with small data, it is only about hypothesis testing, etc.) when statistics is fundamentally about dealing with change.

In other posts, I’ve written about statistics as its own discipline, how it is core to data analysis and value creation, and how statistical literacy is growing in importance (celebrating the first-ever International Year of Statistics). Jeff Leek of Simply Statistics has a nice YouTube video on The Landscape of Data Analysis showing that statistics is foundational to data analysis, and while other disciplines also contribute, they don’t contribute as directly.

The first time I heard statistics described as “the language of science” was many years ago in a conversation with David Salsburg, author of The Lady Tasting Tea and the first statistician Pfizer ever hired. To be more scientific in any decisions you make, whether in science, in industry or in government, you will need statistics! And statistics needs you! Robert Tibshirani, eminent statistician at Stanford University, was quoted in The New York Times Bits blog last year: “Statistics is unusual. … It’s a service field to other disciplines. It doesn’t rely on its own work. It needs others.” Professor Shirley Coleman expressed a similar view in a recent interview I did with her.

It wasn’t too long ago that universities required you to meet foreign language requirements, especially for graduate degrees in the sciences. It now appears that a new language, statistics, is working its way into the curricula for degrees in science as well as business. Recently, a proposal was made to establish a statistics curriculum within the chemistry departments of US colleges and universities. A similar shift is apparently under way in parts of Europe, as reported in a recent issue of The Analytical Scientist. Business schools are evolving their curricula to include more statistics/data mining/predictive analytics (and are offering new degrees in these areas), and even the hard sciences are incorporating more statistics to better prepare their graduates for jobs in industry.

Many of our customers confirm that they spend a few years investing in their new hires to instill in them best statistical practices because their academic training has not adequately prepared them to do the work that is needed. One of our longtime partners, Predictum, has been offering courses like Data Analysis and Statistics for Scientists and Engineers since 1997.

Statistics as a word may have some baggage (many unfortunately did not have the best introduction to this powerful subject and think of it as “sadistics”), but “statistical thinking” is another term that casts everything in a more strategic light. Statistical thinking is being scientific about problem-solving, speaking the language of science in any given context. And what is science? Good science is the efficient and effective way of understanding the natural and social world so that we are better informed and make better use of that information. In a recent webcast with Russ Wolfinger, we got to see and hear about some really interesting applications of statistical thinking in science.

Good science, and the ability to speak its language with some level of proficiency, is required to derive value from the growing volume and complexity of data we continue to amass. May we all learn, at some level, the language of science so we can make more informed decisions, best utilize scarce resources and compel better actions.

Note: A version of this post first appeared in the International Institute for Analytics blog.

tags: Analytics, Exploratory Data Analysis, JMP - General, Modeling, Statistics

4 Comments

  1. Tammy Jackson
    Posted June 10, 2013 at 10:31 am

    Mathematics has been known for a long time to be the language of science. This quote is by Galileo:

    “Philosophy is written in this grand book, the universe which stands continually open to our gaze. But the book cannot be understood unless one first learns to comprehend the language and read the letters in which it is composed. It is written in the language of mathematics, and its characters are triangles, circles and other geometric figures without which it is humanly impossible to understand a single word of it; without these, one wanders about in a dark labyrinth.”
    Galileo Galilei in The Assayer

    Statistics is the application of mathematics to the scientific method.

    • Joal
      Posted July 17, 2013 at 9:14 pm

      I would agree that mathematics is the language of science.

      The expression I've heard is that statistics *is* the scientific method. This was a revelation to me as a university student. I went into the subject half wondering why statistics was even a computer science subject, and came out wondering how science could possibly do without it.

  2. Posted June 10, 2013 at 10:45 am

    Great post, and very timely. I just finished reading a truly excellent anthology of short articles about data, with the unappetizing title "'Raw Data' Is an Oxymoron." Despite the title, I recommend it highly. It supports your viewpoint that data is/are the building blocks of science.

    • Anne Milley
      Posted June 13, 2013 at 9:43 am

      Appreciate the comments. Raw Data does look like an interesting read--thanks for recommending it. Math and statistics have an interesting relationship. Many used to think statistics was a branch of mathematics. Statistics certainly uses math, but it is its own discipline--and a relatively young discipline. David Salsburg also considered statistics "the science of science." The subtitle of his book, The Lady Tasting Tea, is "How Statistics Revolutionized Science in the Twentieth Century." Wonder what Galileo would say today. We are fortunate to live in such interesting times.
