Data scientist or statistician: What's in a name?


My name is Tonya, and I’m a Statistician.  It’s true.  I’ve been one for almost 20 years now.  I was crunching numbers and analyzing data long before statistician landed on the sexiest profession list.  And now, it seems like that claim to fame has faded in the face of an even hotter job title – the data scientist.

When SAS unveiled our new High Performance Analytics suite of products to a group of analysts earlier this year, it sparked an interesting Twitter debate on this topic:

27-Feb-12 22:25 | ColinJWhite:  #sassb SAS doesn’t seem big on the data scientist term.  Yes, there is a lot of hype behind the term but surprised SAS not joining in.

27-Feb-12 22:16 | doug_laney: @dmenningervr You’re old school. :) Yes, don’t confuse the data scientist responsibilities with the job title. #sassb

27-Feb-12 22:14 | jamet123:  @doug_laney @dmenningervr @merv don’t turn everyone into data scientists but make everyone able to participate in data science! #sassb

27-Feb-12 23:58 | Claudia_Imhoff:  @ColinJWhite Interesting – what term does SAS prefer if not data scientist? #sassb

27-Feb-12 22:20 | merv:  RT @jameskobielus: #sassb Haven’t yet heard #SAS mention “data scientist” at this event, re #BigData analytics.  They would say “=SAS user”

Obviously, there are a lot of opinions about the terminology.  Here’s my perspective:  The rise and fall of these titles points to the rapid pace of change in our industry.  However, there is one thing that remains constant.  As companies collect more and more data, it is imperative that they have individuals who are skilled at extracting information from that data in a way that is relevant to the business, scientifically accurate, and that can be used to drive better decisions, faster.  Whether they call themselves statisticians, data scientists, or SAS users, their end goal is the same.

As the adoption rate for new technology accelerates, one of the key trends that I’m seeing across numerous organizations is the need for these personnel to be able to adapt quickly to changes in their IT environment while continuing to meet the rising demand for their services.

For example, Hadoop clusters are increasingly popping up alongside relational databases in order to scale storage capacities and accommodate ever-growing volumes of data.  Ideally, analysts need to be able to access and analyze these data without having to learn a whole new set of tools.  The SAS/ACCESS Interface to Hadoop provides seamless connectivity between SAS and Hadoop.  This technology enables the SAS user to continue to use the tools that they know and love while analyzing data that is now stored in Hadoop.

Similarly, with just a few tweaks to their code, analysts can take advantage of SAS High Performance Analytics to analyze big data in a massively parallel, in-memory computing environment.

The bottom line is that new technology needs to be delivered in a way that has minimal impact on how analysts do their jobs while delivering maximum value to the business.

John P. Kotter, in Leading Change, says, “The rate of change is not going to slow down anytime soon.  If anything, competition in most industries will probably speed up even more in the next few decades.”

Organizations that are able to embrace change and adapt to it quickly will have a decided advantage over those that lag behind.  And the people with the skills to analyze and understand big data – whether they be called statisticians, data scientists, or SAS users – will be leading the way.


About Author

Tonya Balan

Senior Manager, Analytics Product Management

As Senior Manager of the Analytics Product Management team, Tonya Balan is responsible for providing strategic direction for all aspects of SAS’ suite of analytical products. Her team works closely with SAS customers and development partners to define requirements and influence enhancements to SAS’ data mining, forecasting, text mining, operations research and statistical analysis products. Prior to joining the product management division, Balan worked as a technical consultant for data mining and taught numerous statistics and data mining courses for SAS’ Education Division. In addition to her experience at SAS, she has extensive consulting experience and has served on the faculty of the North Carolina State University Department of Statistics. Balan holds a Ph.D. in Statistics from North Carolina State University in Raleigh, NC.


  1. To quote Shakespeare, "a rose by any other name would smell as sweet." So it is with "data scientist" and its aliases: statistician, data analyst, applied statistician, or---my favorite---statistical programmer.

    Just as an "applied mathematician" is a mathematician who works on real-world problems, a "data scientist" is a statistician who works on real-world problems, which in today's environment often means large data using various software tools. In short, a data scientist uses software to analyze data, perhaps doing some modeling along the way.

    If you are a statistical programmer who uses SAS, you might want to subscribe to my blog for tips, techniques, and discussions about how to use SAS software to analyze data efficiently.

  2. Tonya,
    I believe that you are arguing that the term "data scientist" is synonymous with statistician, quantitative analyst, etc. But I am doing some research on this topic (supported by SAS), and I have concluded that the roles are truly different. Data scientists spend most of their time "munging" data, not analyzing it. They filter, clean, structure, and categorize data. Most of the analysis is actually reporting--visual and otherwise. Some statisticians are data scientists--the ones who really like to get their hands dirty with data--but not all are. And there are more data scientists with backgrounds in physics, biology, ecology, psychology, cognitive science, etc.
    Tom Davenport

    • Hi Tom,
      My point was not necessarily to say that "data scientist" is synonymous with "statistician" but instead to say that there is an important and evolving role centered around analyzing big data that needs to be filled in most organizations. The job function is more important than the job title. That said, I agree with you that while there are some statisticians who are now playing this role, there are many people stepping into the role from very diverse backgrounds as well. I believe that these people identify much more readily with the title "data scientist." I look forward to seeing your research on this topic. Thanks!

  3. Personally, I think it's just the latest attempt to clarify what's actually an extremely difficult role. I don't think it's synonymous with any existing title (including statistician, mathematician, analyst, data miner, and so on) but on the same note, I don't think it's where we'll finally end up.

    From my perspective, the market is trying to create a shorthand signal that describes someone with:

    * An applied (rather than theoretical) focus
    * A broader (rather than narrow) set of technical skills
    * A focus on outcomes over analysis
    * A belief in creating processes rather than doing independent activities
    * An ability to communicate along with an awareness of organisational psychology, persuasion, and influence
    * An emphasis on recommendations over insight

    Existing roles and titles don't necessarily identify those characteristics.

    While "Data Scientist" is the latest attempt to create an all-encompassing title, I don't think it'll last. On one hand, it's very generic. On the other, it still implies a technical focus - much of the value of these people stems from their ability to communicate, interface with the business, be creative, and drive outcomes. "Data Scientist", to me at least, carries very research-heavy connotations, something that dilutes the applied and (often) political nature of the field.

    We'll see though - as far as a defined profession goes, I think we're still very much in early days. I really liked Wayne Thompson's suggestion: Ninja Miner. Not sure how that'd go in the big corporates, though ...
