Everyone has a unique career path. In data science, it often starts with curiosity about a subject that can only be untangled by applying analytics. In Robert Blanchard’s case, he was studying economics and wanted to learn more about consumer behaviors as they relate to buying patterns. His focus was on using online behavior to develop price discrimination models. His research required more than just a background in economics; it required learning analytics to complement the economics. As Robert studied analytics, he started learning SAS, which eventually led to a career in data science.
Robert is someone that I’ve had the opportunity to work with over several years. His role as a data scientist at SAS has afforded him the flexibility to live where he wants, in his case, on a beach in San Diego. It’s a move that has soured our personal relationship with heaps of jealousy. However, I didn’t want that to stand in the way of sharing his insights from working with data scientists across industries.
What’s your job title?
Officially, Principal Data Scientist. Unofficially, Aspiring World Traveler. It’s “aspiring” because I’m certainly not a world traveler yet, but I hope to be someday.
What is the most interesting problem that you have been working on in your current role?
It is hard to narrow it down to only one. I just completed a customer project where we used machine learning to improve an engineering process. I really liked that project because it drew on a few areas of machine learning that really interest me, including explainable ML/AI and hyperparameter search. In general, though, my most interesting projects usually involve some form of computer vision.
To arrive at where you are now, what did you focus on learning as you transitioned from graduate student to a career in the field of data science?
I needed to change my analysis perspective when I transitioned from economics to analytics. In economics, we rely on the appropriate methods of inference to detail economic relationships, whereas in analytics we commonly seek to predict or classify the unknown. This adjustment, although subtle, is quite profound because we move from identifying all patterns to identifying just those that are repeatable and persistent in the data. And that is really the key to prediction: identifying patterns or relationships that persist across data and time.
What technologies and methods do you primarily use?
I’ve spent several years working with traditional machine learning models, such as generalized linear models and tree-based models. But in the last three to four years my focus has shifted to deep learning, with a heavy emphasis on computer vision. Now, my work focuses on explaining computer vision models to understand their potential biases. I’m also researching neural architecture search strategies for deep learning models. I believe there are still considerable gains to be made in these areas.
Do you primarily spend your time coding or using visual tools?
I primarily spend my time coding, but for some projects, the visual tools are very nice to work with. I see the value in both types of tools and I like working with both.
What was the biggest lesson you learned on the job?
Data will always be dirty, and models will always have errors. The latter is a good thing, because it is through error that our models learn and improve. Without error, the learning process that models undertake would cease.
As for the lesson that “data is always dirty,” I adhere to the mantra “trust but verify.” Trust what you’re told about the data, but verify it for yourself. Run cross-tabulations and examine bivariate relationships. Explore the marginal distributions, and when the time is right, explore the conditional distributions.
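To make the “trust but verify” checks concrete, here is a minimal sketch in plain Python. The records, column meanings, and helper names (`marginal`, `crosstab`, `conditional`) are hypothetical and chosen only for illustration; they are not from Robert’s projects, and in practice these checks would more likely be run with a statistics package.

```python
from collections import Counter

# Hypothetical records: (region, churned) pairs standing in for a real table.
records = [
    ("west", "yes"), ("west", "no"), ("west", "no"),
    ("east", "yes"), ("east", "yes"), ("east", "no"),
    ("east", None),  # a missing value nobody mentioned
]

def marginal(rows, index):
    """Share of each value in one column, missing values included."""
    counts = Counter(row[index] for row in rows)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}

def crosstab(rows):
    """Joint counts for each (first column, second column) pair."""
    return Counter(rows)

def conditional(rows, given_index, given_value, target_index):
    """Distribution of the target column among rows matching given_value."""
    subset = [row for row in rows if row[given_index] == given_value]
    return marginal(subset, target_index)

region_marginal = marginal(records, 0)   # marginal distribution of region
churn_marginal = marginal(records, 1)    # marginal distribution of churn
table = crosstab(records)                # cross-tabulation of region by churn
churn_given_west = conditional(records, 0, "west", 1)

for cell, count in sorted(table.items(), key=str):
    print(cell, count)
```

Even on a toy table like this, the cross-tabulation surfaces the missing value in the second column, which is the kind of surprise the verification step is meant to catch.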
What emerging trends are you seeing in the field of data analytics? How are roles evolving and what new skills should analysts focus on?
I see synthetic data generation taking on increased importance in the future, especially as the models that generate the synthetic data get better and better. Imagine removing data, or the lack thereof, from your list of limitations. Furthermore, you can strengthen your model’s generalization performance by generating “edge cases”: highly improbable instances that are not impossible but may not exist in your historical data. For example, imagine a plane flying through a lightning storm. You may not have much data representing that scenario, but it is one that can happen. In addition, imagine that the limitless data you generate is already “clean,” which could save the data scientist many hours.
Have you worked on any data for good projects?
I participated in a data for good hackathon a few years ago where we were predicting wildfires. And early in my career, I volunteered my analytic skills to the Full Belly Project based out of Wilmington, NC. But unfortunately, I have not had an opportunity to work on many data for good projects. I have thought about volunteering my data science skills to the Bill & Melinda Gates Foundation, but to date, I haven’t pursued that idea.
This question really touches on one of the great aspects of data science – you can have an impact on the world. The impact you make is up to you and if you want to make the world a better place then applying your data science skills to data for good projects is an excellent way to go.
How has your career helped you build the life you want?
In my opinion, it all starts with doing what you love and working on a task that you genuinely find interesting. After that, the rest of the pieces just seem to fall into place. For me, data science allows me to learn, create, and grow so I never feel bored or stagnated. I can literally do my work from anywhere in the world, so when my wife’s career led us to San Diego, SAS supported me in the move. Now, I’m just a beach bum data scientist.
When is your favorite time to code, and what is your brain fuel? Like… 5-hour Energy?
You know me well! 5-hour Energy is my favorite brain fuel.
I may not have a favorite time of day to code, but I do have a favorite mindset. I find that I code best when I’m working on something novel with others where my work will be shared with teammates or combined with their work. I suspect I enjoy the collaborative aspect and for some reason, I enjoy coding better in those scenarios.
What step of the analytics life cycle do you work with?
Most of my work falls under the model-building process within artificial intelligence. It’s good to know how to operationalize your models, but for those tasks I usually rely on teammates who are better at them than I am.