Dutch Data Science, Part 7: Longhow Lam

For part 7 of this series, I had the pleasure of interviewing mathematician, former colleague, and data science “rock star” Longhow Lam. Since there’s no need for an office for his one-person company, we decided to meet for lunch in a very hot (31 degrees) and sunny Amstelveen city centre.

Company Overview

About a year ago, Longhow became an independent consultant after years of doubting whether he could make it on his own. Well, he could, and he does! Longhow mentions several reasons for starting his own company, but the main one was definitely the freedom of deciding where to work and what to work on. In his first year he was hired by two different organisations (ING Bank and the SVB, a government agency) where he worked on several projects.

As a side business he also provides after-hours training sessions for novice R users. R seems to be a constant factor in his career; his first job in 1997 was for a software distributor who sold the S-Plus system, the commercial statistical package that can be considered a sort of predecessor of R. And his An Introduction to R book is still available for download on the CRAN website. Nevertheless, Longhow notices a trend where Python is becoming more popular for data science purposes, at the expense of R. And though he loves coding to analyse data, he’s also a pragmatic senior data scientist who recognises the need for platforms that organise and govern multiple projects and support different user profiles or personas. He also wonders why so many data scientists prefer spending hours coding and writing 2,000 lines of code when they could get the same result with a flow-based solution like SAS® Viya® in a matter of minutes.

Asked about his future ambitions, he’s very pragmatic, as well: creating value for customers using data and analytics.

Cool projects

Over the course of his career, Longhow has worked on a truckload of cool projects, but the first one he mentions is the one he’s working on right now: speech-to-text conversion for the SVB, which receives about a million phone calls each year. He’s also trying to figure out how to digitise and analyse the 1 million letters the organisation receives yearly. His first “real” data science project dates back to the year 2000: predicting whether or not mortgage customers of a large bank were going to relocate, based on demographic and behavioural attributes. From his four-year tenure at SAS Netherlands, I remember a pretty scary one: using SAS Text Miner to build a model from 40,000 tweets by Dutch parliament members, then applying the resulting model to his inbox to determine the political preferences of his colleagues. The scary part is this: it was remarkably accurate. Definitely not GDPR compliant, so don’t try this at home.

Continuous Challenges

Operationalising analytics seems to be a recurring challenge for everyone I talk to, and Longhow’s no exception. It’s easy to build a model or derive insights from data, but the real work starts when it needs to be deployed, shared, managed, and adopted by an organisation. For himself he mentions finding the right projects to work on; everyone (especially recruiters) tries to window-dress simple reporting projects as innovative data science challenges, which makes it hard to separate the wheat from the chaff. His latest challenge (and frustration) is called “GDPR compliance.” Especially in public organisations, projects are stalled waiting for the lawyers to execute a required “privacy impact assessment,” which can take a long time in some cases.

The future of AI and Machine Learning

Longhow has been in the industry for 20 years and is reluctant to make any predictions because “they will almost always be wrong.” Plus, trends come and go; when people get tired of one cool technological promise, they simply jump on the next one that passes by, like the blockchain. For AI he distinguishes between “narrow” and “generic” AI. In narrow AI, very specific problems (like detecting anomalies in data or images) can be solved by algorithms, and often the computer does that better than any human could. Generic AI, however, is still far away in his opinion: at least five or ten years. There’s one prediction, though, that he’s willing to make: the demise of the data labs. In the past couple of years, many data and/or innovation labs have emerged, and lots of (open source) code has been developed: code that’s not manageable anymore, and that has too little proven business value. So, many of these experimental departments will cease to exist, simply because they have failed to deliver tangible value to the organisation. Let’s hope this will not result in dismissing the entire idea of using data and analytics for better decision making.

My take

Longhow is one of the smartest data scientists I know; I always loved working with him, which I still occasionally do when we hire him to assist us with projects, or when we participate in a hackathon together. What distinguishes him most, though, is not his technical ability but his creativity in solving problems and in applying data science to fields you wouldn’t have thought of yourself. For examples of this, have a look at his blog.
