The Importance of Being a Data Scientist

This post is a nod to one of my favourite plays, The Importance of Being Earnest by Oscar Wilde. As the title ‘Data Scientist’ becomes more common, what can we learn about the importance of titles and labelling from this century-old play?

For those that haven't read it, the story revolves around a man who goes by the name Earnest and has a reputation for being earnest (i.e. truthful and trustworthy). He is much loved by a lady for having the name Earnest - she has always wanted to marry a man with that name, believing men named Earnest are earnest (deep breath).

Well it turns out his name isn't really Earnest (irony 1 - he's not actually earnest despite his name) and the lady considers dumping him, but by a comic twist of fate it turns out that it actually is Earnest (irony 2 - he actually was earnest even though he didn’t think he was).

So what is the importance of being <insert a name or title>? As another wit once said "a rose by any other name would smell as sweet". But is that true? Our experiences have probably told us "No". Despite whatever skills we may have, a title comes with a reputation and expectations. Whether that’s someone named Earnest actually being earnest … or a Data Scientist being a magician with big data.

There have been many attempts at explaining what a data scientist is since the term was first coined in 2008 – Wikipedia, HBR, KDnuggets, Marketing Distillery – but the general definition is someone who combines equally high skill levels in:

a. Statistics.
b. “Hacker” programming.
c. Communication.
d. Business.

The number of people who actually satisfy this definition is a popular subject in discussion forums and papers, and it’s also interesting to ask from what perspective the attributes are judged (good communication skills for a marketer are expected to differ from good communication skills for a programmer). What everyone does agree on is that many Data Analysts, Data Miners, Statisticians, Econometricians and the myriad other titles of the last 50 years all have these attributes, but in varying proportions.

I've met many analytics practitioners over the years from different parts of the world. Some quantitative analysts have chosen to change their titles to Data Scientist to make themselves more attractive to employers as the Statistician and Data Miner titles go out of favour. Others, who are very close to the purist definition of a data scientist, may not title themselves as such, adamantly sticking with the title they have had for many years.

On the other hand, in many cases, employers who advertise for Data Scientists are actually looking for:

  • Quantitative analysts with innate curiosity to learn and innovate – a trait of most people from mathematics, sciences, engineering and economics disciplines.
  • Candidates who meet some minimum criteria in the four attributes – many of which can be taught.
  • Those who are strong in a subset of prioritized attributes to suit a function within a team.

Over the years, given the right drivers, these partially defined Data Scientists could become strictly defined Data Scientists – but in a collaborative team environment, you will likely find that having a whole team of these individuals is not important. The two realities are that there are far fewer examples of organizations looking for the latter than the former, and these tend to be for commercial research and development arms; and individuals that embody all the attributes of a Data Scientist in the “right” amounts are rare.

Therefore, for those looking to hire Data Scientists, my advice is:

  • It’s much more important to first start with understanding the functions and expectations of the team within an organisation.
  • Then create roles that fit the needs of that team and “be much more specific about the type of worker you want to be or hire” (Tom Davenport, Wall Street Journal).
  • Be realistic about the skills currently available in the market and the tertiary education programs on offer.

For those looking for a role as a Data Scientist:

  • Start developing the attributes you are weakest at through classroom and self-service training, because all-round skills are always sought after.
  • Keep developing the attributes you already excel at as the big data analytics market is constantly evolving.
  • Stay curious about new techniques and worldwide trends.

So, the importance of being a Data Scientist is that it makes you more attractive to employers, but what employers are usually looking for is some flavour thereof, rather than strictly defined criteria. Even though a prospect may not be the purest definition of a Data Scientist, he or she may turn out to be just what an organization needs. Therefore, be sure of what you want to be and what you’re looking for, and don't judge prospects and opportunities based on a title… lest you end up dumping your Earnest before he becomes an Earnest!

Learn more. Stay curious.



Taking advantage of the analytics opportunity

The rise of analytics and big data presents a once-in-a-generation opportunity for organisations to put themselves at the cutting edge, to create a competitive advantage by developing a culture of analytical success within their organisation. Yet most seem unable to grasp the opportunity that is within their reach.

Eric Hoffer wrote that "In times of change, learners inherit the Earth, while the learned find themselves beautifully equipped to deal with a world that no longer exists." So why in this time of rapid change are our organisations failing to invest in educating their most important asset - their employees?

Organisations should be investing time and money into programs and courses to supercharge their staff for success with analytics in business.

With the recent advances in analytical programs and training, many staff will be new to the latest education. Even those with relevant education may not have had the chance to complement their knowledge with industry experience; as a result, they may be missing essential domain knowledge and communication skills.

Three things you can do to invest in your organisation’s biggest asset and secure a competitive advantage:

  1. Involve staff in classroom training courses and workshops available that provide the skills and experience needed to succeed in a rapidly changing environment. Couple this with professional accreditations where they are available.
  2. If struggling for initial budget to invest, access complimentary training to demonstrate the value created. For example, at SAS we have made our Programming 1 and Statistics 1 eLearning courses available at no cost, accessible online. Showing the ROI through investing time to develop skills can build support for investment in further education.
  3. Look out for some of the full university degree programs being developed around Data Science, Business Analytics and Advanced Analytics. Integrate these into individuals’ professional development plans.

Interested in learning more about education programs? Please leave a comment below or tweet me @james_enoch


It doesn’t matter how many resources you have. If you don’t know how to use them it will never be enough.


Top 5 skills you need when applying for a Data Scientist role

In guest lectures I give at universities, I often refer to the Harvard Business Review report which states that being a Data Scientist is the sexiest job of the 21st century. Naturally, this always seems to capture the students’ attention, and drives their enthusiasm to sit up and listen carefully. As a result, one question I am consistently asked by inquisitive students is, “What skills do I need to become a Data Scientist?”

From my experience solving analytical problems, and from conversations with customers looking to hire their next Data Scientist, I have drawn out the 5 most sought-after skills you need to consider when applying for a Data Scientist role. Here are my top 5:

5. Know how to develop a predictive model using regressions and decision trees

This doesn’t sound too sophisticated to a pure statistician, I know, but businesses want to know the best outcome for a particular event. The most common business questions asked are, “Which customers are most likely to leave? Which customers will take up a product? Which customers should I approve for a loan?” In most cases a regression or decision tree results in an easy to explain predictive model to address these questions. More importantly, they are easy to productionise to meet the organisation’s demands.

This is a data science skill that contributes to companies creating a more targeted customer experience to increase profits while reducing marketing spend – a fantastic result for organisations!
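
To make the "easy to explain" point concrete, here is a minimal sketch of the simplest possible decision tree: a single split (a "stump") fitted to a toy churn table. The customer records, field names and the brute-force fitting routine are all assumptions made up for illustration; a real project would use a proper modelling tool and far more data.

```python
# Toy churn data: (months_as_customer, complaints_last_year, churned).
# Every value here is invented for illustration.
CUSTOMERS = [
    (2, 3, 1),
    (4, 2, 1),
    (6, 4, 1),
    (8, 0, 0),
    (12, 1, 0),
    (24, 0, 0),
    (30, 2, 0),
    (36, 0, 0),
]
FEATURES = ["months_as_customer", "complaints_last_year"]


def fit_stump(rows):
    """Try every feature/threshold split and keep the one with the fewest errors."""
    best = None
    for f_idx in range(len(FEATURES)):
        for threshold in sorted({row[f_idx] for row in rows}):
            for label_if_low in (0, 1):
                errors = 0
                for row in rows:
                    predicted = label_if_low if row[f_idx] <= threshold else 1 - label_if_low
                    errors += predicted != row[-1]
                if best is None or errors < best[0]:
                    best = (errors, f_idx, threshold, label_if_low)
    return best


def predict(stump, row):
    """Apply the fitted rule to one customer's feature values."""
    _, f_idx, threshold, label_if_low = stump
    return label_if_low if row[f_idx] <= threshold else 1 - label_if_low


stump = fit_stump(CUSTOMERS)
_, f_idx, threshold, label_if_low = stump
print(f"Rule: if {FEATURES[f_idx]} <= {threshold} then churn={label_if_low}")
```

The fitted model is a single sentence a business user can read, which is the appeal of trees and simple regressions when the model has to be explained and put into production.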

4. Know how to develop a segmentation model

The first thing organisations want to do with their database is understand the characteristics displayed by their customers. And of course, from a marketing perspective, they want to know what groups of customers look like and what makes the groups different. Applying the skill of clustering (and there are many different kinds in this discipline), to obtain cohorts or segments that are distinct, is extremely valuable in driving successful business outcomes. It may be one of the first tasks you are asked to perform in a Data Scientist role.
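
As a rough sketch of what clustering does, the following plain k-means groups six invented customers into two segments. The two features, the starting centroids and the number of segments are assumptions chosen for the example; k-means is only one of the many clustering techniques mentioned above, and real data would normally be standardised first so that no single field dominates the distance.

```python
def nearest(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids


# Invented customers: (monthly_spend, visits_per_month).
CUSTOMERS = [(20, 1), (25, 2), (22, 1), (90, 8), (95, 9), (100, 10)]

centroids = kmeans(CUSTOMERS, centroids=[CUSTOMERS[0], CUSTOMERS[-1]])
segments = [nearest(c, centroids) for c in CUSTOMERS]
print(segments)
print(centroids)
```

The centroids are what you would describe to the business: each one is the "typical" customer of its segment.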

3. Know how to use SAS with R

It is no secret that R is the common tool of choice for many students graduating from Information Management courses. But the reality is that when you need to apply analytics to commercial corporate data, which is often growing exponentially, you must be able to combine R skills with SAS skills. Organisations have invested in analytical enterprise platforms that are often already embedded successfully into the organisation's model lifecycle environment. So, those who bring a combination of SAS and R skills will make valued Data Scientists.

2. Know how to access relevant data quickly

About 80% of a Data Scientist's work is focussed on knowing where the appropriate data is housed and how to access relevant data quickly. At one company I worked at, I developed a data dictionary for every data source I needed to access. The data dictionary was like the Holy Grail of fast and accurate data extraction, and let me get on with the science of deriving insights from the data. Everyone wanted a copy of the data dictionary… even IT!
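
A data dictionary does not need to be elaborate to be useful. The sketch below shows one possible shape for it, with two lookups on top. Every table name, field and description is hypothetical and only there to show the idea; it is not the dictionary described above.

```python
# A hypothetical data dictionary: all sources, fields and descriptions are invented.
DATA_DICTIONARY = {
    "customer_id": {
        "source": "crm.customers",
        "type": "string",
        "description": "Unique customer identifier; join key across sources",
    },
    "tenure_months": {
        "source": "billing.accounts",
        "type": "integer",
        "description": "Months since the customer's first active contract",
    },
    "churn_flag": {
        "source": "analytics.churn_history",
        "type": "boolean",
        "description": "True if the customer left within the observation window",
    },
}


def find_fields(keyword):
    """Return field names whose name or description mentions the keyword."""
    keyword = keyword.lower()
    return sorted(
        name
        for name, meta in DATA_DICTIONARY.items()
        if keyword in name.lower() or keyword in meta["description"].lower()
    )


def sources_for(fields):
    """List the distinct sources needed to extract the given fields."""
    return sorted({DATA_DICTIONARY[f]["source"] for f in fields})


print(find_fields("contract"))
print(sources_for(["customer_id", "churn_flag"]))
```

Even this much answers the two questions that eat up the most time: which field do I need, and where does it live?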

1. Know how to articulate your analytical results to drive business outcomes

Communication is critical to your success. I often practise communicating the business outcomes of my analytical results with colleagues who aren’t analytically inclined. Only once I know that they understand the value of the results in a business context am I satisfied that my analytical results are actually useful to my organisation. You need to move away from statistical jargon and be creative with how you illustrate data and models in a business-friendly manner. Do it in a fashion that is easy on the eyes and ears – you don’t want to scare people away, but rather welcome them into the journey of business analytics.

The tips described above aren’t rocket science. However, they’re not something you develop overnight either. If you can practise these skills continuously, and keep them at the front of your mind when applying for a Data Scientist role, then you will be a step ahead of your competition. Success awaits!

All opinions are my own, and based on my experience, conversations and feedback from the professional field and customers looking to hire quality Data Scientists.


Five Keys to Successful Stress Testing

Stress testing is not new to the risk world, but it has been a major focus since the Global Financial Crisis (GFC). For a number of years now, stress testing has helped analytical specialists quantify various aspects of potential loss. What is new is the introduction of regulatory stress tests, which have added to the workload that banks face. Banks are also faced with the increasing complexity, increasing frequency and firm-wide nature of the regulatory and economic scenarios that must be tested. As a result, stress testing programs are placing increasing demands on an organisation’s people, processes and IT systems – particularly at banks and now at insurance companies in Australia. This change in scope has introduced several challenges. On the technology side, the challenges are mainly due to the scale and complexity of the underlying data required to build out the scenarios and stress tests. This is stretching the capabilities of existing computing resources to deliver timely responses.

Our banking and insurance customers are very focused on stress testing. In fact, I engaged with a number of Australian customers this year to explore stress testing automation, soon after their teams had spent weeks of late nights and weekends completing the first round of APRA’s annual stress test.

Our discussions uncovered a common set of challenges across institutions:

  • Limited stress-testing framework: Stress testing has historically been an isolated exercise by a bank’s risk function. Different stress events and assumptions are used in each model, and outputs cannot be easily aggregated into a meaningful, combined result.
  • Lack of granularity: Today’s systems are often incapable of providing the granularity needed in each asset class at the individual position or facility level.
  • Insufficient coverage: Disparate data in large quantities prevents a consolidated view of all assets. This is further exacerbated when trying to bring risk and finance data together.
  • Inconsistency: Multiple versions of data, valuation methods and models persist, making it practically impossible to achieve consistency among calculations and measures.
  • Organisational silos: While many trading desks and risk groups conduct stress tests to supplement their risk analyses, they are done in silos by individual analysts, making it impossible to verify that assumptions are consistent.
  • Coordination across functions: Consolidating and analysing the interdependencies across strategy, risk, budgeting and treasury activities adds even greater complexity.

Different functions across multiple business lines and geographies may participate in strategy formulation, stress testing and capital planning.

Based on recent work with customers, leading consulting companies and auditors, SAS identified the following key focus areas required to deliver a successful stress testing program:

  • Efficiency: Integration of existing risk models and data hierarchies into a streamlined data infrastructure for firm-wide stress testing. Data, computations and reporting must all be part of a unified platform.
  • Performance: The efficient aggregation of results for all major risk models across the organisation with the ability to run complex, forward looking stress tests with multiple parameters.
  • Enterprise View: Views of economic capital and pro forma financials at the enterprise level with a view of market, credit and liquidity risk. Leading practice is to extend this to include iterative rounds of planning in terms of anticipated business growth and treasury funding activities.
  • Transparency: The ability to understand and document model assumptions, design and structure so they are readily apparent to management and regulators with the elimination of the “Black Box” characteristic of some models.
  • Compliance: The capability to address major regulatory stress testing issues as they evolve with the ability to integrate them into the risk decision making process.

To explore some of the advances in technology that can help you meet evolving requirements, listen to the on-demand webinar “Stress testing: A fresh point of view.” Industry experts discuss how banks are using analytics to:

  • Run full cycle stress tests in days instead of weeks;
  • Create consistent and repeatable processes; and
  • Handle numerous valuation methods, disparate data and stress testing models.

How does big data power your electricity?

The energy & utilities industry as a whole has experienced a seismic shift over the past five years due to rising costs and price pressures, and has become a priority discussion on the political and media agenda. Falling demand overall combined with “peakier” peaks is making supply, forecasting and public perception management more difficult by the day.

For the gentailers (generator-retailers) there is a data-rich environment that is driven by a need for marketing, loyalty and retention, but also to ensure accurate forecasts are aligned with demand and that sound and profitable trading decisions are made. Competition is fierce in the industry as Australia has moved to a contestable market in most jurisdictions. This competitive pressure has meant we are seeing some of the highest customer churn rates in the world – pushing towards 30 percent including connects and disconnects in some states. This has forced retailers to be very focused on what they need to do to maintain customer profitability and prevent customer churn.

An industry driven by customer data

The industry comprises two types of organisations – the network side of the business and the gentailers – both of whom approach their customers in vastly different ways.

On the network side of the business there are continuing cost pressures. The recent modernisation of the network infrastructure has contributed the greatest cost inputs into the overall price in the past five years particularly in the larger jurisdictions such as NSW. This period of increased investment has coincided with the Global Financial Crisis, as well as other environmental, economic and technology factors altering the country’s energy profile in a way never before experienced. As a result this has created a strong move and political will to drive down costs on the network side of the business and for the companies to engage better with their stakeholders.

The ability for utilities companies to better understand their customers, optimise pricing models, and manage demand to meet individual requirements is the driving force for keeping customers.

Data and analytics are helping utilities predict customer behaviour and identify those customers who are most likely to churn and the points at which that is likely to happen. For us as consumers, it means they’ll use the insight gleaned from all marketing channels – call centre, digital and other forms of marketing – to really understand us and ensure our pricing plans are based on our needs.


Video: Hear how Spanish energy company Endesa has used SAS to improve its performance, leading to a reduction in churn and increased profitability.

And the proof is in the data. For example, Endesa, a Spanish utilities organisation, operates in a highly competitive market following the move from a regulated to a deregulated industry. SAS Analytics was able to help them get to the core of what their customers were thinking and get in front of them by predicting and intervening on the likelihood of churn. Not only has this meant increased customer satisfaction and loyalty, but over two years they’ve reduced churn by 50 percent.

Understanding the customer to drive retention is just one part of the complexities faced by energy and utilities organisations. The other part is understanding a customer’s behaviour and in particular how they generate demand for electricity. Interestingly in Australia demand is actually falling and becoming much more based around peak times. Utilities need to understand exactly when those peaks are going to happen so trading decisions are optimised. And network companies are focused on more accurately predicting requirements and identifying where they should apply their capital budget to assets in the short, medium and longer term.

The changing energy landscape in Australia

The uptake of solar and renewable installation across Australia is also contributing to a change, particularly around peak load forecasting in ways traditional forecasting methods struggle to predict. Through advanced analytics gentailers are using the data available to better understand the profile of solar and renewable users and how it affects load predictions, particularly on the peak days. For the company, the service provided is much more aligned with reality and investment isn’t wasted on unnecessary infrastructure.

As an example, one of the utilities organisations in this region has been able to trade much more effectively by significantly reducing its MAPE (Mean Absolute Percentage Error). Typically in ANZ, this has often been 10 percent or higher. This SAS customer has been able to reduce that to considerably less than 3 percent. For the company, this translates to millions on the bottom line and provides an understanding of the actual requirements of energy users. Increasing accuracy means utilities don’t overbuy electricity – which in turn has a positive effect on consumer electricity bills.
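
For readers who have not met the metric, MAPE is the average of the absolute forecast errors, each expressed as a percentage of the actual value. A minimal sketch follows; the load figures are invented for illustration and are not the customer's data.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, returned as a percentage."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Invented load figures in MW, purely for illustration.
actual_load = [100.0, 200.0, 400.0]
forecast_load = [110.0, 190.0, 400.0]

print(f"MAPE: {mape(actual_load, forecast_load):.1f}%")
```

Here the three errors are 10 percent, 5 percent and 0 percent, so the MAPE is about 5 percent. Because each error is divided by the actual value, a miss on a low-demand period weighs as much as the same percentage miss on a peak, which is worth keeping in mind when peaks are what drive trading decisions.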


Thinking like the customer: the value in mapping the customer journey

Data-driven marketing is all about how marketers can harness data and analytics to create a more customer-centric, fact-based approach to customer engagement. This, combined with quality execution, leads to better customer experiences and improved customer equity.

However, when we look at customer-brand interactions in silos, such as in the call centre and separately online, we don’t gain an accurate view of the customer's lifecycle. We need to be able to look at the entire customer lifecycle and every touch point it involves across all channels, including call centres, in-store and online.

Now, it’s easy to say all of this in theory. However, when it comes to actually implementing this or obtaining data on all these touch points – especially when you have thousands, if not millions, of customers – the process becomes somewhat complex. It’s no wonder that some marketers struggle to get a holistic view of how they should be engaging with their customers.

This is where customer intelligence comes into play.

Data plays a huge role in mapping out the customer journey, which is one of the key enablers for understanding individual preferences and motivations to create that feeling of appreciation and recognition. Not only does this build long term value into the conversation, it ensures consistency in messaging across sales and services efforts.

It’s important to ask ourselves, as professional marketers, how we could go about our job differently in an attempt to further enhance our perception and knowledge of customer lifecycles.

It is now widely accepted that customer centricity is a priority in the boardroom. Recognising that customers are the most important asset to any organisation is crucial, and many have come to this conclusion through practice. However, the challenge is determining how to develop this holistic relationship with individual customers. If marketing is unable to gain this understanding, they will struggle to position themselves strategically in the boardroom.

So, how do you avoid the struggle and embed customer centricity in the company’s DNA?

Think of how you approach personal interactions and relationships – when and how did you last speak with that person? What was the context? Was it on the phone, email or in person? Was it a defining moment for them, and if so was it positive or negative? All of these are valid questions that can, and should be considered when creating a mutually valuable interaction with another individual, whether it be personal or with a customer. Customer intelligence helps accomplish this. It allows you to scale that understanding into a holistic communication program across multiple channels, ultimately enabling you to take advantage of those ‘defining moments’ and make tailored offers or decisions that are relevant and timely to that individual.

The relationship you form with a customer needs to reflect the interpersonal relationships you have formed in your own life. So, things like common interests, past conversations, preferences – all that jazz – have to be accessible and utilised in order to build a long-term relationship with that particular customer. Customer intelligence does exactly that: it provides readily available information that can be used, at one’s own discretion, to target that persona’s specific needs.


Whatever happened to customer segmentation?

I recall that in the not too distant past customer segmentation, in its many guises, was a flavour of the times. Segmentation was (and still is), however, a greatly misused term, with organisations confusing the (correct) approach of offering a strategic view of a customer base with the more tactical initiatives of predictive modelling.

I have come to be aware that ‘true’ customer segmentation seems to be disappearing as an input from the planning cycle.

To recap, traditionally there have been a number of ways for organisations to segment their customer base. A few examples of how segmentation could be performed are by:

  • Profitability.
  • Life-stage.
  • Behaviour.
  • Attitude.

The choice of segmentation would always depend on the business application.

Behavioural segmentation will utilise a large number of behavioural related data fields to define a relatively small number of roughly homogeneous segments, each of which is distinct in terms of its dominant characteristics.

Such systems are quite different from traditional segmentation schemes which are largely intuition-led (rather than data led), and are defined in terms of very few fields (Recency-Frequency-Value is typical of this class).
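
For contrast, a traditional Recency-Frequency-Value scheme can be written in a handful of lines, which is exactly why it is so common. The sketch below is illustrative only: the three cut-offs and the segment names are assumptions picked by intuition, as such schemes usually are, not recommended values.

```python
# Hypothetical cut-offs, chosen by intuition rather than derived from data.
RECENT_DAYS = 90
FREQUENT_PURCHASES = 6
HIGH_VALUE_SPEND = 500.0


def rfv_segment(days_since_last_purchase, purchases_last_year, spend_last_year):
    """Allocate a customer to a segment using only three fields."""
    score = (
        (days_since_last_purchase <= RECENT_DAYS)
        + (purchases_last_year >= FREQUENT_PURCHASES)
        + (spend_last_year >= HIGH_VALUE_SPEND)
    )
    return ["lapsed", "occasional", "developing", "core"][score]


print(rfv_segment(30, 10, 800.0))
print(rfv_segment(200, 1, 50.0))
```

A behavioural segmentation replaces these three hand-set rules with many data fields and lets the groupings emerge from the data.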

Segmentation should always have been considered as a powerful tool within customer relationship management and its role is at the heart of customer relationship building.

As previously stated, a statement such as “derive segments for my next campaign” is completely at odds with what a ‘true’ customer segmentation should stand for. Customer segmentation should not be confused with specific campaign activity.

So what does segmentation stand for?

In my view, it should be to use available data or information to create natural groupings of customers or prospects which:

  • Simplify the description of a large heterogeneous customer base or market.
  • Suggest tailored marketing strategies, objectives, hypotheses and idea generation.

Importantly, it is worth remembering that there is NO UNIQUE solution and what is best will depend on the application.

Applications and benefits

Segmentation needs to be considered as a planning tool, usually for marketing. It lends itself to:

  • Improved relationship building through tailored customer management and marketing strategies.
  • Segment-specific targets & performance monitoring.
  • Cross-selling plans.
  • Broad brush targeting.
  • A new sampling frame.
  • Framework to measure overall success of customer strategies, as opposed to campaign-by-campaign evaluation.

So the words “planning” and “strategy” are key to what segmentation is meant to deliver. The reference to “targeting” has been prefaced by “broad brush” because segmentation, as an approach, delivers an inexact allocation of customers to segments.

Although historically segmentation systems have been derived through the use of complex multivariate statistical methods, a more sensible approach has been to combine logical rules with approaches such as clustering.

If segmentation is a planning and strategic tool then targeting and predictive models can be viewed as a series of “knitting needles” that cut through the segments selecting individuals for specific marketing activity. True, there may be more customers selected from certain segments but customers are likely to be selected from ALL segments.

There may still be significant variation between customers within each segment; therefore, communicate with individual customers and not segments.

Customer segmentation as a planning tool

This weapon in the marketing planners’ armoury appears to have fallen out of favour in recent times – more and more organisations appear to ignore this powerful strategic tool.

Why might this be the case?

  • The term “customer centricity” for one thing might have steered organisations towards solely thinking about more tactical activity, focused at the individual customer level, and actioned through specific targeting criteria for a very specific action. However, that specific action should have been driven from a thorough understanding of the customer base, with a key part of that coming through identifying the customer segments that exist.
  • Organisations talk in terms of the “segment of one”. This is not an advancement on traditional segmentation applications; it is getting to know a customer as an individual, which of course requires investment in new technologies and advanced analytics. But it is not “segmentation” in the true meaning of the term.
  • The approach to planning has changed. Have marketing planners and strategists simply turned their back on segmentation as a tool to help them? This may be due to:
    • Lack of understanding as to what segmentation is and how it can help
    • Perceived other priorities, with the headlong rush to be reactive (to competitors) rather than proactive, driving activity through segmenting the customer base and “knowing your customers”
    • Lack of analytical support and skill to develop a tailor made segmentation system

Why is segmentation still important?

Customer segmentation as described here should still be considered an important part of the marketing planning process and will give organisations an enhanced ability to tailor marketing and management strategies.

Segmentation can deliver an even more powerful tool given the increase in the volume and type of data now available for analysis. The ability to incorporate unstructured, web and social media data, campaign response and channel data as drivers of segmentation systems means that segments can be understood in even greater detail and hence are even more powerful inputs into the planning cycle.

Historically it could be a difficult task for the analyst to explain segments and what constitutes the typical features of an individual segment and what makes a segment different from another.

With the advent of powerful visualisation tools on the market (for example SAS Visual Analytics), it is possible to easily demonstrate the key features of segments and the differences between them. Additionally, segment migration and membership can be reported on through a range of segment-specific reports viewed in such tools.

These will undoubtedly make the understanding of segments far easier and hopefully reinforce the value and acceptance of segmentation at the heart of customer relationship building.


The Role of Data Management in Risk and Compliance

As a Data Management expert, I am increasingly being called upon to talk to risk and compliance teams about their specific and unique data management challenges. It’s no secret that high quality data has always been critical to effective risk management and SAS’ market leading Data Management capabilities have long been an integrated component of our comprehensive Risk Management product portfolio. Having said that, the amount of interest, project funding and inquiries around data management for risk have reached new heights in the last twelve months and are driving a lot of our conversation with customers.

It seems that not only are organisations getting serious about data management; governments and regulators are also getting into the act by enforcing good data management practices in order to promote stability of the global financial system and to avoid future crises.

As a customer of these financial institutions, I am happy knowing that these regulations will make these organisations more robust and stronger in the event of future crises by instilling strong governance and best practices around how data is used and managed.

On the other hand, as a technology and solution provider to these financial institutions, I can sympathise with their pain and trepidation as they prepare and modernise their infrastructure in order to support their day to day operations and at the same time be compliant with these new regulations.

Globally, regulatory frameworks such as BCBS 239 are putting the focus and attention squarely on how quality data needs to be managed and used in support of key risk aggregation and reporting.

Locally in Australia, APRA's CPG-235, in which the regulator provides principles-based guidance, outlines the types of roles, internal processes and data architectures needed to build a robust data risk management environment and to manage data risk effectively.

Now I must say as a long time data management professional, this latest development is extremely exciting to me and long overdue. Speaking to some of our customers in the risk and compliance departments, the same enthusiasm is definitely not shared by those charged with implementing these new processes and capabilities.

Whilst the overall level of effort involved in terms of process, people and technology should not be underestimated in these compliance related projects, there are things that organisations can do to accelerate their effort in order to get ahead of the regulators. One piece of good news is that a large portion of the compliance related data management requirements map well to traditional data governance capabilities. Most traditional data governance projects have focused on the following key deliverables:

•      Common business definitions

•      Monitoring of key data quality dimensions

•      Data lineage reporting and auditing

These are also the very items that the regulators are asking organisations to deliver today. SAS’ mature and proven data governance capabilities have been helping organisations with data governance projects and initiatives over the years and are now helping financial institutions tackle risk and compliance related data management requirements quickly and cost effectively.

Incidentally, our strong data governance capabilities along with our market leading data quality capabilities were cited as the main reasons SAS was selected as a category leader in Chartis Research’s first Data Management and Business Intelligence for Risk report.

The combination of our risk expertise and proven data management capabilities means we are in a prime position to help our customers with these emerging data management challenges. Check out the following white papers to get a better understanding of how SAS can help you on this journey.

•      BCBS 239: Meeting Regulatory Obligations While Optimizing Cost Reductions

•      Risk and Compliance in Banking: Data Management Best Practices

•      Best Practices in Enterprise Data Governance



Treat Yourself with Text

Did you check your email and/or favourite social media site before leaving for work this morning? Did you do it before getting out of bed? Don’t worry, you’re not alone. I do it every morning as a little treat while I “wake up” and whether I realize it or not, sometimes it sets the tone for the rest of the day.

The other day I was looking at my Facebook news feed and a couple of things drew my attention. One of them was an abridged transcript of an interview on Conan with George R. R. Martin, the writer of the A Song of Ice and Fire books on which the TV series Game of Thrones is based. Because I don’t have much time in the mornings, had the article been much longer I would probably have stopped partway through. If I had, I would have missed this quote right at the end: “I hate some of these modern systems where you type a lowercase letter and it becomes a capital. I don’t want a capital. If I’d wanted a capital, I would’ve typed a capital. I know how to work the shift key.” This put a smile on my face and put me in a good mood for the rest of the day.

This made me think.

What if I had woken up just 5 minutes too late to have read the whole thing, or the article at all? How many other interesting things could I have missed because I didn’t have time or had a “squirrel” moment? And as everybody with a social media account knows, there is too much out there to delve into everything.

With all that data, how can I be sure that I’m exposing myself to the most interesting information? Maybe it’s just a matter of looking at how many people have viewed something. But then I like to think I’m unique so that doesn’t really work. Maybe I can only look at things my closest friends recommend. But that’s more about being a good friend than being interested. Sorry friends.

Let’s take this situation to your organisation. How do you know what information is relevant to your business? There are a myriad of techniques to analyse the structured data that the data warehouse folks have invested a large amount of time designing and the business folks have spent vetting. But how about the unstructured data – the text-based data like survey responses, call centre feedback and social media posts? This data accounts for up to 90% of all information available for analysis.

How can we make use of this text-based data? Should you have people read through all the documents and give their opinions on themes and trends? That approach carries an inherent bias and unreliability, as people have different backgrounds and perspectives. And it’s unlikely that one person would be able to read everything in a timely manner. On top of all this, what we need is more than just a word search. It’s more than word counts or word clouds. It’s more than just discovering topics. In fact, what we really need is to attach a measure of prediction to words, phrases and topics, so that we can:

  • Escalate an incoming customer service call to the client relations team because the caller has used 3 key “future churn” phrases in the right order.
  • Redesign a product because very negative feedback always contains words that are classified under “aesthetic features”.
  • Discover the true customer service differentiators which give a positive Net Promoter Score (NPS).
  • Identify the areas where law enforcement should increase its presence to protect the public from activities being promoted on social media that are likely to have dangerous outcomes.
  • In a B2B scenario, understand and communicate the gaps in knowledge of the client organisation based on the volume, topics and severity of support calls they put through to the service organisation.
  • Determine the root cause of employee concerns and the best methods to manage them.
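The first scenario in the list above can be sketched in a few lines of Python. This is only an illustration: the phrase list, the function names and the escalation rule are hypothetical and are not taken from any SAS product. A real "future churn" rule would be derived from a predictive model trained on call history, not typed in by hand.

```python
def phrases_in_order(transcript, phrases):
    """Return True if every phrase occurs in the transcript, in the given order."""
    text = transcript.lower()
    position = 0
    for phrase in phrases:
        found = text.find(phrase.lower(), position)
        if found == -1:
            return False
        # Continue searching only after the end of the phrase just matched.
        position = found + len(phrase)
    return True


# Hypothetical "future churn" phrases, assumed purely for illustration.
CHURN_PHRASES = ["third time", "cancel", "competitor"]


def should_escalate(transcript):
    """Flag a call for the client relations team when all phrases appear in order."""
    return phrases_in_order(transcript, CHURN_PHRASES)
```

For example, `should_escalate("This is the third time I've called. I want to cancel and move to a competitor.")` flags the call, while the same phrases in a different order do not.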

We need Text Analytics to structure the unstructured text-based data in an objective way.

There are 2 sides to Text Analytics:

  • A statistical approach where text data is converted into numerical information for analysis, and words and phrases are grouped by their common pattern across documents. The converted data and groupings can then be used alone or combined with structured data in a statistical model to predict outcomes.
  • A linguistic approach or Natural Language Processing (NLP) where logical text-based rules are created to classify and measure the polarity (e.g. of sentiment) of documents.
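To make the two sides concrete, here is a minimal sketch in plain Python, assuming a toy corpus and hand-picked word lists (both hypothetical). The first function shows the statistical side by converting documents into a matrix of word counts; the second shows the linguistic side with a simple rule that scores polarity and lets a negator flip the word that immediately follows it. Production tools use far richer models and rule languages than this.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_document_matrix(docs):
    """Statistical side: turn each document into a row of word counts."""
    counts = [Counter(tokenize(doc)) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[count[word] for word in vocab] for count in counts]
    return vocab, matrix


# Hypothetical word lists, assumed purely for illustration.
POSITIVE = {"great", "helpful", "fast"}
NEGATIVE = {"slow", "rude", "broken"}
NEGATORS = {"not", "never"}


def polarity(text):
    """Linguistic side: rule-based score where a negator flips the next word."""
    score = 0
    flip = False
    for token in tokenize(text):
        if token in NEGATORS:
            flip = True
            continue
        value = (token in POSITIVE) - (token in NEGATIVE)
        if value:
            score += -value if flip else value
        # A negator only affects the token immediately after it.
        flip = False
    return score
```

The rows of the matrix can be fed into a statistical model alongside structured data, while the polarity score is the kind of measure a rule-based classifier attaches to each document.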

Both sides are equally important because, despite how far computing algorithms have advanced, there is still a lot of nuance in the way people speak, like sarcasm and colloquialism. By using techniques from both sides in an integrated environment, we can create a whole-brained analysis which includes clustering of speech behavior, prediction of speech and other behavior against topics, and prediction of the severity of sentiment towards a product, person or organisation.

One organisation which has been using Text Analytics from SAS for a number of years to provide pro-active services to its clients is the Hong Kong Efficiency Unit. This organisation is the central point of contact for handling public inquiries and complaints on behalf of many government departments. With this comes the responsibility of managing 2.65 million calls and 98,000 e-mails a year.

"Having received so many calls and e-mails, we gather substantial volumes of data. The next step is to make sense of the data. Now, with SAS®, we can obtain deep insights through uncovering the hidden relationship between words and sentences of complaints information, spot emerging trends and public concerns, and produce high-quality, visual interactive intelligence about complaints for the departments we serve." Efficiency Unit's Assistant Director, W. F. Yuk.

Whatever size your organisation is, and whatever purpose your organisation has, there are many sources of text-based data that are readily available and may already be amongst the structured data in your data warehouse, in Hadoop or on a network drive. By using this data to supplement the structured data many people are already analyzing, we can better pinpoint not only what is driving behavior but how we can better serve our customers and our employees. Wouldn’t it be great to know what is relevant to people in and out of your organisation without having to manually read thousands of documents?

Applying Text Analytics to your documents is treating yourself with Text because amongst the masses of words you will find nuggets which will brighten up your (and your company’s) day.

If you’re interested in treating yourself with Text and are in the Sydney region this July, sign up for the SAS Business Knowledge Series course Text Analytics and Sentiment Mining Using SAS taught by renowned expert in the field Dr Goutam Chakraborty from Oklahoma State University. Dr Chakraborty will also be speaking of his experiences in the field at the next Institute of Analytics Professionals of Australia (IAPA) NSW Chapter meeting.

Happy Texting!


Are you riding the analytics curve?

Ever looked into what being “behind the curve” means? I’ll save you the time – it is being less advanced or getting somewhere slower than other people. Remember grade school? No-one likes being behind the curve. That’s where people tend to get crunched. So how do you know if you’re ahead or behind the curve? Measurement. Thanks to the poll we ran in our last newsletter, you’ll be interested to know that most of your peers in Australia and New Zealand stated that they were either “behind the curve or comparable”.

Coincidentally when we ran our poll question, MIT Sloan Management Review and SAS surveyed 2000 executives globally which resulted in a report titled “The Analytics Mandate”. Results showed that, “organisations have to do more to stay ahead of the curve,” as Pamela Prentice, SAS chief research officer, put it. “Nine in ten believe their organisations need to step up their use of analytics. This is true even among those who report having a competitive advantage.”

For us in the antipodes, there’s good news and bad news. The good news is that you are not alone! Don’t feel bad - if you think your organisation is behind the curve with your analytics mandate, so do your global peers. Now’s the right time to shine and open that window of opportunity to get ahead of the curve!

Unfortunately, there’s bad news too. Isn’t there always?! Being behind the curve might not seem like a big deal. Your organisation is still profitable, right? Assuming you are and that you’re not going anywhere, have you heard the saying, “if you snooze you lose”? That’s the real issue. It’s not about what you’re not doing. It’s about what your competitors are doing. How will your business survive if this is not part of your future strategic goal?

I refer back to the Analytics Mandate research that states – an analytics culture is the most significant driving factor in achieving a competitive advantage from data and analytics.

Changing the culture of an organisation does not happen overnight. In fact, it is probably the hardest transition to make as you try to juggle people, process, technology and the big O - Opinion! But once it is achieved, the business outcomes and competitive advantage are rewardingly accelerated. Here’s proof from Australian Natural Care, a medium-size online retailer of vitamins and supplements, “We have taken the results we’ve been getting to board level to report on the wins and everyone is pleased. We went from having no analytic capabilities to building analytical models within four weeks of implementation. This has had a direct impact on our entire business.”

How does that happen? To drive analytical aspiration and encourage an analytics culture, the Analytics Mandate study recommends you answer the following:

1. Is my organisation open to new ideas that challenge current practice?
2. Does my organisation view data as a core asset?
3. Is senior management driving the organisation to become more data driven and analytical?
4. Is my organisation using analytical insights to guide strategy?
5. Are we willing to let analytics help change the way we do business?

How did you go? Did you score five out of five? If so, then you’re on your way to analytical competitive genius!

If you need some guidance, then you need to read Evan Stubbs' book, Delivering Business Analytics: Practical Guidelines for Best Practice, to get some ideas on mandating the Analytics Mandate.

And if you were analytically enthusiastic enough to read this complete blog and are among the first 25 readers to complete this Business Analytics Assessment, I will send you a signed copy of his book.

See you on the other side of the curve!

  • About this blog

    The Asia Pacific region provides a unique set of challenges (ความท้าทาย, cabaran) and opportunities (peluang, 机会). Our diverse culture, rapid technology adoption and positive market has our region poised for great things. One thing we have in common with the rest of the world is the need to be globally competitive while staying locally relevant. On this blog, our key regional thought leaders provide an Asia Pacific perspective on doing business, using analytics to be more effective, and life left of the date line.