Avoiding the pitfalls of multi-channel customer engagement

It seems like everyone is searching for ‘best practice’ these days. We are constantly looking to learn from what is being held up as good, leading and perhaps even the best itself. While this is a valid exercise, I believe we are missing an opportunity to take a closer look at ‘bad practice’ – the scenarios in which people, processes or technologies undermine the business case and the project, or parts of it, eventually fails.

However, we really should take that opportunity because there are important, valuable and very tangible lessons to take away from most such cases.

1. Avoid approaching a one-to-one customer engagement project as a data integration exercise

Building a data structure that supports the organisation in learning about customers’ interactions, responses and preferences over time is a data integration discipline, but the initial approach needs to avoid looking at it that way. The organisation might have 50 different systems holding customer and market data, but integrating the last 45 of them may add little incremental value and even less competitive advantage. I’ve seen organisations spend 12 to 18 months on data preparation and data design, but by the time their communication actually hits the channels, things have almost certainly changed. The customer might have new priorities and competitive forces could have shifted.

So my advice is – spend time thoroughly understanding where the valuable use cases and competitive differentiation are, and build the data processes, analytics and automation to address your highest-priority use case. Doing so will get you to a business outcome much faster. Moreover, it makes it much easier to ask for additional funding to add new data sources and new channels and to grow your model’s maturity.

2. Don’t overlook or underestimate how data-driven customer engagement impacts your current way of working

Tailoring emphasis and investment to an analytical way of going to market is easy in theory but hard in practice. Intelligent, real-time recommendations to point-of-sale systems or call centres are only smart if they are being actioned. Call centre workers are not marketers, and the churn rate in such teams is often high. So work with them and ask for their input on how offers and service messages should be served to make their everyday life easier. Ask what they think could create a better customer experience in their customer touchpoints. This will not only refine your requirements, but will also start the much-required change-management process at an earlier stage.

Take the same approach when aspiring to analytically optimise customer contacts – right message, at the right time – you know the mantra. Recognise that optimisation won’t work at all if the business process has brand managers or branch executives assigned groups of leads or customers to market to by the beginning of the month or quarter.

3. Don’t focus on functions and features before balancing them against a solid understanding of the implementation team’s skills and experience

Having been through buying cycles, implementation projects and even business-as-usual states a few times now, it always strikes me how much time and effort an organisation will invest in a near-FBI-style interrogation of functions and features when they are choosing systems to drive their multi-channel customer engagements. I’m not saying functions and features aren’t important, but the weight buyers attach to them needs to be balanced against a thorough understanding of the people and skills their vendors can provide in order to deploy and support the software in a timely and high-quality fashion.

As with my ‘overlooking’ and ‘underestimating’ point above – the days are long gone when marketing was just nice to have and ‘so be it’ if campaigns got delayed a little. If marketing is critical in driving tangible sales and customer experience outcomes, then systems selection and implementation require a close relationship with the software vendor/system integration partner. This will ensure the business implements the functions and features it needs, on time and at the right quality.

Take the Customer Intelligence Assessment


Science fact: “Model Factory” means getting the basics right.

I press a button, a miracle machine churns through all the calculations in the world and the answer to the Ultimate Question of Life, the Universe, and Everything[1] is produced as a single number. Oh hang on, that’s 42. Alright, for our microcosm, let’s stick to the answer to my customer’s behaviour, my products’ demand, when the widgets of my machines are going to fail or how my customer feels. Too much to ask?

Once upon a time this was science fiction. Not anymore.

As people across all parts of the organisation jump onto the potential of data, we naturally start asking more, bigger and harder questions. And we want the answers now. Minority Report[2] may be in the near future but for today, how do we just keep up with everyday demand and still have time to think of new questions to ask?

We may need our own miracle machine to give us space and time back.

In practice, we want something that works in the background to analyse data and derive predictions, with little need for intervention, but that is at the same time robust and trustworthy, so that the results are justifiable and make logical sense. Our machine for producing these predictions, a “model factory”, should automate, accelerate and maintain governance over a series of logical processes across the Analytics Lifecycle, from data preparation to exploration to model development to deployment and monitoring.

This comes down to a machine that has well-oiled technology, has clearly defined logic, and is kept up to date and maintained. Building and running this machine will take 8 Ps: people, process and product (technology), possibilities, and the old saying “preparation prevents poor performance”. Being prepared just means getting the basics right.
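As a toy illustration of the idea – automated, ordered stages with an audit trail – rather than any particular SAS product, a minimal “model factory” loop might look like this in Python (all data and logic are invented):

```python
# A toy "model factory": run the lifecycle stages in order and record
# each outcome for governance. Not a real product, just the shape of it.

def prepare(data):
    """Data preparation: drop missing values."""
    return [x for x in data if x is not None]

def explore(data):
    """Exploration: basic summary statistics."""
    return {"n": len(data), "mean": sum(data) / len(data)}

def develop(data):
    """Model development: a trivial 'model' flagging above-average values."""
    mean = sum(data) / len(data)
    return lambda x: x > mean

def run_factory(raw):
    """Run the stages in order, keeping an audit trail of each step."""
    audit = []
    clean = prepare(raw)
    audit.append(("prepare", len(clean)))
    stats = explore(clean)
    audit.append(("explore", stats))
    model = develop(clean)
    audit.append(("develop", "model built"))
    return model, audit

model, audit = run_factory([3, None, 7, 5, None, 9])
print(model(8), audit[0])  # True ('prepare', 4)
```

The point is not the trivial model but the structure: every stage runs automatically and leaves a record, so results stay justifiable.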

Possibilities: look beyond the "safe zone" for goals to strive towards.

  • Engage people (internal and external) who have done it before to develop ideas and plan a realistic roadmap.
  • Keep an ear out for new trends in process improvements and analytical techniques e.g. conferences, association meetings, publications.
  • Allow dedicated time and/or resources to experiment with existing and new data sources to learn dynamically and determine the next innovation.

People: create a culture to attract and retain the right people to create, maintain, update and interpret the machine.

  • Give people direction and guidelines but room to create and innovate e.g. by providing separate processes and technology environments.
  • Create a team of people with various skill sets – business, domain, technical, unicorns – but who speak a common language and have a common goal.
  • Keep the day job interesting with side projects e.g. enablement, secondment, research.

Process: implement, enforce and reinforce processes which improve productivity and question others.

  • Automate standard reports and make the others, as much as possible, self-service e.g. use interactive visualisation and Microsoft Office add-ins.
  • Give access to the right information to the people who need it e.g. common intranet site, locked down operational vs. dynamic discovery.
  • Document stages of the process in standard templates for reusability and governance.

Technology: match the right technology for the task at hand and leverage modern infrastructure.

  • Project objectives, user skills, time constraints and the format for consumption will dictate the technologies required to solve tasks e.g. exploration, operational, experimentation, integration with front-line interfaces.
  • Provide high-performance technology – machine learning, multi-threaded, in-database, in-memory, template-driven, workload-managed – to accelerate the cogs of the machine.
  • Integrate each stage of the Analytics Lifecycle with common metadata to improve seamlessness.

These basics are the first principles of a big bang “model factory” – everything else will fall into place. This machine is a living entity that will evolve over time as technology advances, analysis trends change and objectives are redirected, so it will need to be kept up to date. But by getting the basics right, the machine, at whatever stage of its and your organisation’s evolution, will form the foundation for your future intergalactic purposes.

May your “model factory” not be in a galaxy (too) far, far away.

Learn more about Machine Learning, High-Performance Analytics, Unicorns (the people kind) and try SAS Visual Statistics at the links. For those in Australia and New Zealand, keep an eye out for webinars, Hands-on Workshops and conferences throughout 2015.

[1] The Hitchhiker's Guide to the Galaxy, Douglas Adams (1979)

[2] Minority Report, Twentieth Century Fox Film Corporation and DreamWorks SKG, Dir: Steven Spielberg (2002)


Leveraging Analytics for a Successful Online Behavioural Advertising Strategy

In today’s era of digital marketing, advertisers have access to innovative tools and platforms that enable them to provide Internet users with more personalised ads. Furthermore, in exchange for such a helpful and free service, people do not mind sharing a little personal information, as it helps them find products that match their preferences and saves them time and money. To ensure customers get interesting and relevant ads when browsing for products online, advertisers leverage online behavioural advertising (OBA).

In an OBA model, advertisers harness customers’ information from mediums such as web, mobile and social media to understand their online, real-time behaviour and target them with relevant ads accordingly. By doing so, advertisers ensure their advertisements make maximum impact, which in turn increases purchases of products and services and thereby the marketer’s bottom line.

With the kind of benefits it has to offer advertisers, it is not surprising that OBA is big money, especially with companies like Facebook, Google, and Yahoo! making the most of it. To put this in perspective, consider the incident that occurred last September when Facebook had an outage for 18 minutes. Per minute, Facebook generates about $22,000 in revenue, so during the outage it possibly lost close to $400K. This amount may look minuscule when compared to Facebook’s annual revenues, but if you add in all the lost revenue from all the businesses that generate ad revenue on Facebook’s platform, a lot more than $400K was lost.
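As a back-of-envelope check of those figures:

```python
# Back-of-envelope estimate using the figures cited above:
# roughly $22,000 of revenue per minute over an 18-minute outage.
REVENUE_PER_MINUTE = 22_000  # USD, approximate figure from the article
OUTAGE_MINUTES = 18

direct_loss = REVENUE_PER_MINUTE * OUTAGE_MINUTES
print(f"Estimated direct loss: ${direct_loss:,}")  # Estimated direct loss: $396,000
```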

As platforms such as Facebook, Google, and Yahoo! have become universal content and media platforms, advertisers need to deal with the implications of convergence in the digital realm. Additionally, businesses have started capturing online and real-time data independently to target customers with relevant products. For instance, most organisations that have a mobile app now capture and track geo-location information that can be used to suggest a relevant product. This has also shifted business toward a more collaborative ecosystem in which data can be captured across internal sources like CRM and external sources like Facebook, and converged. This convergence allows one channel to leverage features and benefits offered through other channels. The convergence of multiple channels and platforms makes it imperative for advertisers to take a holistic view of their environments.

In addition to dealing with convergence and keeping track of what users click, post, like, and comment on, advertisers also need to integrate and analyse all the resultant big data and use this information to better target online users. This big data is a collection of information from an array of sources including CRM, transactions, online digital analytics, advertising offerings, and data management platforms.

Obtaining a holistic view of customers is indeed a classic big data problem, as advertisers need to capture, prepare, manage, integrate and analyse huge amounts of digital data coming in rapidly from a variety of sources. But the good news is that developments in advanced analytics, data visualization and big data processing power can make the task easier. Analysing data from multiple sources not only provides insights to advertisers on how consumers interact with brands and content, but also enables them to discover something new about the customer’s behaviour. Additionally, platforms available today make it possible to derive insights on consumers and act on them in real time for maximum impact.

As advertisers work towards understanding the customer’s behaviour and targeting with relevant ads, they should also learn to respect these new insights and the customer’s privacy. Though it is tricky and complicated, advertisers need to find the right balance between honouring people’s privacy and providing them with a relevant online experience.



The Relevance of Data Management in the Era of Big Data

Data Management has been the foundational building block supporting major business analytics initiatives from day one. Not only is it highly relevant, it is absolutely critical to the success of all business analytics projects.

Emerging big data platforms such as Hadoop and in-memory databases are disrupting traditional data architecture and the way organisations store and manage data. Furthermore, new techniques such as schema-on-read and persistent in-memory data stores are changing how organisations deliver data and drive the analytical life cycle.

This brings us to the question: how relevant is data management in the era of big data? At SAS, we believe that data management will continue to be the critical link between traditional data sources, big data platforms and powerful analytics. There is no doubt that where and how big data is stored will change and evolve over time. However, that doesn’t remove the need for big data to be subject to the same quality and control requirements as traditional data sources.

Fundamentally, big data cannot be used effectively without proper data management

Data Integration

Data has always been more valuable and powerful when it is integrated, and this remains true in the era of big data.

It is well known that whilst Hadoop is being used as a powerful storage repository for high-volume, unstructured or semi-structured information, most corporate data is still locked in traditional RDBMSs or data warehouse appliances. The true value of weblog traffic or meter data stored in Hadoop can only be unleashed when it is linked and integrated with the customer profile and transaction data stored in existing applications. The integration of high-volume, semi-structured big data with legacy transaction data can provide powerful, game-changing business insights.


Big data platforms provide an alternative source of data within an organisation’s enterprise data architecture today, and therefore must be part of the organisation’s integration capability.

Data Quality

Just because data comes from a new source and platform doesn’t mean high levels of quality and accuracy can be assumed. In fact, Hadoop data is notoriously poor in quality and structure, simply because of the lack of control over how data gets into a Hadoop environment and the ease with which it does.

Just like traditional data sources, before raw Hadoop data can be used, it needs to be profiled and analysed. Often issues such as non-standardised fields and missing data become glaringly obvious when analysts try to tap into Hadoop data sources. Automated data cleansing and enrichment capabilities within the big data environment are critical to make the data more relevant, valuable and most importantly, trustworthy.
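As a tiny illustration of that profiling-then-cleansing step (in Python, with invented records – the field names and values are hypothetical), counting missing values and standardising a field might look like:

```python
from collections import Counter

# Hypothetical raw records pulled from a Hadoop source; values may be
# missing or non-standardised (here, country codes in mixed formats).
records = [
    {"customer_id": "C001", "country": "AU", "spend": "102.50"},
    {"customer_id": "C002", "country": "Australia", "spend": None},
    {"customer_id": None, "country": "au", "spend": "88.00"},
]

# Profile first: count missing values per field before trusting the data.
missing = Counter()
for rec in records:
    for field, value in rec.items():
        if value is None:
            missing[field] += 1
print(dict(missing))  # {'spend': 1, 'customer_id': 1}

# Then standardise: map variant spellings onto one canonical code.
canonical = {"au": "AU", "australia": "AU"}
for rec in records:
    if rec["country"] is not None:
        rec["country"] = canonical.get(rec["country"].lower(), rec["country"])
```

A real cleansing pipeline would run this kind of check automatically inside the big data environment; the sketch just shows what “profile, then standardise” means in practice.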

As Hadoop gains momentum as a general-purpose data repository, there will be increasing pressure to adopt traditional data quality processes and best practices.

Data Governance

It should come as no surprise that policies and practices around data governance will need to be applied to new big data sources and platforms. The requirements of storing and managing metadata, understanding lineage and implementing data stewardship do not go away simply because the data storage mechanism has changed.

Furthermore, the unique nature of Hadoop as a highly agile and flexible data repository brings new privacy and security challenges around how data is managed, protected and shared. Data governance will play an increasingly important role in the era of big data as the need to better align IT and business increases.


Whilst the technology underpinning how organisations store their data is going through tremendous change, the need to integrate, govern and manage the data itself has not changed. If anything, the changes to the data landscape and the increase in the types and forms of data repositories will make data management more challenging than ever.

SAS recognises the challenge faced by our customers and has continued to invest in our extensive Data Management product portfolio, embracing big data platforms from leading vendors such as Cloudera and Hortonworks as well as supporting new data architectures and data management approaches.

As a recent NY Times article appropriately called out, a robust and automated data management platform within a big data environment is critical to empower data scientists and analysts so that they can be freed from “data janitor” work and focus on high-value activities.


Are you ready to graduate from spreadsheets to analytics?

Flexibility and nimbleness are synonymous with small and mid-sized businesses: they can react quickly to changing market conditions as they become aware of them. Traditionally these businesses have lived in the world of spreadsheets – and why not? Spreadsheets are easy to use, very affordable and readily available to all staff across the business. Increasingly, however, these businesses are realising there is a wealth of insights hidden within their data that, once uncovered, can offer them a first-mover advantage and the opportunity to capitalise and stay ahead of the game.

IT departments of one

Most organisations of this size run a very lean IT team, which requires finding those rare generalist professionals who can do everything from setting up computers to managing networks and dealing with internet security issues. Often these small teams do not have the bandwidth or desire to also become analytics experts. With the new generation of reporting and analytics tools, your IT team does not need any analytics or programming skills, as the creation of reports, dashboards and analyses stays in the hands of the analyst.

Football NSW is a not-for-profit organisation that looks after 208,000 registered football players across the state. It employs 57 staff, one of whom forms the entirety of its IT department – focusing primarily on desktop support. ‘Analytics’ was a meaningless term for them a year ago, before they introduced SAS® Visual Analytics to replace their spreadsheets. Their reason for turning to a full reporting and analytics tool was clear from the start – they had one question to answer:

“How do we use our data to better engage our stakeholders – whether it’s councils, government, sponsors or member clubs?”

With that goal front and centre, their small organisation is now using powerful visualisations to attract and retain participants, focusing on the three Fs – football, facilities and finance – and answering new questions with their data, such as:

  1. What do future numbers of football players look like and will councils have the facilities to cater for them?
  2. How do we provide our sponsors with information that provides value to them so they stay with us?
  3. What is the profile of our typical referee and how do we educate and retain them?

You might be thinking, yeah, but how does a football organisation relate to me and my organisation? The principle behind all of this remains the same: analytics is not just for the big guys. In fact, small and mid-sized organisations can easily use analytics to discover insights without the need for specialist skills. You don’t even need to purchase extra hardware as we enter the age of cloud analytics.

Start by dodging the buzzword bingo

Business is buzzing with terms such as ‘big data’, ‘industrial internet’ and ‘advanced analytics’. Companies are talking about needing to hire ‘data scientists’ and having ‘machine to machine’ conversations, but for most organisations the question of where to start does not involve any of these terms.

The best starting point for most businesses embarking on an analytics journey is to get back to basics by better understanding their internal data. For the average business, data is all over the place. It can be found in different applications (finance, HR etc.), some of which may be sitting in the cloud, or in dusty places such as archives, storage devices or spreadsheets buried deep within your filing systems. Identifying and bringing all of this data together into a ‘single version of the truth’ is the foundation for gaining deeper insights, more accurate reporting and improved confidence in your data. When you’re faced with this environment, it’s critical to seek a solution that not only consolidates and standardises data to build an integrated data view but also lets you tell a story that looks to the past and helps hypothesise about the future.
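As a minimal sketch of that consolidation idea (in Python, with invented sample data), merging extracts from two applications keyed on a shared customer id might look like:

```python
# Illustrative only: combine finance and CRM extracts (invented sample
# data) into one customer-level view, keyed on a shared customer id.
finance = {"C001": {"balance": 1200.0}, "C002": {"balance": 310.5}}
crm = {"C001": {"segment": "retail"}, "C003": {"segment": "wholesale"}}

combined = {}
for source in (finance, crm):
    for cust_id, fields in source.items():
        combined.setdefault(cust_id, {}).update(fields)

print(combined["C001"])  # {'balance': 1200.0, 'segment': 'retail'}
```

The hard part in practice is not the merge itself but agreeing on the shared key and standardising fields across systems, which is exactly what a ‘single version of the truth’ effort has to solve.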

You do need to start with a clear, attainable goal in mind, and it doesn’t need to include ‘saving the world’ at step one. Ensure your objective will enable you either to show value quickly (payback value) or to achieve something with widespread visibility within the business (for example, an issue that no one has been able to solve, or a way of using data to look at a falling market differently).

The world is rapidly changing. The value of managing data as an asset is now becoming a topic for most boardroom conversations. SAS Visual Analytics for the Cloud gives small to mid-market businesses the ability not only to have those exact same conversations but to act on them immediately. Analytics is no longer just for the large banks or government departments, it’s an option everyone can now capitalise on, and those who are flexible and nimble have the most to gain.


Top 20 Procs in SAS University Edition

SAS University Edition has been available for free download for six months. In that time we’ve seen 50,192,670 PROCs or DATA steps executed globally – almost 4,000 hours of execution time!

Now, we were founded on stats, so we thought we’d bring you some of the key metrics we’ve discovered over the past six months.

  • 50,192,670 PROCs/DATA steps executed
  • 13,634,561 seconds of duration for all PROCs (3,787.38 hours)
  • 40,403 unique systems have registered
  • 22,629 unique systems that have reported usage (56% of all registered)

Top 20 PROCs executed

Did your favourite PROC make it into the list? Can't see it? Ask us in the comments below – we have more data! Or you can head on over to the SAS Analytics U online community to discuss.

Obs. Name Number executed Total execution time (seconds)
1 DATA Step 34,403,229 2,699,397
2 SQL 4,482,278 709,635
3 SORT 3,533,974 412,108
4 PRINT 1,561,451 1,753,789
5 MEANS 1,129,950 176,810
6 APPEND 475,796 39,004
7 PRINTTO 473,741 31,454
8 DATASETS 433,945 89,520
9 FREQ 340,295 410,420
10 TRANSPOSE 304,311 9,314
11 ROBUSTREG 300,183 171,981
12 IMPORT 279,215 365,478
13 MIXED 226,042 763,875
14 PLOT 206,895 11,386
15 UNIVARIATE 166,502 729,890
16 REG 154,523 1,372,852
17 CONTENTS 152,598 67,762
18 SGPLOT 148,290 388,157
19 SUMMARY 121,574 14,833
20 CORR 107,992 77,494

SAS University Edition – easy to access, easy to use.

Download and install the software yourself – no need to go through convoluted channels for software distribution. And it’s portable, so it goes wherever you go. Once you download it, you don't even need an Internet connection to use it. Writing and submitting code is easy (no, really!) thanks to a powerful graphical interface that provides point-and-click access to advanced statistical analysis software, no matter what level you're at – from introductory statistics to higher-level analytics classes.

Need some tutorials or training to get started? There are over 180 Tutorials available for free at your fingertips, while the SAS Programming 1 and SAS Statistics 1 eLearning courses are also available at no charge to get you going. SAS Education also has a wide variety of training courses to further your knowledge where needed.

Already a user? Tell us your experience with SAS University Edition in the comments below.


Value from Business-Driven Stress Testing

Going Beyond Regulatory-Mandated Tests to Achieve True Risk Management

I regularly hear banking customers talk about ‘sweating their assets’ - leveraging their substantial investments in expanded teams of risk analysts, re-engineered processes and new risk systems for Basel II and III compliance – to gain better insights into their business.

In looking at the approaches taken here in Australia, I think it’s fair to say that most organisations recognise that risk management and stress testing – the latter a topic of particular customer interest this year – are critical to making informed business decisions. There is a lot of valuable data and information in risk systems that remains untapped by the broader business. On the stress testing front, most banks have only been able to focus on getting the tests across the line – doing much more has proved difficult due to the incredible effort required to coordinate the iterative testing process across the enterprise’s businesses and systems.

Business Driven Stress Testing


Since customers wrapped up stress tests earlier this year, there has been considerable discussion about improving the process through greater automation and moving beyond the regulators’ mandated tests by running additional business-driven scenarios. The goal is to apply the bank’s unique points of view on the forecast business environment – economic outlook, competitive strategy, capital-raising activities and risk appetite, for example – to better understand the trade-offs between opportunity and risk. Many finance and risk practitioners, myself included, see this as the start of a period of greater focus on measuring risk-adjusted performance and making more risk-aware decisions.

In response, several Australian banks are increasing the scope of responsibilities, seniority and overall visibility of the committees and teams responsible for stress testing. This not only satisfies the governance expectations of regulators but will also increase the value derived from the enterprise-wide planning process, as a result of higher levels of collaboration and integration across strategy, finance and risk functions.

What’s Held Banks Back?

As a testament to the limited role stress testing has played in decision making, I recently reviewed a draft report based on a survey of banks in the US and Europe which highlighted that just 24 percent of respondents acknowledged making changes to their strategic decisions as a result of stress testing.

So why haven’t banks expanded their use of stress testing sooner?

  • Maturity: Many banks are still in learning mode when it comes to stress testing. This doesn’t solely apply to banks as regulatory authorities are also refining their approach based on what we learn from conducting more tests each year.
  • Complexity: Stress testing is no easy task when you consider the number of markets, operating units, products and customer segments served by a typical bank. Getting the required input from scores of people and systems across the enterprise is often characterised as herding cats.
  • Resources: It takes an incredible amount of time, people and resources to complete a round of the mandated stress tests, leaving few resources available for what is often seen as optional business-driven testing. This is compounded by a skill shortage that is only expected to get worse.
  • Data: Systems have been built in silos over many years, and integrating the data required for stress testing has proven painful. Data quality issues compound the problem and have led APRA and global regulators to intervene with guidance and standards such as CPG 235 and BCBS 239.
  • Change: Keeping up with regulatory changes further restricts capacity to move beyond mere compliance. Banks hire staff, change systems, build capabilities and get good at delivery only to find that the requirements have changed.
  • Engagement: Getting boards and management excited about a new business-driven approach will take time. Executives have found little use for the mandated stress tests, which have tended to focus on systemic risk and overly simplistic models instead of the bank’s unique strategy, plans and economic conditions.

SAS and Stress Testing Automation

Anyone who has spoken with me about stress testing will know that I get excited about sharing how customers are using SAS stress testing capabilities as a modern management tool. We excel in this space and have enjoyed public recognition of our solutions – most recently by AITE, an independent research and advisory firm known for its finance and risk systems expertise. In a crowded field of well-known vendors, AITE rated SAS a stress testing leader and particularly recommended SAS for “banks that want to introduce as much automation to the process as possible and aggressively seek analytic benefits that go well beyond compliance.”

SAS approaches stress testing by providing software and services that help customers across three key areas:

  • Data Management: Our risk-specific data model and datamart ensure consistent use of data and scenarios as a foundational source of truth across banking and trading books. Transaction-level data, bank data and third-party data are captured and integrated for modelling across credit risk, market risk, and regulatory and economic capital.
  • Modelling: SAS Risk Dimensions, a centralized risk engine, ensures that factor analysis, model execution and outputs are captured in a single location. The engine performs stress tests, including Monte Carlo derived reverse stress tests, at portfolio and enterprise wide levels.
  • Reporting: SAS provides a wide variety of capabilities for aggregating data and producing consolidated, reconciled reporting and analytics at any level of granularity. Customers regularly highlight the superior auditability of SAS reports and the extensive documentation of all changes to critical assets such as data, models and scenarios.
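The Monte Carlo approach mentioned above can be illustrated in miniature. The sketch below (in Python, purely for illustration – it is not SAS Risk Dimensions) simulates portfolio losses under a baseline and a stressed default rate and compares average and tail losses; every parameter is invented.

```python
import random
import statistics

# A minimal Monte Carlo stress-testing sketch: simulate portfolio losses
# under a baseline and a stressed default rate, then compare mean and
# tail losses. All parameters are illustrative, not calibrated.
random.seed(42)  # fixed seed for reproducibility

def simulate_losses(default_prob, loss_given_default=100.0,
                    n_loans=500, n_trials=2000):
    """Return total portfolio loss for each Monte Carlo trial."""
    losses = []
    for _ in range(n_trials):
        defaults = sum(random.random() < default_prob for _ in range(n_loans))
        losses.append(defaults * loss_given_default)
    return losses

baseline = simulate_losses(default_prob=0.02)
stressed = simulate_losses(default_prob=0.06)  # hypothetical downturn scenario

def var_99(losses):
    """99th-percentile loss: a simple value-at-risk style tail measure."""
    return sorted(losses)[int(0.99 * len(losses))]

print(f"Baseline: mean {statistics.mean(baseline):,.0f}, 99% VaR {var_99(baseline):,.0f}")
print(f"Stressed: mean {statistics.mean(stressed):,.0f}, 99% VaR {var_99(stressed):,.0f}")
```

A real stress test layers correlated risk factors, scenario-driven macro inputs and enterprise-wide aggregation on top of this basic simulate-and-summarise loop, which is why automation matters so much.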

With the right data and enabling software, SAS customers can simulate different environmental conditions to understand their effect on the bank’s financial position. Armed with insights from these tests, they can better deliver sustainable profitability and growth, even in challenging business conditions. The focus on business-driven scenarios couldn’t be timelier as the Australian industry anticipates a squeeze on performance due to higher capital requirements and slower system growth, coupled with the accompanying need for banks to further differentiate themselves from competitors.

Learn More

Stress testing for compliance is important and should be completed as efficiently as possible – but it’s not sufficient for true risk management. Banks need business-specific stress tests to make informed business decisions and to successfully walk the tightrope between opportunity and risk. This means extending stress testing processes beyond mere compliance and taking an enterprise approach to risk management.

With SAS, banks automate data management, repetitive tasks and compliance work to free highly skilled team members to focus on the creative and engaging work of strategy, planning and execution. Adoption of this approach will grow as the banking industry increases its focus on risk-adjusted performance and risk-influenced decision making.

Learn more about how banks are using comprehensive, enterprise-wide stress testing to enable better risk and profitability management in the SAS paper “Comprehensive Stress Testing: A Regulatory and Economic Perspective.”

Download the paper at: http://www.sas.com/en_us/whitepapers/comprehensive-stress-testing-106958.html


Three Things You Should Know About SAS and Hadoop

I have been on a whirlwind tour locally here in Australia, visiting existing SAS customers where the focus of discussions has centered on SAS and Hadoop. I am happy to report that during these discussions, customers have been consistently surprised and excited about what we are doing with SAS on Hadoop! Three things in particular stood out and resonated so well with our wonderful SAS user community that I thought I would share them here for the benefit of the broader community.

1. All SAS products are Hadoop enabled today

Whilst some of our newer products, such as Visual Analytics and In-Memory Statistics for Hadoop, were built from day one with Hadoop in mind, you might not be aware that in fact all of our current SAS products are Hadoop enabled and can take advantage of Hadoop today.

Our mature and robust SAS/Access interface to Hadoop technology allows SAS users to easily connect to Hadoop data sources from any SAS application. A key point here is being able to do this without having to understand any of the underlying technology or write a single line of MapReduce code. Furthermore, the SAS/Access interface for Hadoop has been optimised and can push SAS procedures into Hadoop for execution, thereby allowing developers to tap into the power of Hadoop and improve the performance of basic SAS operations.

2. SAS does Analytics in Hadoop

The SAS R&D team have worked extremely hard with our Hadoop distribution partners to take full advantage of the powerful technologies within the Hadoop ecosystem. We are driving integration deep into the heart of the Hadoop ecosystem with technologies such as HDFS, Hive, MapReduce, Pig and YARN.

The SAS users I have been speaking to have been pleasantly surprised by the depth of our integration with Hadoop and excited about what it means for them as end users. Whether it’s running analytics in our high-performance in-memory servers within a Hadoop cluster or pushing analytics workloads deep into the Hadoop environment, SAS gives users the power and flexibility to decide where and how they want to run their SAS workloads.

This point was made powerfully by none other than a co-founder of Hortonworks in his recent blog post, and I couldn’t have phrased it better myself!

“Integrating SAS HPA and LASR with Apache Hadoop YARN provides tremendous benefits to customers using SAS products and Hadoop. It is a great example of the tremendous openness and vision shown by SAS”

3. Organisations are benefiting from SAS on Hadoop today

With Hadoop being THE new kid on the block, you might be wondering whether any customers are already taking advantage of SAS and Hadoop. One such customer is Rogers Media – they have been doing some pretty cool stuff with SAS and Hadoop to drive real business value and outcomes!

In a chat with Dr. Goodnight during SAS Global Forum this year, Chris Dingle from Rogers Media shared how they are using SAS and Hadoop to better understand their audience. I was fortunate enough to be there in person, and I must say the keynote session on Hadoop and Rogers Media was a highlight for many attendees and definitely got the masses thinking about what they should be doing with SAS and Hadoop. For those of you who are interested in more details, here is a recap of the presentation explaining the SAS/Hortonworks integration as well as more details on the Rogers Media case study.

We are working with a number of organisations around the world on exciting SAS on Hadoop projects so watch this space!

All in all, it’s a great time to be a SAS user, and it has never been easier to take advantage of the power of Hadoop. I encourage you to find out more, reach out to us or leave a comment here – we would love to hear how you plan to leverage the power of SAS and Hadoop!


The Importance of Being a Data Scientist

This post is a nod to one of my favourite plays, The Importance of Being Earnest by Oscar Wilde. As the title ‘Data Scientist’ becomes more common, what can we learn about the importance of titles and labels from this century-old play?

For those who haven't read it, the story revolves around a man who goes by the name Earnest and has a reputation for being earnest (i.e. truthful and trustworthy). He is much loved by a lady for having the name Earnest - she has always wanted to marry a man with that name, believing men named Earnest are earnest (deep breath).

Well, it turns out his name isn't really Earnest (irony 1: he's not actually earnest, despite his name) and the lady considers dumping him, but by a comic twist of fate it emerges that his name actually is Earnest (irony 2: he was earnest all along, even though he didn't think he was).

So what is the importance of being <insert a name or title>? As another wit once said "a rose by any other name would smell as sweet". But is that true? Our experiences have probably told us "No". Despite whatever skills we may have, a title comes with a reputation and expectations. Whether that’s someone named Earnest actually being earnest … or a Data Scientist being a magician with big data.

There have been many attempts at explaining what a data scientist is since the term was first coined in 2008 – Wikipedia, HBR, KDnuggets, Marketing Distillery – but the general definition is someone who combines equally high skill levels in:

a. Statistics.
b. “Hacker” programming.
c. Communication.
d. Business.

How many people actually satisfy this definition is a popular subject in discussion forums and papers, and it’s interesting to ask from what perspective the attributes are judged (good communication skills for a marketer look different from good communication skills for a programmer). But everyone agrees that Data Analysts, Data Miners, Statisticians, Econometricians and the myriad other titles of the last 50 years all have these attributes, in varying proportions.

I've met many analytics practitioners over the years from different parts of the world. Some quantitative analysts have chosen to change their titles to Data Scientist to make them more attractive to employers as the Statistician and Data Miner titles go out of favour. Some, who are very close to the purist definition of data scientist, may not title themselves as such, adamantly sticking with the title they have had for many years.

On the other hand, in many cases, employers who advertise for Data Scientists are actually looking for:

  • Quantitative analysts with innate curiosity to learn and innovate – a trait of most people from mathematics, sciences, engineering and economics disciplines.
  • Candidates who meet some minimum criteria in the four attributes – many of which can be taught.
  • Those who are strong in a subset of prioritized attributes to suit a function within a team.

Over the years, given the right drivers, these partially defined Data Scientists could become strictly defined Data Scientists – but in a collaborative team environment, you will likely find that having a whole team of such individuals is not necessary. The two realities are that far fewer organizations look for strictly defined Data Scientists than for the partially defined kind – and those that do tend to be commercial research and development arms – and that individuals who embody all the attributes of a Data Scientist in the “right” amounts are rare.

Therefore, for those looking to hire Data Scientists, my advice is:

  • It’s much more important to first start with understanding the functions and expectations of the team within an organisation.
  • Then create roles that fit the needs of that team and “be much more specific about the type of worker you want to be or hire” (Tom Davenport, Wall Street Journal).
  • Be realistic about the current skills in the market and the tertiary education programs available.

For those looking for a role as a Data Scientist:

  • Start developing the attributes you are weakest at through classroom and self-service training, because all-round skills are always sought after.
  • Keep developing the attributes you already excel at as the big data analytics market is constantly evolving.
  • Stay curious about new techniques and worldwide trends.

So, the importance of being a Data Scientist is to be more attractive to employers – but what employers are usually looking for is some flavour thereof, rather than a strictly defined set of criteria. Even though a prospect may not meet the purest definition of a Data Scientist, he or she may turn out to be just what an organization needs. Therefore, be sure of what you want to be and what you’re looking for, and don't judge prospects and opportunities based on a title… lest you end up dumping your Earnest before he becomes an Earnest!

Learn more. Stay curious.



Taking advantage of the analytics opportunity

The rise of analytics and big data presents a once-in-a-generation opportunity for organisations to put themselves at the cutting edge, to create a competitive advantage by developing a culture of analytical success within their organisation. Yet most seem unable to grasp the opportunity that is within their reach.

Eric Hoffer wrote that "In times of change, learners inherit the Earth, while the learned find themselves beautifully equipped to deal with a world that no longer exists." So why in this time of rapid change are our organisations failing to invest in educating their most important asset - their employees?

Organisations should be investing time and money into programs and courses to supercharge their staff for success with analytics in business.

With the recent advances in analytics programs and training, many staff will not yet have been exposed to the latest education. Even those with relevant education may not have had the chance to complement their knowledge with industry experience; as a result, they may be missing essential domain knowledge and communication skills.

Three things you can do to invest in your organisation’s biggest asset and secure a competitive advantage:

  1. Involve staff in classroom training courses and workshops that provide the skills and experience needed to succeed in a rapidly changing environment. Couple this with professional accreditations where they are available.
  2. If struggling for initial budget, access complimentary training to demonstrate the value created. For example, at SAS we have made our Programming 1 and Statistics 1 eLearning courses available online at no cost. Showing the ROI from investing time to develop skills can build support for investment in further education.
  3. Look out for the full university degree programs being developed around Data Science, Business Analytics and Advanced Analytics. Integrate these into individuals’ professional development plans.

Interested in learning more about education programs? Please leave a comment below or tweet me @james_enoch


It doesn’t matter how many resources you have. If you don’t know how to use them, it will never be enough.

  • About this blog

    The Asia Pacific region provides a unique set of challenges (ความท้าทาย, cabaran) and opportunities (peluang, 机会). Our diverse culture, rapid technology adoption and positive market has our region poised for great things. One thing we have in common with the rest of the world is the need to be globally competitive while staying locally relevant. On this blog, our key regional thought leaders provide an Asia Pacific perspective on doing business, using analytics to be more effective, and life left of the date line.