The Relevance of Data Management in the Era of Big Data

Data Management has been the foundational building block supporting major business analytics initiatives from day one. Not only is it highly relevant, it is absolutely critical to the success of every business analytics project.

Emerging big data platforms such as Hadoop and in-memory databases are disrupting traditional data architecture and the way organisations store and manage data. Furthermore, new techniques such as schema-on-read and persistent in-memory data stores are changing how organisations deliver data and drive the analytical life cycle.

This brings us to the question: how relevant is data management in the era of big data? At SAS, we believe that data management will continue to be the critical link between traditional data sources, big data platforms and powerful analytics. There is no doubt that where and how big data is stored will change and evolve over time. However, that doesn’t remove the need for big data to be subject to the same quality and control requirements as traditional data sources.

Fundamentally, big data cannot be used effectively without proper data management.

Data Integration

Data has always been more valuable and powerful when it is integrated, and this will remain true in the era of big data.

It is well known that whilst Hadoop is being used as a powerful storage repository for high-volume, unstructured or semi-structured information, most corporate data is still locked in traditional RDBMSs or data warehouse appliances. The true value of weblog traffic or meter data stored in Hadoop can only be unleashed when it is linked and integrated with the customer profile and transaction data stored in existing applications. Integrating high-volume, semi-structured big data with legacy transaction data can provide game-changing business insights.


Big data platforms provide an alternative source of data within an organisation’s enterprise data architecture today, and therefore must be part of the organisation’s integration capability.

Data Quality

Just because data comes from a new source or platform doesn’t mean high levels of quality and accuracy can be assumed. In fact, Hadoop data is notoriously poor in quality and structure, simply because of the lack of control over, and the ease with which, data can enter a Hadoop environment.

Just like traditional data sources, before raw Hadoop data can be used, it needs to be profiled and analysed. Often issues such as non-standardised fields and missing data become glaringly obvious when analysts try to tap into Hadoop data sources. Automated data cleansing and enrichment capabilities within the big data environment are critical to make the data more relevant, valuable and most importantly, trustworthy.
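
To make that profiling step concrete, here is a minimal sketch in Python – purely illustrative, not a SAS data quality tool, and the field values are made up – showing how frequency profiling exposes non-standardised fields and missing data, and how a simple cleansing rule repairs them:

```python
from collections import Counter

# A raw 'state' field as it might land in Hadoop -- values are invented.
raw_states = ["NSW", "nsw", "N.S.W.", "VIC", "Vic", "", "QLD", None, "NSW"]

# Frequency profile: exposes casing/punctuation variants and missing values.
profile = Counter("<missing>" if v in (None, "") else v for v in raw_states)
print(profile)

def standardise(value):
    """Minimal cleansing rule: uppercase, strip punctuation, map blanks to None."""
    if value in (None, ""):
        return None
    return value.upper().replace(".", "")

# After cleansing, the NSW variants collapse to a single value.
cleaned = Counter(standardise(v) for v in raw_states)
print(cleaned)
```

A real cleansing flow would also handle enrichment and survivorship rules, but the profile-then-standardise pattern is the same.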

As Hadoop gains momentum as a general-purpose data repository, there will be increasing pressure to adopt traditional data quality processes and best practices.

Data Governance

It should come as no surprise that policies and practices around data governance will need to be applied to new big data sources and platforms. The requirements of storing and managing metadata, understanding lineage and implementing data stewardship do not go away simply because the data storage mechanism has changed.

Furthermore, the unique nature of Hadoop as a highly agile and flexible data repository brings new privacy and security challenges around how data is managed, protected and shared. Data governance will play an increasingly important role in the era of big data as the need to better align IT and business increases.


Whilst the technology underpinning how organisations store their data is going through tremendous change, the need to integrate, govern and manage the data itself has not changed. If anything, the changes to the data landscape and the proliferation of types and forms of data repositories will make data management more challenging than ever.

SAS recognises the challenge faced by our customers and has continued to invest in our extensive Data Management product portfolio, embracing big data platforms from leading vendors such as Cloudera and Hortonworks as well as supporting new data architectures and data management approaches.

As this recent NY Times article aptly called out, a robust and automated data management platform within a big data environment is critical to empowering data scientists and analysts, freeing them from “data janitor” work so they can focus on high-value activities.


Are you ready to graduate from spreadsheets to analytics?

Flexibility and nimbleness are terms synonymous with small and mid-sized businesses. They have the ability to react quickly to changing market conditions as they become aware of them. Traditionally these businesses have lived in the world of spreadsheets – and why not? Spreadsheets are easy to use, very affordable and readily available to all staff across the business. Increasingly, however, these businesses are realising there is a wealth of insights hidden within their data that, once uncovered, can offer them a first-mover advantage and the opportunity to capitalise and stay ahead of the game.

IT departments of one

Most organisations of this size run a very lean IT team, which requires finding those rare generalist professionals who can do everything from setting up computers to managing networks and dealing with internet security issues. Often these small teams do not have the bandwidth or desire to also become analytics experts. With the new generation of reporting and analytics tools, your IT team does not need any analytics or programming skills, as the creation of reports, dashboards and analytics stays in the hands of the analyst.

Football NSW is a not-for-profit organisation that looks after 208,000 registered football players across the state. It employs 57 staff, one of whom forms the entirety of its IT department, focusing primarily on desktop support. ‘Analytics’ was a meaningless term for them a year ago, before they introduced SAS® Visual Analytics to replace their spreadsheets. Their reason for turning to a full reporting and analytics tool was clear from the start – they had one question to answer:

“How do we use our data to better engage our stakeholders – whether it’s councils, government, sponsors or member clubs?”

With that goal front and centre, their small organisation is now using powerful visualisations to attract and retain participants, focusing on the three Fs – football, facilities and finance – and answering new questions with their data, such as:

  1. What do future numbers of football players look like and will councils have the facilities to cater for them?
  2. How do we provide our sponsors with information that provides value to them so they stay with us?
  3. What is the profile of our typical referee and how do we educate and retain them?

You might be thinking: how does a football organisation relate to me and my organisation? The principle remains the same – analytics is not just for the big guys. Small and mid-sized organisations can easily use analytics to discover insights without the need for specialist skills. You don’t even need to purchase extra hardware as we enter the age of cloud analytics.

Start by dodging the buzzword bingo

Business is buzzing with terms such as ‘big data’, ‘industrial internet’ and ‘advanced analytics’. Companies are talking about needing to hire ‘data scientists’ and having ‘machine to machine’ conversations, but for most organisations the question of where to start does not involve any of these terms.

The best starting point for most businesses embarking on an analytics journey is to get back to basics by better understanding their internal data. For the average business, data is all over the place. It can be found in different applications (finance, HR, etc.), some of which may be sitting in the cloud, or in dusty places such as archives, storage devices or spreadsheets buried deep within your filing systems. Identifying and bringing all of this data together in a ‘single version of the truth’ is the foundation for deeper insights, more accurate reporting and improved confidence in your data. When you’re faced with this environment, it’s critical to seek a solution that not only consolidates and standardises data to build an integrated data view, but also lets you tell a story that looks to the past and helps hypothesise about the future.

You do need to start with a clear, attainable goal in mind, and it doesn’t need to include ‘saving the world’ at step one. Ensure your objective will enable you to either show value quickly (payback value) or achieve something with widespread visibility within the business (an issue that no one has been able to solve, or a way of using data to look at a falling market in a different way, for example).

The world is rapidly changing. The value of managing data as an asset is now becoming a topic for most boardroom conversations. SAS Visual Analytics for the Cloud gives small to mid-market businesses the ability not only to have those exact same conversations but to act on them immediately. Analytics is no longer just for the large banks or government departments, it’s an option everyone can now capitalise on, and those who are flexible and nimble have the most to gain.


Top 20 Procs in SAS University Edition

SAS University Edition has been available for free download for six months – in that time we’ve seen 50,192,670 PROCs or DATA steps executed globally, amounting to almost 4,000 hours of execution time!

Now, we were founded on stats so we thought we’d bring you some of the key metrics we’ve discovered over the past six months.

  • 50,192,670 PROCs/DATA steps executed
  • 13,634,561 seconds of duration for all PROCs (3,787.38 hours)
  • 40,403 unique systems have registered
  • 22,629 unique systems that have reported usage (56% of all registered)
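
The headline figures are internally consistent – a quick back-of-the-envelope check in plain Python, just reproducing the numbers above:

```python
# Converting the total PROC duration from seconds to hours, and the
# share of registered systems that have reported usage.
total_seconds = 13_634_561
print(f"{total_seconds / 3600:,.2f} hours")  # matches the 3,787.38 hours quoted

registered, reporting = 40_403, 22_629
print(f"{reporting / registered:.0%} of registered systems reported usage")  # 56%
```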

Top 20 PROCs executed

Did your favourite PROC make it into the list? Can't see it? Ask us in the comments below – we have more data! Or you can head on over to the SAS Analytics U online community to discuss.

Rank  Name        Number executed  Time spent executing (seconds)
   1  DATA Step        34,403,229                       2,699,397
   2  SQL               4,482,278                         709,635
   3  SORT              3,533,974                         412,108
   4  PRINT             1,561,451                       1,753,789
   5  MEANS             1,129,950                         176,810
   6  APPEND              475,796                          39,004
   7  PRINTTO             473,741                          31,454
   8  DATASETS            433,945                          89,520
   9  FREQ                340,295                         410,420
  10  TRANSPOSE           304,311                           9,314
  11  ROBUSTREG           300,183                         171,981
  12  IMPORT              279,215                         365,478
  13  MIXED               226,042                         763,875
  14  PLOT                206,895                          11,386
  15  UNIVARIATE          166,502                         729,890
  16  REG                 154,523                       1,372,852
  17  CONTENTS            152,598                          67,762
  18  SGPLOT              148,290                         388,157
  19  SUMMARY             121,574                          14,833
  20  CORR                107,992                          77,494

SAS University Edition – easy to access, easy to use.

Download and install the software yourself – no need to go through convoluted channels for software distribution. And it’s portable, so it goes wherever you go. Once you download it, you don't even need an Internet connection to use it. Writing and submitting code is easy (no, really!) thanks to a powerful graphical interface that provides point-and-click access to advanced statistical analysis software, no matter what level you're at – from introductory statistics to higher-level analytics classes.

Need some tutorials or training to get started? There are over 180 Tutorials available for free at your fingertips, while the SAS Programming 1 and SAS Statistics 1 eLearning courses are also available at no charge to get you going. SAS Education also has a wide variety of training courses to further your knowledge where needed.

Already a user? Tell us your experience with SAS University Edition in the comments below.


Value from Business-Driven Stress Testing

Going Beyond Regulatory-Mandated Tests to Achieve True Risk Management

I regularly hear banking customers talk about ‘sweating their assets’ – leveraging their substantial investments in expanded teams of risk analysts, re-engineered processes and new risk systems for Basel II and III compliance – to gain better insights into their business.

In looking at the approaches taken here in Australia, I think it’s fair to say that most organisations recognise that risk management and stress testing – the latter a topic of particular customer interest this year – are critical to making informed business decisions. There is a lot of valuable data and information available in risk systems that remains untapped by the broader business. On the stress testing front, most banks have only been able to focus on getting the mandated tests across the line – doing much more has proved difficult due to the incredible effort required to coordinate the iterative process of testing across the enterprise’s businesses and systems.

Business Driven Stress Testing

After customers wrapped up stress tests earlier this year, there has been considerable discussion about improving the process through greater automation and moving beyond the regulators’ mandated tests by running additional business-driven scenarios. The goal is to apply the bank’s unique point of view on the forecasted business environment – economic outlook, competitive strategy, capital-raising activities and risk appetite, for example – to better understand the trade-offs between opportunity and risk. Many finance and risk practitioners, including myself, see this as the start of a period of greater focus on measuring risk-adjusted performance and making more risk-aware decisions.

In response, several Australian banks are increasing the scope of responsibilities, seniority and overall visibility of the committees and teams responsible for stress testing. This not only satisfies the governance expectations of regulators but will also increase the value derived from the enterprise-wide planning process as a result of higher levels of collaboration and integration across strategy, finance and risk functions.

What’s Held Banks Back?

As a testament to the limited role stress testing has played in decision making, I recently reviewed a draft report based on a survey of banks in the US and Europe which highlighted that just 24 percent of respondents acknowledged making changes to their strategic decisions as a result of stress testing.

So why haven’t banks expanded their use of stress testing sooner?

  • Maturity: Many banks are still in learning mode when it comes to stress testing. This doesn’t apply solely to banks, as regulatory authorities are also refining their approach based on what is learned from conducting more tests each year.
  • Complexity: Stress testing is no easy task when you consider the number of markets, operating units, products and customer segments served by a typical bank. Getting the required input from scores of people and systems across the enterprise is often characterised as herding cats.
  • Resources: It takes an incredible amount of time, people and resources to complete a round of the mandated stress tests, leaving few resources available for what is often seen as optional business-driven testing. This is compounded by a skill shortage that is only expected to get worse.
  • Data: Systems have been built in silos over many years, and integration of the data required for stress testing has proven painful. Data quality issues compound the problem and have led APRA and global regulators to intervene with guidance and standards such as CPG 235 and BCBS 239.
  • Change: Keeping up with regulatory changes further restricts capacity to move beyond mere compliance. Banks hire staff, change systems, build capabilities and get good at delivery only to find that the requirements have changed.
  • Engagement: Getting boards and management excited about a new business-driven approach will take time. Executives have found little use for the mandated stress tests, which tend to focus on systemic risk and overly simplistic models rather than the bank’s unique strategy, plans and economic conditions.

SAS and Stress Testing Automation

Anyone who has spoken with me about stress testing will know that I get excited about sharing how customers are using SAS stress testing capabilities as a modern management tool. We excel in this space and have enjoyed public recognition of our solutions – most recently by AITE, an independent research and advisory firm known for its finance and risk systems expertise. In a crowded field of well-known vendors, AITE rated SAS a stress testing leader and particularly recommended SAS for “banks that want to introduce as much automation to the process as possible and aggressively seek analytic benefits that go well beyond compliance.”

SAS approaches stress testing by providing software and services that help customers across three key areas:

  • Data Management: Our risk-specific data model and datamart ensure consistent use of data and scenarios as a foundational source of truth across banking and trading books. Transaction-level data, bank data and third-party data are captured and integrated for modelling across credit risk, market risk, and regulatory and economic capital.
  • Modelling: SAS Risk Dimensions, a centralised risk engine, ensures that factor analysis, model execution and outputs are captured in a single location. The engine performs stress tests, including Monte Carlo derived reverse stress tests, at portfolio and enterprise-wide levels.
  • Reporting: SAS provides a wide variety of capabilities for aggregating and producing consolidated, reconciled reporting and analytics at any level of granularity. Customers regularly highlight the superior auditability of SAS reports and the extensive documentation of all changes to critical assets such as data, models and scenarios.
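
To give a flavour of what a scenario run involves, here is a toy Monte Carlo stress test in Python. Everything in it – the factor names, coefficients and loss function – is invented and vastly simplified compared with a real risk engine such as SAS Risk Dimensions; it only illustrates the core idea of comparing a loss distribution under a baseline versus a stressed scenario:

```python
import random
import statistics

def simulate_losses(n_trials, gdp_shock, rate_shock, seed=42):
    """Toy Monte Carlo: portfolio loss as a linear function of two
    stressed macro factors plus noise. All coefficients are invented."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_trials):
        gdp = rng.gauss(gdp_shock, 1.0)     # GDP growth around the scenario, %
        rates = rng.gauss(rate_shock, 0.5)  # rate move around the scenario, %
        # loss in $m: weaker GDP and higher rates both increase losses
        losses.append(100 - 8.0 * gdp + 5.0 * rates + rng.gauss(0, 10))
    return losses

baseline = simulate_losses(10_000, gdp_shock=2.5, rate_shock=0.0)
stressed = simulate_losses(10_000, gdp_shock=-1.0, rate_shock=2.0)

print(f"baseline mean loss: {statistics.mean(baseline):.1f}")
print(f"stressed mean loss: {statistics.mean(stressed):.1f}")
print(f"stressed 99th percentile loss: {sorted(stressed)[9_899]:.1f}")
```

A production engine evaluates full valuation models over thousands of correlated factor paths, but the comparison of baseline versus stressed loss distributions works the same way.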

With the right data and enabling software, SAS customers can simulate different environmental conditions to understand their effect on the financial position of their bank. Armed with insights from these tests, they can better deliver sustainable profitability and growth, even in challenging business conditions. The focus on business-driven scenarios couldn’t be timelier as the Australian industry anticipates a squeeze on performance due to higher capital requirements and slower system growth, coupled with the accompanying need for banks to further differentiate themselves from competitors.

Learn More

Stress testing for compliance is important and should be completed as efficiently as possible – but it’s not sufficient for true risk management. Banks need business-specific stress tests to make informed business decisions and to successfully walk the tightrope between opportunity and risk. This means extending stress testing processes beyond mere compliance and taking an enterprise approach to risk management.

With SAS, banks automate data management, repetitive tasks and compliance work to free highly skilled team members to focus on the creative and engaging work of strategy, planning and execution. Adoption of this approach will continue as the banking industry continues to increase focus on risk-adjusted performance and risk-influenced decision making.

Learn more about how banks are using comprehensive, enterprise wide stress testing to enable better risk and profitability management in the SAS paper “Comprehensive Stress Testing: A Regulatory and Economic Perspective.”

Download the paper at:


Three Things You Should Know About SAS and Hadoop

I have been on a whirlwind tour locally here in Australia, visiting existing SAS customers for discussions centred around SAS and Hadoop. I am happy to report that during these discussions, customers have been consistently surprised and excited about what we are doing with SAS on Hadoop! Three things in particular stood out and resonated with our wonderful SAS user community, so I thought I’d share them here for the benefit of the broader community.

1. All SAS products are Hadoop enabled today

Whilst some of our newer products such as Visual Analytics and In-Memory Statistics for Hadoop were built from day one with Hadoop in mind, you might not be aware that in fact all of our current SAS products have been Hadoop enabled and can take advantage of Hadoop today.

Our mature and robust SAS/ACCESS Interface to Hadoop technology allows SAS users to easily connect to Hadoop data sources from any SAS application. A key point here is being able to do this without having to understand any of the underlying technology or write a single line of MapReduce code. Furthermore, the SAS/ACCESS Interface to Hadoop has been optimised to push SAS procedures into Hadoop for execution, allowing developers to tap into the power of Hadoop and improve the performance of basic SAS operations.

2. SAS does Analytics in Hadoop

The SAS R&D team have worked extremely hard with our Hadoop distribution partners to take full advantage of the powerful technologies within the Hadoop ecosystem. We are driving integration deep into the heart of the Hadoop ecosystem with technologies such as HDFS, Hive, MapReduce, Pig and YARN.

The SAS users I have been speaking to have been pleasantly surprised by the depth of our integration with Hadoop and excited about what it means for them as end users. Whether it’s running analytics in our high performance in-memory servers within a Hadoop cluster or pushing analytics workload deep into the Hadoop environment, SAS is giving users the power and flexibility in deciding where and how they want to run their SAS workloads.

This point was made powerfully by none other than one of the co-founders of Hortonworks in his recent blog post, and I couldn’t have phrased it better myself:

“Integrating SAS HPA and LASR with Apache Hadoop YARN provides tremendous benefits to customers using SAS products and Hadoop. It is a great example of the tremendous openness and vision shown by SAS”

3. Organisations are benefiting from SAS on Hadoop today

With Hadoop being THE new kid on the block, you might be wondering whether any customers are already taking advantage of SAS and Hadoop. One such customer is Rogers Media – they’ve been doing some pretty cool stuff with SAS and Hadoop to drive real business value and outcomes!

In a chat with Dr. Goodnight during SAS Global Forum this year, Chris Dingle from Rogers Media shared how they are using SAS and Hadoop to better understand their audience. I was fortunate enough to be there in person, and the keynote session on Hadoop and Rogers Media was a highlight for many attendees – it definitely got the masses thinking about what they should be doing with SAS and Hadoop. For those of you who are interested, here is a recap of the presentation explaining the SAS/Hortonworks integration as well as more details on the Rogers Media case study.

We are working with a number of organisations around the world on exciting SAS on Hadoop projects so watch this space!

All in all, it’s a great time to be a SAS user, and it has never been easier to take advantage of the power of Hadoop. I encourage you to find out more – reach out to us or leave a comment here, as we would love to hear how you plan to leverage the power of SAS and Hadoop!


The Importance of Being a Data Scientist

This post is a nod to one of my favourite plays, The Importance of Being Earnest by Oscar Wilde. As the title ‘Data Scientist’ becomes more common, what can this century-old play teach us about the importance of titles and labels?

For those who haven’t read it, the story revolves around a man who goes by the name Earnest and has a reputation for being earnest (i.e. truthful and trustworthy). He is much loved by a lady for having the name Earnest – she has always wanted to marry a man with that name, believing men named Earnest are earnest (deep breath).

Well, it turns out his name isn’t really Earnest (irony 1 – he’s not actually earnest despite his name) and the lady considers dumping him, but by a comic twist of fate it turns out that his name actually is Earnest (irony 2 – he actually was earnest even though he didn’t think he was).

So what is the importance of being <insert a name or title>? As another wit once said "a rose by any other name would smell as sweet". But is that true? Our experiences have probably told us "No". Despite whatever skills we may have, a title comes with a reputation and expectations. Whether that’s someone named Earnest actually being earnest … or a Data Scientist being a magician with big data.

There have been many attempts at explaining what a data scientist is since the term was first coined in 2008 – Wikipedia, HBR, KDnuggets, Marketing Distillery – but the general definition is someone who encompasses equally high skill levels in:

a. Statistics.
b. “Hacker” programming.
c. Communication.
d. Business.

The number of people who actually satisfy this definition is a popular subject in discussion forums and papers, and it’s also interesting to ask from what perspective the attributes are judged (good communication skills for a marketer look different from good communication skills for a programmer). But everyone agrees that the many Data Analysts, Data Miners, Statisticians, Econometricians and holders of myriad other titles over the last 50 years all have these attributes, in varying proportions.

I've met many analytics practitioners over the years from different parts of the world. Some quantitative analysts have chosen to change their titles to Data Scientist to make them more attractive to employers as the Statistician and Data Miner titles go out of favour. Some, who are very close to the purist definition of data scientist, may not title themselves as such, adamantly sticking with the title they have had for many years.

On the other hand, in many cases, employers who advertise for Data Scientists are actually looking for:

  • Quantitative analysts with innate curiosity to learn and innovate – a trait of most people from mathematics, sciences, engineering and economics disciplines.
  • Candidates who meet some minimum criteria in the four attributes – many of which can be taught.
  • Those who are strong in a subset of prioritized attributes to suit a function within a team.

Over the years, given the right drivers, these partially defined Data Scientists could become strictly defined Data Scientists – but in a collaborative team environment, you will likely find that having a whole team of such individuals is not important. The two realities are that organisations looking for the latter are far rarer than those looking for the former, and tend to be commercial research and development arms; and individuals who embody all the attributes of a Data Scientist in the “right” amounts are rare.

Therefore, for those looking to hire Data Scientists, my advice is:

  • It’s much more important to first start with understanding the functions and expectations of the team within an organisation.
  • Then create roles that fit the needs of that team and “be much more specific about the type of worker you want to be or hire” (Tom Davenport, Wall Street Journal).
  • Be realistic of current skills in the market and tertiary education programs available.

For those looking for a role as a Data Scientist:

  • Start developing the attributes you are weakest at through classroom and self-service training because all round skills are always sought after.
  • Keep developing the attributes you already excel at as the big data analytics market is constantly evolving.
  • Stay curious of new techniques and worldwide trends.

So, the importance of being a Data Scientist is to be more attractive to employers – but what employers are usually looking for is some flavour thereof, rather than a strictly defined set of criteria. Even though a prospect may not fit the purest definition of a Data Scientist, he or she may turn out to be just what an organisation needs. Therefore, be sure of what you want to be and what you’re looking for, and don’t judge prospects and opportunities based on a title… lest you end up dumping your Earnest before he becomes an Earnest!

Learn more. Stay curious.



Taking advantage of the analytics opportunity

The rise of analytics and big data presents a once-in-a-generation opportunity for organisations to put themselves at the cutting edge, to create a competitive advantage by developing a culture of analytical success within their organisation. Yet most seem unable to grasp the opportunity that is within their reach.

Eric Hoffer wrote that "In times of change, learners inherit the Earth, while the learned find themselves beautifully equipped to deal with a world that no longer exists." So why in this time of rapid change are our organisations failing to invest in educating their most important asset - their employees?

Organisations should be investing time and money into programs and courses to supercharge their staff for success with analytics in business.

With the recent advances in analytical programs and training, many staff will not have been exposed to the latest education. Even those with relevant education may not have had the chance to complement their knowledge with industry experience; as a result, they may be missing essential domain knowledge and communication skills.

Three things you can do to invest in your organisation’s biggest asset and secure a competitive advantage:

  1. Involve staff in classroom training courses and workshops available that provide the skills and experience needed to succeed in a rapidly changing environment. Couple this with professional accreditations where they are available.
  2. If struggling for initial budget to invest, access complimentary training to demonstrate the value created. For example, at SAS we have made our Programming 1 and Statistics 1 eLearning courses available at no cost, accessible online. Showing the ROI through investing time to develop skills can build support for investment in further education.
  3. Look out for some of the full degree university programs being developed around Data Science, Business Analytics and Advanced Analytics. Integrate these into individuals’ professional development plans.

Interested in learning more about education programs? Please leave a comment below or tweet me @james_enoch


It doesn’t matter how many resources you have. If you don’t know how to use them it will never be enough.


Top 5 skills you need when applying for a Data Scientist role

In guest lectures I give at universities, I often refer to the Harvard Business Review report which states that being a Data Scientist is the sexiest job of the 21st century. Naturally, this always seems to capture the students’ attention, and drives their enthusiasm to sit up and listen carefully. As a result, one question I am consistently asked by inquisitive students is, “What skills do I need to become a Data Scientist?”

In my experience solving analytical problems, and from conversations with customers looking to hire their next Data Scientist, I have drawn out the five most sought-after skills to consider when applying for a Data Scientist role. Here are my top 5:

5. Know how to develop a predictive model using regressions and decision trees

This doesn’t sound too sophisticated to a pure statistician, I know, but businesses want to predict the outcome of a particular event. The most common business questions asked are, “Which customers are most likely to leave? Which customers will take up a product? Which customers should I approve for a loan?” In most cases a regression or decision tree yields an easy-to-explain predictive model that addresses these questions. More importantly, such models are easy to productionise to meet the organisation’s demands.

This is a data science skill that contributes to companies creating a more targeted customer experience to increase profits while reducing marketing spend – a fantastic result for organisations!
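As a minimal illustration of the idea, here is a single-split decision tree (a "stump") sketched in Python. The customer records, field names and the six-month threshold are all made-up assumptions for illustration, not anything from a real dataset; a real model would use many more features and a proper library.

```python
# Hypothetical sketch: a one-split decision tree ("stump") predicting churn
# from tenure. All records, field names and the threshold are invented.

def train_stump(records, feature, label, threshold):
    """Split on a threshold and predict the majority label on each side."""
    left = [r[label] for r in records if r[feature] <= threshold]
    right = [r[label] for r in records if r[feature] > threshold]
    majority = lambda xs: max(set(xs), key=xs.count)
    return {"feature": feature, "threshold": threshold,
            "left": majority(left), "right": majority(right)}

def predict(stump, record):
    side = "left" if record[stump["feature"]] <= stump["threshold"] else "right"
    return stump[side]

customers = [
    {"tenure_months": 3,  "churned": True},
    {"tenure_months": 5,  "churned": True},
    {"tenure_months": 8,  "churned": False},
    {"tenure_months": 24, "churned": False},
    {"tenure_months": 36, "churned": False},
]

stump = train_stump(customers, "tenure_months", "churned", threshold=6)
print(predict(stump, {"tenure_months": 4}))  # → True (short tenure, likely churn)
```

The appeal for business stakeholders is exactly what the point above describes: the whole model is one human-readable rule.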

4. Know how to develop a segmentation model

The first thing organisations want to do with their database is understand the characteristics displayed by their customers. And of course, from a marketing perspective, they want to know what groups of customers look like and what makes the groups different. Applying the skill of clustering (and there are many different kinds in this discipline) to obtain cohorts or segments that are distinct is extremely valuable in driving successful business outcomes. It may be one of the first tasks you are asked to perform in a Data Scientist role.
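To make the clustering idea concrete, here is a bare-bones one-dimensional k-means sketch in Python. The monthly-spend values and the choice of two segments are illustrative assumptions; real segmentation would use many features and an established implementation.

```python
# Minimal 1-D k-means sketch for customer segmentation.
# The spend values and k=2 below are invented for illustration.

def kmeans_1d(values, k=2, iters=20):
    srt = sorted(values)
    # Naive init: spread the k centroids across the sorted range.
    centroids = [srt[i * (len(srt) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

monthly_spend = [20, 25, 30, 200, 220, 250]
centroids, clusters = kmeans_1d(monthly_spend, k=2)
print(centroids)   # two clearly separated spend segments
```

The output separates a low-spend cohort from a high-spend cohort, which is the kind of distinct-segment result the marketing discussion above is after.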

3. Know how to use SAS with R

It is no secret that R is the common tool of choice for many students graduating from Information Management courses. But the reality is, when you need to apply analytics to commercial corporate data that is often growing exponentially, you must be able to combine R skills with SAS skills. Organisations have invested in enterprise analytical platforms that are often already embedded successfully into the organisation's model lifecycle environment. So, those who bring both SAS and R skills will make valued Data Scientists.

2. Know how to access relevant data quickly

About 80% of a Data Scientist's work is focussed on knowing where the appropriate data is housed and how to access relevant data quickly. In my experience at one particular company I worked at, I developed a data dictionary for every data source I needed to access. The data dictionary was like the Holy Grail to fast and accurate data extraction, and let me get on with the science of deriving insights from the data. Everyone wanted a copy of the data dictionary… even IT!
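A data dictionary like the one described can be as simple as a structured lookup of where each field lives and what it means. A hypothetical Python sketch follows; every table, schema and field name here is invented for illustration, not taken from any real source system.

```python
# Hypothetical data dictionary: maps each source table to its location,
# refresh cadence and field descriptions. All names are invented.
DATA_DICTIONARY = {
    "cust_profile": {
        "location": "warehouse.crm.cust_profile",
        "refresh": "daily",
        "fields": {
            "cust_id": "Unique customer key (joins to all other tables)",
            "segment": "Marketing segment code",
        },
    },
    "txn_history": {
        "location": "warehouse.core.txn_history",
        "refresh": "hourly",
        "fields": {
            "cust_id": "Customer key",
            "txn_amt": "Transaction amount in AUD",
        },
    },
}

def where_is(field):
    """Return every table (and its location) that carries a given field."""
    return {name: meta["location"]
            for name, meta in DATA_DICTIONARY.items()
            if field in meta["fields"]}

print(where_is("cust_id"))
```

Even this much structure answers the everyday question "where do I join from?" without a trawl through source systems, which is why, as above, everyone ends up wanting a copy.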

1. Know how to articulate your analytical results to drive business outcomes

Communication is critical to your success. I often practice communicating the business outcomes of my analytical results with colleagues that aren’t analytically inclined. Once I know that they understand the value of the results from a business context, only then am I satisfied that my analytical results are actually useful to my organisation. You need to move away from statistical jargon and be creative with how you illustrate data and models in a business friendly manner. Do it in a fashion that is easy on the eyes and ears – you don’t want to scare people away, but rather welcome them into the journey of business analytics.

The tips described above aren’t rocket science. However, they’re not something you develop overnight either. If you can practice these skills continuously, and keep them at the front of your mind when applying for a Data Scientist role, then you will be a step ahead of your competition. Success awaits!

All opinions are my own, and based on my experience, conversations and feedback from the professional field and customers looking to hire quality Data Scientists.


Five Keys to Successful Stress Testing

Stress testing is not new to the risk world, but it has been a major focus since the GFC (Global Financial Crisis). For a number of years now, stress testing has helped analytical specialists quantify various aspects of potential loss. What is new is the introduction of regulatory stress tests, which have added to the workload banks face. Banks also confront the increasing complexity, increasing frequency and firm-wide nature of the regulatory and economic scenarios that must be tested. Stress testing programs are therefore placing growing demands on an organisation's people, processes and IT systems, particularly at banks and now at insurance companies in Australia. This change in scope has introduced several challenges. On the technology side, the challenges stem mainly from the scale and complexity of the underlying data required to build out the scenarios and stress tests, which is stretching the capability of existing computing resources to deliver timely responses.

Our banking and insurance customers are very focused on stress testing. In fact, I engaged with a number of Australian customers this year to explore stress testing automation soon after teams spent weeks of late nights and weekends completing the first round of APRA’s annual stress test.

Our discussions uncovered a common set of challenges across institutions:

  • Limited stress-testing framework: Stress testing has historically been an isolated exercise within a bank’s risk function; different stress events and assumptions are used in each model, and outputs cannot easily be aggregated into a meaningful, combined result.
  • Lack of granularity: Today’s systems are often incapable of providing the granularity needed in each asset class at the individual position or facility level.
  • Insufficient coverage: Disparate data in large quantities prohibits a consolidated view of all assets. This is further exacerbated when trying to bring risk and finance data together.
  • Inconsistency: Multiple versions of data, valuation methods and models persist, making it practically impossible to achieve consistency among calculations and measures.
  • Organisational silos: While many trading desks and risk groups conduct stress tests to supplement their risk analyses, they are done in silos by individual analysts, making it impossible to verify that assumptions are consistent.
  • Coordination across functions: Consolidating and analysing the interdependencies across strategy, risk, budgeting and treasury activities adds even greater complexity.

Different functions across multiple business lines and geographies may participate in strategy formulation, stress testing and capital planning.

Based on recent work with customers, leading consulting companies and auditors, SAS identified the following key focus areas required to deliver a successful stress testing program:

  • Efficiency: Integration of existing risk models and data hierarchies into a streamlined data infrastructure for firm-wide stress testing. Data, computations, reporting: all must be part of a unified platform.
  • Performance: The efficient aggregation of results for all major risk models across the organisation with the ability to run complex, forward looking stress tests with multiple parameters.
  • Enterprise View: Views of economic capital and pro forma financials at the enterprise level with a view of market, credit and liquidity risk. Leading practice is to extend this to include iterative rounds of planning in terms of anticipated business growth and treasury funding activities.
  • Transparency: The ability to understand and document model assumptions, design and structure so they are readily apparent to management and regulators with the elimination of the “Black Box” characteristic of some models.
  • Compliance: The capability to address major regulatory stress testing issues as they evolve with the ability to integrate them into the risk decision making process.

To explore some of the advances in technology that can help you meet evolving requirements, listen to the on-demand webinar “Stress testing: A fresh point of view.” Industry experts discuss how banks are using analytics to:

  • Run full cycle stress tests in days instead of weeks;
  • Create consistent and repeatable processes; and
  • Handle numerous valuation methods, disparate data and stress testing models.

How does big data power your electricity?

The energy & utilities industry as a whole has experienced a seismic shift over the past five years due to rising costs and price pressures, and has become a priority discussion on the political and media agenda. Falling demand overall combined with “peakier” peaks is making supply, forecasting and public perception management more difficult by the day.

For the gentailers (generator-retailers) there is a data-rich environment that is driven by a need for marketing, loyalty and retention, but also to ensure accurate forecasts are aligned with demand and that sound and profitable trading decisions are made. Competition is fierce in the industry as Australia has moved to a contestable market in most jurisdictions. This competitive pressure has meant we are seeing some of the highest customer churn rates in the world – pushing towards 30 percent including connects and disconnects in some states. This has forced retailers to be very focused on what they need to do to maintain customer profitability and prevent customer churn.

An industry driven by customer data

The industry comprises two types of organisations – the network side of the business and the gentailers – both of whom approach their customers in vastly different ways.

On the network side of the business there are continuing cost pressures. The recent modernisation of network infrastructure has been the greatest cost input to overall prices over the past five years, particularly in larger jurisdictions such as NSW. This period of increased investment has coincided with the Global Financial Crisis, as well as other environmental, economic and technology factors altering the country’s energy profile in a way never before experienced. As a result, there is a strong push and the political will to drive down costs on the network side of the business and for companies to engage better with their stakeholders.

The ability for utilities companies to better understand their customers, optimise pricing models, and manage demand to meet individual requirements is the driving force for keeping customers.

Data and analytics are helping predict customer behaviour and identify the customers most likely to churn, and the points at which churn is likely to happen. For us as consumers, it means they’ll use the insight gleaned from all marketing channels (call centre, digital and other forms of marketing) to really understand us and ensure our pricing plans are based on our needs.


Video: Hear how Spanish energy company Endesa has used SAS to improve its performance, reducing churn and increasing profitability.

And the proof is in the data. Endesa, for example, operates in a highly competitive market following the move from a regulated to a deregulated industry. SAS Analytics helped the company get to the core of what its customers were thinking and get in front of them, predicting the likelihood of churn and intervening. Not only has this meant increased customer satisfaction and loyalty, but over two years Endesa reduced churn by 50 percent.

Understanding the customer to drive retention is just one part of the complexities faced by energy and utilities organisations. The other part is understanding a customer’s behaviour and in particular how they generate demand for electricity. Interestingly in Australia demand is actually falling and becoming much more based around peak times. Utilities need to understand exactly when those peaks are going to happen so trading decisions are optimised. And network companies are focused on more accurately predicting requirements and identifying where they should apply their capital budget to assets in the short, medium and longer term.

The changing energy landscape in Australia

The uptake of solar and renewable installations across Australia is also contributing to change, altering peak load in ways traditional forecasting methods struggle to predict. Through advanced analytics, gentailers are using the data available to better understand the profile of solar and renewable users and how it affects load predictions, particularly on peak days. For the company, the service provided is much more aligned with reality and investment isn’t wasted on unnecessary infrastructure.

As an example, one utilities organisation in this region has been able to trade much more effectively by significantly reducing its MAPE (Mean Absolute Percent Error). In ANZ this figure has often been 10 percent or higher; this SAS customer has reduced it to well under 3 percent. For the company this translates to millions on the bottom line and a clearer understanding of energy users’ actual requirements. Higher accuracy means utilities don’t overbuy electricity, which in turn has a positive effect on consumer electricity bills.
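For reference, MAPE itself is straightforward to compute: it is the average of the absolute forecast errors expressed as a percentage of the actuals. The demand figures below are made-up illustrative numbers, not the customer's data.

```python
# MAPE (Mean Absolute Percent Error). The actual/forecast demand
# figures are invented for illustration.
def mape(actual, forecast):
    return 100.0 * sum(abs(a - f) / abs(a)
                       for a, f in zip(actual, forecast)) / len(actual)

actual   = [100.0, 120.0, 90.0, 110.0]
forecast = [ 98.0, 125.0, 93.0, 108.0]
print(round(mape(actual, forecast), 2))  # → 2.83
```

A forecast scoring under 3 percent on this measure, as in the example above, means the average daily miss is tiny relative to actual demand, which is what makes trading decisions so much safer.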

  • About this blog

    The Asia Pacific region provides a unique set of challenges and opportunities. Our diverse culture, rapid technology adoption and positive market has our region poised for great things. One thing we have in common with the rest of the world is the need to be globally competitive while staying locally relevant. On this blog, our key regional thought leaders provide an Asia Pacific perspective on doing business, using analytics to be more effective, and life left of the date line.