The world’s largest rugby tournament returns for the knockout stages. This blog post explores how probability and simulation can be used to predict likely winners in each of the knockout stages. Team sports are dynamic, time-varying and complex topics to model. When modeling regular competitions, such as domestic leagues, it
Many parents, when naming their children, want to choose a name that they like but that isn’t so popular that they hear it being called everywhere they go. But even for the most popular girls’ and boys’ names, how likely is it that there will be children with the same
In an era of high connectivity and instant gratification, the expectations of customer experience have never been higher. Customers do not simply want but rather expect accessible and responsive communication across a variety of channels. And for organisations, the risks have never been higher. Disgruntled users now have the
In the previous section of this series we discussed the impact of missingness and techniques to address it. In this final section we look at how we can use drag-and-drop tools to accelerate our EDA. As mentioned at the beginning of this series, SAS Viya offers multiple
In our last blog we explored the potential impact of missing data on models that require complete case analysis. We took the simple view that data was missing with an equal, independent probability for any given model input. This week we explore cases where
In the previous section of this series we discussed ways of assessing the relationship between variables. This week we change the focus to the shape and sparsity of our dataset. One area of Exploratory Data Analysis which we’ve missed so far is the impact of missingness in data. Having missing
In the previous section of this series we looked at basic summary statistics. In this article we start to consider the relationships between variables in our dataset. As part of your Exploratory Data Analysis it is worth looking for correlation between variables. Generally, when referring to correlation we mean the
Following on from my last blog introducing the series, in this section we’ll take a first look at Exploratory Data Analysis with basic summary statistics. Getting started with a new dataset in analytics can be daunting. When first looking at a dataset, it can help to start with basic summary
Following on from my introductory blog series, Data Science in the Wild, we’re going to start delving into how you can scale up and industrialise your Analytics with SAS Viya. In future blogs we will look at how you can augment your R & Python code to leverage SAS Viya
“It doesn’t stop being magic just because you know how it works.” Terry Pratchett, The Discworld Series Welcome to the third, and final, installment of Data Science in the Wild. In Part 1 we were lost in the woods thinking about how to start a data science project. In Part
In my last blog post, I talked about the importance of establishing the right team for data science projects. Here, I’m going to talk about some of the barriers that can prevent successful adoption of data science. You can read my whole "data science in the wild" blog series here.
You’ve finally done it. You managed to stay awake through the endless series of MOOC videos, and you’ve mastered the Iris data set. You’ve learned that lm() will build you a pretty nifty model in R, and you can fit a classifier with scikit-learn. You know your Neural Net