May the 4th be with you: 2016 edition

Most folks who know me know I'm a bit of a Star Wars geek.  I've analyzed the original trilogy scripts and documented my findings in a paper called Star Wars and the Art of Data Science.  I'm always looking for excuses to get my hands into Star Wars data, and May the 4th is a great annual excuse!

Now that "The Force Awakens" has been around for almost half a year, I thought I'd take the opportunity to delve into it.  Many aficionados have claimed that "The Force Awakens" is nothing more than a rehash of "A New Hope."  What do I think?  I think there are some striking similarities.  Desert planet, droids, budding Jedi, super-weapon, battles, rebels ... it's a formula that worked the first time, and, I have to admit, I liked it the second time around as well!

I thought it would be fun to use a little text analytics and visualization to see if the scripts themselves are thematically similar.  Sometimes letting the data tell the story can be more eye opening than conjecture, plus it gives you a nice stake in the ground if you want to get into a heated debate!  These examples are not all-encompassing, and are just meant as a bit of a teaser!
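
One simple way to put a number on "thematically similar" is to compare the word counts of two texts with cosine similarity. The sketch below is only an illustration of that idea, not the analysis behind this post: the three sample strings are made-up placeholders rather than lines from any script, and the tiny stopword list and the function names `term_counts` and `cosine_similarity` are assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# A deliberately tiny, hypothetical stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "is", "it"}


def term_counts(text):
    """Lowercase the text, split it into words, and count the non-stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0 = no shared terms)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Placeholder snippets, invented for illustration only.
script_a = "A droid carries secret plans across a desert planet"
script_b = "A droid carries a secret map across a desert planet"
script_c = "Quarterly revenue grew in the retail segment"

sim_ab = cosine_similarity(term_counts(script_a), term_counts(script_b))
sim_ac = cosine_similarity(term_counts(script_a), term_counts(script_c))
print(f"a vs b: {sim_ab:.3f}")
print(f"a vs c: {sim_ac:.3f}")
```

With full scripts in place of the placeholders, a score closer to 1 would suggest more shared vocabulary, though raw word overlap is a rough proxy for theme at best.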


Could a recommendation engine pick 'The Cuckoo’s Calling' as a best-seller?

How many of you have read The Cuckoo’s Calling by the previously unknown author Robert Galbraith? The answer is not many, until it came out that Robert Galbraith was none other than blockbuster best-selling author JK Rowling. Sales then skyrocketed. Rowling recently published some of the rejection letters she received as Robert Galbraith. If algorithms had been applied, could they have done any better at recognizing the book's potential?

In this post, I want to discuss one set of algorithms called recommendation engines – also called collaborative filtering – and their various guises. They’re typically used when you have many prospects, customers and products.

For example, let’s say you want to recommend a video to a prospect from a vast range of potential videos. The recommendation engine looks at videos people have bought in the past to make the best recommendation.
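
To make that concrete, here is a minimal sketch of one flavour of collaborative filtering: score each video a person hasn't bought by how often it was bought together with videos they already own. It is an illustration only, not the method used in any project mentioned here. The purchase history, the customer names, and the functions `co_purchase_counts` and `recommend` are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical purchase history: customer -> set of videos bought.
purchases = {
    "ana": {"video_a", "video_b", "video_c"},
    "ben": {"video_a", "video_b"},
    "cat": {"video_b", "video_c"},
    "dan": {"video_a"},
}


def co_purchase_counts(history):
    """Count how many customers bought each pair of videos together."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in history.values():
        for x, y in combinations(sorted(items), 2):
            counts[x][y] += 1
            counts[y][x] += 1
    return counts


def recommend(user, history, top_n=1):
    """Rank videos the user does not own by co-purchase count with videos they do own."""
    owned = history[user]
    counts = co_purchase_counts(history)
    scores = defaultdict(int)
    for item in owned:
        for other, n in counts[item].items():
            if other not in owned:
                scores[other] += n
    # Highest score first; ties are broken alphabetically so the result is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


print(recommend("dan", purchases))
print(recommend("ben", purchases))
```

Production recommendation engines usually normalise these counts and handle far sparser data, but the core idea of learning from what other people bought is the same.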

You may come across recommendation engines in a wide range of places:

  • Online shopping – most major online retailers use recommendation engines to suggest other products.
  • Film and TV – I did a project a few years ago where we were recommending TV programmes based on what people had streamed. Needless to say, we beat the editor’s recommendation by a fair margin.
  • News articles – people who read these articles might also want to read these ones.
  • Dating sites – more controversially, these sites can use algorithms to suggest potential dates.
  • Crime – I’ve used similar techniques to detect crimes committed by serial offenders.


Three ways to monetise your data

There’s no such thing as a free app. “What?” I hear you say, “but I download free apps all the time!” So then why do organisations spend considerable time and effort creating free apps? Often their goal is to collect data and turn it into money.

Consider this example. There’s a popular photo editing app in the US that allows you to edit photos as you take them. Yes, your bestie looks fab with that hat on, but what you might not realise is that your photos are being used to create a heat map of tourist activity. Over time, trends will emerge from that location data which the app developer can sell to cafes, stalls, bars and anybody else looking to set up shop in a profitable area. Forget eyeballing foot traffic! Today’s app data can tell restaurateurs the hottest spots on the hottest days at the hottest times.

And there you have it – the data monetisation of a mobile app. In this example, no personal data was captured – the legal requirements for using personal data are something all companies are grappling with (with opt-in clauses built to withstand the tests that may eventually be exercised against them).

Monetising data is one of the most exciting outcomes of the big data analytics world. So how can you monetise your data? Here are three common ways traditional businesses are doing it.


Does big data spell big trouble on the campaign trail?

The timeline on the latest season of Netflix’s series House of Cards has finally caught up with the real world, and the current plot line regarding President Frank Underwood’s underhanded dealings to win the Democratic nomination has many parallels with the current US primary election coverage saturating TV and print news across the globe.

The big-money, high-stakes nature of US elections, underpinned by vast campaign machinery, technology and data, paints an interesting contrast with the relatively parochial and low-tech nature of emerging market elections, such as South Africa’s upcoming local elections.

A recent episode outlined a scenario where a fictional search engine was “collaborating” (or, more accurately, colluding) with the Republican presidential primary candidate to provide voter search history. While the search engine claimed that such data was purely used at an aggregate level, it was strongly hinted that search engine history could be used to identify individual voters’ preferences and target specific campaign messages to them.

My last post outlined how political parties are leveraging lessons learned in data-driven marketing and applying them to one-to-one voter influence. But what if House of Cards is indeed prescient and campaigners begin to take liberties with consumer data for nefarious purposes?

All quiet on the Barnett Front

The Barnett Shale in North Texas hit a historic mark on April 25: Its rig count fell to zero. Two hundred rigs once harvested the 40 trillion cubic feet of natural gas in this massive basin, stretching beneath 17 Texas counties. Today, nothing.

This dramatic silence in North America’s second-largest shale field is echoing across the continent. Oil and gas rig counts have fallen by 540 over the past year. It is a stark reminder that credit risk management is growing in importance as the commodity price downtrend continues.

That echo is heard not just in the oilfield, but in the boardrooms of every oil and gas producer, services firm, pipeline and storage company – and all the other strands in the web of relationships that bring energy to market. The enthusiasm to invest as America became a net exporter of hydrocarbons amidst an unprecedented boom in shale oil and gas recovery has transformed into a single question: What’s our counterparty credit exposure?


Analytics in the news: the NFL draft

As American football teams prepare to select new team members later today, fans and pundits can only guess how the draft will turn out. Will your favorite professional team make good picks? And will your favorite college players go to good teams?

With high stakes and billions of possible outcomes, professional sports team selection seems like an ideal problem for analytics. But is it? The "Moneyball" method has been famously documented in baseball, but football is a very different sport with more interactions between players and fewer individual statistics to track.

What are the experts saying about analytics and the NFL draft? I've put together a short reading list so you can learn all about it before today's draft.


How to partner with IT to build a dashboard for community college administrators

It's a common problem in any industry: getting a large number of similar requests for information. But with limited resources and an already overburdened staff, how do you handle it?

At El Paso Community College, analysts from the Institutional Research (IR) team enlisted the help of IT to create a data warehouse and a dashboard to make reports easily accessible for anyone who needed information while at the same time freeing up time for the analysts.

In particular, they needed a dashboard that would display key performance indicators (KPIs) including demographics, student performance, college growth and more. In all, they wanted one central place where everyone could go to get accurate, timely information.

Presenters Christina C. Frescas (Research Associate), Angeles Vazquez (Statistical Research Associate), and Carlos Molina-Torres (Sr. Programmer Analyst) discussed the collaborative solution during their session at The Texas Association for Institutional Research (TAIR) conference.


Why clinical insights, not budgets, hold the key to value-based healthcare

Right now, National Health Service (NHS) managers and clinicians in the UK are under phenomenal pressure to find big efficiency savings while improving the value of services to patients. Many in the NHS see integrated care as the answer. But the first step is finding innovative ways to increase the value we deliver in all clinical and commissioning decisions. The question is: How?

Well, consider this: Every single thought, action, treatment plan, decision and interaction generates some form of data. The answers NHS leaders need are likely sitting in the masses of patient records, emails, discharge letters, scans and patient notes that litter every clinician’s in-tray and inbox.

The challenge is collating that data in a meaningful, structured manner that makes it readily accessible to decision-makers while protecting individual patients' privacy. But that’s just the beginning. Only when a clean data repository has been created, in which different types of information have been translated into digital formats, can decision-makers extract the answers they need. Using sophisticated methods that allow them to easily model treatment outcomes for different patient groups, clinicians and managers can gain transformational answers. They can then evaluate investments versus the value of potential outcomes; investigate the efficacy of different management plans; predict demand; and answer many other questions.


Campaigning to your customer: When elections and marketing collide

It’s almost impossible to avoid election coverage right now, no matter how hard you try. If you’re like me, you’re fleeing to the safety of South Africa’s recently launched Netflix in order to avoid the coverage of the US primaries currently dominating international TV and print news, or South Africa’s local elections, which are currently ramping up. But these same elections are eerily echoed by President Frank Underwood’s attempt to finagle his way to re-election in the latest series of Netflix’s House of Cards.

Campaign promises and issues-based pandering are very much part of the campaigning process, and politicians try every method at their disposal to try to influence those “swing voters” who are open to persuasion. But just how far are they willing to go to achieve this goal?

An election campaign is just a (very) expensive marketing campaign

As is becoming clear, in an established democracy like the USA, or even within a new and dynamic one like South Africa, an election is becoming more and more like a marketing campaign in which voters are presented with “offers” of promises made by the candidates. In doing so, they’re tapping into techniques used for years by consumer marketers.

Political candidates and their campaign teams are often forced into making assumptions about voter preferences based on demographic and psychographic information. However, consumer marketers realised years ago that restricting your understanding of customers to such basic information is nowhere near enough, especially given the reams of customer data that’s now available.

Marketers are now moving towards the “segment of one,” where predictive analytics is used to determine each individual customer’s personal brand, product and service preferences, and that information is used to tailor unique messages to each customer.


10 SAS Global Forum speakers to follow on Twitter

Like what you heard at SAS Global Forum? Want to stay in touch with the speakers you met or listened to there? Here's a list to get you started, but please add to it in the comments. Tell us which speakers you've found - and followed - on Twitter.

For extra fun, I've included a tweet by or about each person on the list. You'll find some bonus people to follow if you look closely at those tweets.

1. @michaelraithel conducted a pre-conference workshop on how to be a top programmer, and presented a paper about PROC DATASETS.


2. @annmariastat taught attendees about factor analysis and led a second presentation for biostatisticians.
