Is trusting your gut the right way to go?

Jay Liebowitz speaks at A2014

When I was young, the use of analytics wasn’t widespread – even in very large companies. Organizations relied on their leaders’ experience built on years in the industry. The more experience and knowledge a leader had, the better the decisions they made and the more successful the business was. The introduction of business intelligence and predictive analytics technologies has triggered a shift toward data-driven decision making. That’s a good thing and a bad thing.

Often, basing your decisions on what the data says can be the safest route. But, Jay Liebowitz says, we still need to include our intuition as part of the decision-making process.

Liebowitz is the Orkand Endowed Chair in Management and Technology in the Graduate School at the University of Maryland University College (UMUC). He’s written several books about big data, analytics and decision making. Most recently, Liebowitz published “Bursting the Big Data Bubble: The Case for Intuition-Based Decision Making.”

Liebowitz recommends a balance of trusting your gut reactions without being overconfident. Use analytical techniques to validate or disprove your gut reaction, he says, and then learn from the exercise. He spoke Monday at the Analytics 2014 conference sponsored by SAS.

Trust your gut?

In the journal article “When Should I Trust My Gut?”, Erik Dane and his associates found that intuition is often as good as analytics if you are very experienced in the domain where you are making the decisions. Liebowitz agrees and warns that the current trend to constrain employee hiring costs by cross-training employees can mean that key employees don’t develop the expertise they’ll need to make sound judgments.

Liebowitz went on to quote an MIT Sloan Management Review article describing the value of intuition over statistical analysis: “For many complex decisions, all the data in the world can’t trump the lifetime’s worth of expertise that informs one’s gut feeling, instinct, or intuition.” Liebowitz says gut instinct can be taught, but that it requires time. An example he uses to illustrate the value of intuition in decision making is the career of Wayne Gretzky. Gretzky has been called the smartest hockey player ever. He defined the game for generations to come because of his uncanny sense of where the puck would be and where his teammates were on the rink.

In his autobiography, Gretzky writes, “Some say I have a 'sixth sense' . . . Baloney. I've just learned to guess what's going to happen next. It's anticipation. It's not God-given, it's Wally-given. He used to stand on the blue line and say to me, 'Watch, this is how everybody else does it.' Then he'd shoot a puck along the boards and into the corner and then go chasing after it. Then he'd come back and say, 'Now, this is how the smart player does it.' He'd shoot it into the corner again, only this time he cut across to the other side and picked it up over there. Who says anticipation can't be taught?”

Of course, Liebowitz doesn’t discount the value of analytics. To the contrary – he believes decision makers should rely on their expertise, but then prove or disprove it based on the data.

Downside to gut reaction?

Another article on cfo.com about big data says, “We generally have good intuition about things that are similar to what we encounter every day … but we have poor intuition about things that are outside of the everyday.” That makes sense – but what about the times you have to make a decision quickly and you don’t have the benefit of analytics? Think about all of the possibilities. For example: Gary Klein uses a ‘pre-mortem’ technique where he critically evaluates the worst possible outcomes of his decision based on all of the information available at the time.

What can you do to improve your decision making from a business intuition perspective?

  • Respect your intuition without rejecting it outright or following it blindly.
  • Ask yourself what prompted your gut reaction.
  • Review the evidence.
  • Elicit good feedback from other experts.
  • Prove or disprove your hunch (this is a good place for analytics).

For more about Liebowitz’ theories about intuition versus analytics, read, “Educating informed 'intuitants.'” In this SAS Insights article, Liebowitz discusses the new UMUC online M.S. in Data Analytics degree where up-and-coming leaders are taught “basic and advanced skills to support strategic and tactical decision making in the new big data world.”

Big Data @ Work: a conversation on Twitter

Last month, Tom Davenport, renowned international speaker and author of 'Big Data @ Work', came to Dublin as a guest of SAS Ireland. During the event, there was a lively conversation on Twitter, with many great questions answered by John Farrelly and Alan Gormley from SAS Ireland. Here are some of the highlights.

Blog post compiled by Phil Male from SAS UK and Lauren Brennan from SAS Ireland.

How cheap will gasoline prices go?

Have you noticed lower gasoline prices lately? How low will they go, and how long will they stay down? Let's use SAS to analyze some of the data!...

gas_price

First, let's look at just the price of gasoline over time. Here's a plot of the US average gasoline price, each week since year 2000. I use a very tiny bar (needle) to represent the price of each week, and change to a darker color at each 50-cent price increase. Notice that (in general) the price drops every year in the fall, and stays lower through the winter. Perhaps this is because people travel less, and there is less demand(?) What other factors do you think cause the price drop each fall?

gasoline_prices_plot

Gasoline is made from oil, and of course the price of gasoline is closely related to the price of oil. Saudi Arabia recently hinted that the price of oil might be going down to $80/barrel. I created two graphs where I plot the price of oil and gasoline, so you can visually compare them side-by-side.

oil_prices1

oil_prices

 

The above two graphs definitely seem to indicate there's a correlation, but I wanted a way to visualize this correlation a bit more directly. Therefore I created a scatter plot with the price of oil on one axis and the price of gasoline on the other, and let SAS calculate a regression line through the data. The data points follow the line fairly closely.
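For readers who want to see what a regression line through such a scatter plot involves, here is a minimal Python sketch (the analysis in this post was done in SAS, not with this code). It fits an ordinary least-squares line and computes the Pearson correlation. The function name `fit_line` and all of the price values are made up for illustration; they are not the data behind the graphs.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Made-up illustrative values, NOT the weekly prices plotted in this post.
oil_usd_per_barrel = [40.0, 60.0, 80.0, 100.0, 120.0]
gas_usd_per_gallon = [1.90, 2.45, 2.85, 3.40, 3.90]

slope, intercept, r = fit_line(oil_usd_per_barrel, gas_usd_per_gallon)
print(f"gas = {slope:.4f} * oil + {intercept:.2f}  (r = {r:.3f})")
```

A value of r close to 1 corresponds to points that hug the fitted line, which is the visual impression the scatter plot gives.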

oil_gas_correlation

 

Enough about graphs & analytics ... what's the lowest price you've paid for gasoline this fall? How low do you think the price will go, and how long will the price stay down? (Leave your reply in a comment!)

SAS Certification at Analytics 2014

Attending an industry conference requires an investment in time away from the office, and maximizing that investment makes a lot of sense.  In addition to gaining insight into challenges facing the analytics industry today, discovering and evaluating new products and services, and networking with the largest gathering of analytics professionals in the world, attendees at Analytics 2014 in Las Vegas are also taking advantage of workshops, training classes, and certification exam sessions to accelerate their personal development.

On Sunday, October 19, all public SAS exams were offered at the Bellagio, host hotel for the conference, and forty-four candidates participated in the testing session.  As you would expect at an Analytics conference, Predictive Modeling was the most popular exam, but candidates also took other exams ranging from Base Programming to Statistical Business Analyst Using SAS.  We don’t disclose specifics about pass rates for exams, but I must say that this group of candidates was among the most motivated and well-prepared that we have seen.  There are quite a few brand new SAS Certified Professionals in Las Vegas today.

If you missed out on becoming SAS Certified at Analytics 2014 in Las Vegas, you may be able to take advantage of exam sessions at other major SAS conferences.  If you plan on attending the 2015 SAS Global Forum, April 26-29 in Dallas, Texas, be on the lookout for certification exam opportunities.  We usually discount the price for conference attendees.

Higher education and analytics

It's my favorite time of year! The leaves are changing. Football is back. And it's also time for our annual Analytics conference.

One of the best parts about my job is getting to attend the conference each year and host the Inside Analytics video series.

Not everyone at the conference gets the chance to have one-to-one time with so many speakers.

My first interview was with Dr. Goutam Chakraborty, professor of marketing at Oklahoma State University.

 

If you're at the conference this week, here's my list of the six things you need to do.

Look for more updates this week. The conference runs from Oct. 20-21.

How do men rate women on dating websites? (Part 2)

I always recommend looking at data in several different ways, to get a more complete picture of what's really going on - such is the case with the member 'ratings' on dating websites. Let's take a look at some data from a different angle...

cupid_angled

In a recent blog post, I analyzed which ages of men & women the opposite sex rated most attractive. The graphs indicated that men rated 20-year-old women the most attractive, whereas women rated men closer to their own age most attractive. This sparked quite a bit of discussion (such as the comments in the cross-posting of the blog on allanalytics.com).

So I decided to look at the ratings data in a different way - this time ignoring age, and just looking at how men and women rate each other in general. I found some histograms on p. 16 of Christian Rudder's new book Dataclysm that showed almost what I was looking for, and I then used some graphs from his blog to estimate the data so I could create similar charts in SAS.

Whereas the men of all age groups consistently rated 20-year-old women the most attractive (which produced a very lopsided chart), their ratings of all women in general produced a very symmetrical chart. In Rudder's book he even describes it as "close to what's called a symmetric beta distribution - a curve often deployed to model basic unbiased decisions." Therefore it appears that men are very unbiased/honest in the way they rate women.
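To make the phrase "symmetric beta distribution" concrete, here is a small Python sketch. A beta distribution is symmetric when its two shape parameters are equal, so its density mirrors around the midpoint of the scale. The shape value 3.0 is an arbitrary assumption for illustration and is not estimated from Rudder's data; `beta_pdf` is a helper written for this example.

```python
from math import gamma

def beta_pdf(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)

SHAPE = 3.0  # assumed value; equal shape parameters give a symmetric curve

# Ratings rescaled to the 0-1 interval; the curve mirrors around 0.5.
for x in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"x = {x:.1f}  density = {beta_pdf(x, SHAPE, SHAPE):.4f}")
```

The densities at 0.3 and 0.7 (and at 0.1 and 0.9) match, with the peak in the middle, which is the symmetric shape the men's ratings chart shows.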

okc_rating_curve

By comparison, women rated men very poorly. Rudder mentions that women only rate one guy in six as "above average."

okc_rating_curve1

What causes this huge difference in how men and women rate each other? Is one being more honest than the other? Are they rating based on different criteria (perhaps men are rating based on looks, and women are rating based on whether or not they think the men would make a good mate)? Perhaps women are hesitant to rate a man highly, because they know that will trigger okcupid to send that man a message letting him know which woman rated him highly? What other factors are perhaps influencing this data?

Feel free to leave your thoughts & theories on this topic in the comments section!

 

Is Hadoop the answer to big data?

Having spent a quarter of a century working on databases and on database-related technologies, I have developed an aura of skepticism toward any new product that hits the market being presented as the best thing we have ever seen. It’s not that I love to revel in “I told you so” moments, it’s just that I have seen too many products fly high in the sky only to disappear like meteors.

For many, Hadoop’s entrance into the database field meant that technology had finally come up with the only possible instrument equipped with a framework capable of handling “big data.” On top of that, its affordability unequivocally meant that the end was in sight for traditional relational databases that had so far dominated the scene. Today, after much time and effort spent on integrating Hadoop in their environments, many of the companies that were quick to jump on its bandwagon are discovering that despite having an important role in their infrastructure, Hadoop is not the godsend that many thought it would be.

Why is that? The explanation is simple. At the end of the day, Hadoop is another technological tool, just like its relational database counterparts. Big data, on the other hand, is not about technology, but rather about business needs. This means that Hadoop shouldn’t be considered the sole player in the field of data analysis. For example, it makes sense to use Hadoop to run broad exploratory analysis of large data, but a relational database is still a better option to perform an operational analysis of what was uncovered. Hadoop is also good for looking at the lowest level of detail in a data set, but relational databases make more sense when it comes to storing transformed and aggregated data. As Facebook analytics chief Ken Rudin puts it, “you need to use the right technology to fit your business needs.”

A recent survey commissioned by an IT company found that more than 30% of the companies interviewed had already deployed Hadoop, with an additional 30% having plans to deploy it within 12 months. Something interesting that came out of the survey was the fact that the majority of these companies planned to combine Hadoop’s data analysis capabilities with the ones provided by other databases that were already integrated in the companies’ infrastructures. According to the study, the goal was and still is to use Hadoop to perform raw data analysis, while using traditional databases to take care of non-analytic workloads, especially transaction-oriented ones, and perform data analysis on aggregated data coming from Hadoop.

Take eBay, for example. The San Jose, Calif.-based company’s three-tier data analytics approach is an example of the kind of role Hadoop can find within an organization alongside other traditional relational databases. Structured data resides in the first tier, an enterprise data warehouse that is used for daily housekeeping items, such as feeding business intelligence dashboards and reports. The second tier consists of a Teradata data management platform that is used to store huge amounts of semi-structured information. Fully unstructured data such as textual information lives in the third tier, a Hadoop cluster reserved for deeper research, analysis and experimentation.

The moral of the story is that Hadoop is not a synonym for big data, but one of the many players you need to mine and analyze your data. A good reason to hang on to those other databases a little longer.

I’ll be talking about big data and Hadoop at Analytics 2014 along with Josh Wills from Cloudera and my SAS colleagues Wayne Thompson and Kelly Hobson. Check out our panel presentation and round table discussion on Hadoop. We hope to see you there!

  • Panel discussion with SAS and Cloudera on Big Data and Hadoop: Moving beyond the hype to realize your analytics strategy with SAS® - Monday, October 20, 3:00-3:50 pm
  • Round Table discussion on Practical Considerations for SAS Analytics in a Hadoop Environment – Tuesday, October 21, 12:30-1:45 pm

You can also check out our starter services on Visual Analytics and Visual Statistics and the Expert Exchange for Hadoop.

The SAS model factory – a big data solution

Do you have too many models to build, too many to manage, too few analytic resources or too much data?  A Model Factory may be your answer.

The mindset of analytics is changing.  It is moving from a “craftsman”-dominated culture, in which multiple weeks were spent cycling through data and developing a model, to a production-oriented environment where analytically derived information almost instantaneously follows the strategic conceptualization of ideas.

This transformation is significantly accelerated by the integration of the SAS Model Factory.

The idea of a “Model Factory” may call to mind a mechanical age of smokestacks and assembly lines.  When Henry Ford revolutionized the car making process by introducing the assembly line – the process that is still used worldwide in auto manufacturing today – he laid the foundation for the democratization of the car. This assembly line reduced the cost of making a car to an amount that made it sellable to a much larger audience.

What do we really mean by Model Factory?

A factory is a place where something is made or assembled quickly and in great quantities.

A model factory is a process in which predictive models are built automatically, quickly and in great quantities, enabling an automated scoring process.

Why would you use a Model Factory?

  • Perhaps you have limited technical and/or analytic resources.
  • You have too many models to build and manage because you have various target variables and/or you segment your customers prior to modeling.
  • If you have thousands of customer attributes, you may need to select only a subset that is appropriate for each model.
  • Perhaps you need to perform repetitive data preparation with variable transformations, handling of missing values, etc.
  • You have Big Data which slows down model building and scoring.
  • In brief, you are unable to build models fast enough.

Can the model factory process be automated?

It consists of:

  • Model Initiation
  • Model Development
  • Model Deployment
  • Model Monitoring
  • Model Recalibration/Rebuild
  • Model Retirement

From a Factory Perspective, it looks like:

sas model factory

You choose to write a code-based Model Factory

You can use Base SAS and SAS/STAT with the High Performance Procedures to enable hundreds or thousands of models to be built automatically on as much data as you have.  With the needed code, your data will be structured properly.  Transformations and missing values will be handled automatically.  Good-enough models will be built.  And no analytical skills will be needed to run the process.

Model Factory Deployment

  • Run Macro Driven Code
  • Parameter file
      – Manual entry
      – Point-and-Click entry
  • Code processes parameter file and data
  • Code runs analytic models
  • Model Factory code produces Scoring code
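The flow in this list can be sketched in a few lines of Python. This is only a toy illustration of the pattern (parameter entries in, one model per entry, scoring code out), not the SAS Model Factory itself, which is macro-driven SAS code. The parameter entries, the data rows, and the names `fit_one_variable` and `run_factory` are all hypothetical, and a one-variable least-squares fit stands in for whatever modeling procedure a real factory would run.

```python
# Hypothetical parameter "file": one entry per model to build.
PARAMETERS = [
    {"model_id": "segment_a", "segment": "A"},
    {"model_id": "segment_b", "segment": "B"},
]

# Made-up training rows: (segment, x, y).
DATA = [
    ("A", 1.0, 3.0), ("A", 2.0, 5.0), ("A", 3.0, 7.0),
    ("B", 1.0, 1.0), ("B", 2.0, 1.5), ("B", 3.0, 2.0),
]

def fit_one_variable(rows):
    """Least-squares fit of y = slope * x + intercept on (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def run_factory(parameters, data):
    """Build one model per parameter entry; return generated scoring code."""
    scoring_code = {}
    for spec in parameters:
        rows = [(x, y) for seg, x, y in data if seg == spec["segment"]]
        slope, intercept = fit_one_variable(rows)
        # The "scoring code" is plain source text that can be saved and run later.
        scoring_code[spec["model_id"]] = (
            f"def score(x):\n    return {slope!r} * x + {intercept!r}\n"
        )
    return scoring_code

generated = run_factory(PARAMETERS, DATA)
for model_id, source in generated.items():
    namespace = {}
    exec(source, namespace)  # load the generated scoring function
    print(model_id, namespace["score"](4.0))
```

Adding a model means adding a parameter entry rather than writing new modeling code, which is the point of the factory approach.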

SAS has other solutions for model building

If you have fewer models to build and/or you have the needed analytic resource for model development, these Point-and-Click solutions may be sufficient:

  • Enterprise Miner
  • Rapid Predictive Modeler – run from Enterprise Guide

What can you do to build a Model Factory?

  • Take classes in Data Mining techniques
  • Read documents about data mining
  • Have internal working meetings to review goals and desired results
  • Engage consultants

In summary, we understand that you have experienced the chaos associated with building and maintaining a multitude of models.  The solution to your modeling problems may be the Model Factory Solution which replaces the chaos with automation, efficiency, and repeatability.  For more information, you may contact the author.  For more on this topic, attend the SAS Model Factory pre-conference workshop at Analytics 2014 in Las Vegas on Sunday, October 19, 2014, 1-5 pm.

How does Amazon deliver packages so quickly?

With Amazon Prime's 2-day shipping, I seldom go to physical stores any more. How do they deliver items so quickly? Let's analyze some data to find out...

There are very few services/memberships that I truly feel like I'm getting "a good deal" for my money - and Amazon Prime is one of them. Amazon has a huge product selection, with detailed information about each product. There's also a good search engine that always seems to work the way I want it to. And they have a large customer base, and the customers frequently post very useful product reviews & ratings. But the feature that impresses me the most is the 2-day shipping! I used to hate ordering things through the mail, because it typically took 5-10 days - but with Amazon Prime's free 2-day shipping, I usually have the item quicker than I could have found the time to drive to physical stores shopping for it.

For example, here's an exact replacement antenna I recently bought on Amazon for my vintage 1980's Conion boombox. Believe me - I could have driven around town for weeks looking in electronics stores, and still not found one!

antenna

How does Amazon deliver their packages so quickly? Some claim that they use a fleet of unmanned drone aircraft to deliver their packages. They don't - or at least not yet! (see Amazon Prime Air proposal)

AMAZON TESTS PACKAGE DELIVERY BY DRONE

But what they do have is a network of huge distribution warehouses, strategically placed across the US. So, while you're shopping online, they check to see if the item you want is in a warehouse close enough to your location that it could be delivered within 2 days. And, of course, they have people working in the warehouses around the clock pulling the items you order, and packaging them to ship immediately after you order them.

I was curious which warehouse(s) were closest to me, and found a map on the Amazon website. I could see that three states bordering North Carolina have a distribution center, but I couldn't tell exactly where the warehouses were located within each state. I did a bit more searching and found an article that listed the addresses of the warehouses. With that info, I was able to use Proc Geocode to estimate the latitude/longitude of each warehouse, and plot them on a SAS map (click the map below to see the interactive version with html hover-text over each marker, and links to bring up a Google satellite map of each location):
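Once each address has a latitude/longitude, finding the closest location is a distance calculation. Here is a minimal Python sketch using the haversine great-circle formula; this is my own illustration, not something done in the SAS job that produced the map. The coordinates are approximate city centers chosen as placeholders, not actual warehouse addresses, and the names `haversine_km`, `HOME`, and `SITES` are made up for the example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/long points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Placeholder coordinates (approximate city centers), NOT warehouse locations.
HOME = (35.78, -78.64)  # roughly Raleigh, NC
SITES = {
    "Columbia, SC": (34.00, -81.03),
    "Richmond, VA": (37.54, -77.44),
    "Nashville, TN": (36.16, -86.78),
}

distances = {name: haversine_km(*HOME, *coords) for name, coords in SITES.items()}
nearest = min(distances, key=distances.get)
print(f"Nearest placeholder site: {nearest} ({distances[nearest]:.0f} km)")
```

With real geocoded warehouse coordinates in place of the placeholders, the same loop would answer the "which one is closest to me" question directly.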

amazon_fulfillment_centers

 

 

How do men rate women on dating websites?

What age of women do men prefer on dating websites? Let's have a look at the data...

When I first started using computers in the early 80s, I thought it would be great to have everyone take a survey, and then let computers show you who your best matches were. The computer would be a modern day Cupid! ... I don't know what Cupid really looks like, but here's a picture of a likely candidate my friend Reggie has in his collection of antiques:

cupid

Speaking of Cupid & online matchmaking ... one of the most popular free dating websites is called okcupid. Each user provides some personal information when they register (such as age), and optionally answers questions on various topics. Users also have the opportunity to interact with other users in various ways such as sending messages and 'rating' the other users on a scale from 1 to 5 ... and the people who run the website have access to all this data!

Christian Rudder was one of the founders of okcupid, and was in charge of their analytics team. His job was to "make sense of the data their users created" - what a great job, eh!?! And he has shared some very interesting graphical analyses in blogs and articles. One of his recent articles analyzed how people rated others' profiles on their dating site, and for each age (20-50) it showed the age of the people those users rated the highest.

Who did the men rate highest? For almost every age of men, the age of the women they rated highest was around 20. Below is my SAS graph very similar to their graph in the article:

okc_rating_men

It's an interesting graph, but I had to study it for a few minutes, and read the article, to be able to understand exactly what it was saying. And as usual, I couldn't leave well enough alone, and decided to try to make a few improvements to it...

First, I decided to sort the bars so that the older men are at the top (instead of at the bottom), as is customary with population pyramid charts. This small change helped make the graph's layout more familiar to me, and more logical.

okc_rating_men1

Another thing that needed work - all those tiny numbers on the bars were difficult to read (I'm not getting any younger, you know!). So I decided to go with the more traditional approach of showing the numbers as tick marks along the axis, with reference lines. I made my axis symmetrical around the origin (zero). I also added html hover-text to my html output so you can easily see the exact values for a specific bar (click this link, or the graph below, to see the interactive version with hover-text).

okc_rating_men2

I also decided to make the colors a bit more meaningful/mnemonic ... blue for guys, and pink for girls. And one last enhancement that I think is very important - I added a descriptive title to the graph, so people would know what it represents, without having to read through the text of the article.

okc_rating_men3

Now that we've analyzed the men's preferences, how about the women?

okc_rating_women3

Hmm ... so the men all rate the 20 year old women the highest, and the women rate men who are close to their own age the highest? Why the difference? What factors might be affecting this data?

I have a few theories, but first I invite you to share your theories in a comment!

 
