What age women do men prefer on dating websites? - Let's have a look at the data...
When I first started using computers in the early 80s, I thought it would be great to have everyone take a survey, and then let computers show you who your best matches were. The computer would be a modern-day Cupid! ... I don't know what Cupid really looks like, but here's a picture of a likely candidate my friend Reggie has in his collection of antiques:
Speaking of Cupid & online matchmaking ... one of the most popular free dating websites is called okcupid. Each user provides some personal information when they register (such as age), and optionally answers questions on various topics. Users also have the opportunity to interact with other users in various ways such as sending messages and 'rating' the other users on a scale from 1 to 5 ... and the people who run the website have access to all this data!
Christian Rudder was one of the founders of okcupid, and was in charge of their analytics team. His job was to "make sense of the data their users created" - what a great job, eh!?! And he has shared some very interesting graphical analyses in blogs and articles. One of his recent articles analyzed how people rated others' profiles on their dating site, and for each age (20-50) it showed the age of the people those users rated the highest.
Who did the men rate highest? For almost every age of men, the age of the women they rated highest was around 20. Below is my SAS graph very similar to their graph in the article:
It's an interesting graph, but I had to study it for a few minutes, and read the article, to be able to understand exactly what it was saying. And as usual, I couldn't leave well enough alone, and decided to try to make a few improvements to it...
First, I decided to sort the bars so that the older men are at the top (instead of at the bottom), as is customary with population pyramid charts. This small change helped make the graph's layout more familiar to me, and more logical.
Another thing that needed work - all those tiny numbers on the bars were difficult to read (I'm not getting any younger, you know!). So I decided to go with the more traditional approach of showing the values as labeled tick marks along the axis, with reference lines. I made my axis symmetrical around the origin (zero). I also added HTML hover-text to my HTML output so you can easily see the exact values for a specific bar (click this link, or the graph below, to see the interactive version with hover-text).
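For readers who want to try the same layout ideas outside of SAS, here is a minimal Python sketch of the two data-preparation steps described above: putting the oldest raters at the top, and building an axis that is symmetrical around zero. This is not the code behind my graph. The function names (`pyramid_rows`, `symmetric_axis`), the assumption that each row pairs a rater's age with the age they rated highest, the choice to draw the rater's age to the left of zero, and the numbers in the example are all made up for illustration.

```python
import math


def symmetric_axis(values, step=10):
    """Return (limit, ticks) for an axis centered on zero.

    limit is the largest absolute value rounded up to a multiple of step;
    ticks run from -limit to +limit in increments of step.
    """
    largest = max(abs(v) for v in values)
    limit = max(step, math.ceil(largest / step) * step)
    ticks = list(range(-limit, limit + 1, step))
    return limit, ticks


def pyramid_rows(rows):
    """Order rows for a pyramid chart, oldest rater first (drawn at the top).

    Each input row is (rater_age, preferred_age). The rater's age is negated
    so that its bar extends to the left of zero; the preferred age extends
    to the right. That left/right assignment is an assumption of this sketch.
    """
    ordered = sorted(rows, key=lambda r: r[0], reverse=True)
    return [(age, -age, preferred) for age, preferred in ordered]


# Placeholder numbers for illustration only, NOT the dating-site data.
example = [(20, 21), (50, 30), (35, 25)]
rows = pyramid_rows(example)
limit, ticks = symmetric_axis([v for _, left, right in rows for v in (left, right)])
# Pyramid charts usually label both sides with positive numbers.
tick_labels = [abs(t) for t in ticks]
```

The rows and ticks can then be handed to whatever plotting tool you like, with a reference line drawn at each tick position.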
I also decided to make the colors a bit more meaningful/mnemonic ... blue for guys, and pink for girls. And one last enhancement that I think is very important - I added a descriptive title to the graph, so people would know what it represents, without having to read through the text of the article.
Now that we've analyzed the men's preferences, how about the women?
Hmm ... so men of almost every age rate women around 20 the highest, and the women rate men who are close to their own age the highest? Why the difference? What factors might be affecting this data?
I have a few theories, but first I invite you to share your theories in a comment!