Last week I showed a graph of the number of US births for each day in 2002. The graph shows a strong day-of-the-week effect, and it also shows that the number of births on a given day is affected by US holidays. This blog post looks more closely at the holiday effect. I actually conducted this analysis in 2009 for my book, but decided not to include it.

I want to identify days in 2002 that have fewer births than would be expected, given the day of the week. A box plot is often used for this sort of exploratory data analysis. The following statements use the VBOX statement in the SGPLOT procedure to create a box plot for each day of the week and to label the outliers for each day:

    proc sgplot data=birthdays2002;
       title "US Births by Day of Week (2002)";
       vbox Percentage / category=Day datalabel=Date;
       yaxis grid;
       xaxis display=(nolabel) discreteorder=data;
    run;

In the box plots, the outliers for each day of the week are labeled by using values of the `Date` variable. Each date belongs to one of the following categories: US holidays, days near holidays, and inauspicious days.
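If you prefer a table to reading labels off a graph, the low outliers can also be listed directly. The following statements are a minimal sketch, not part of the original analysis. They assume the same `birthdays2002` data set and the `Day`, `Date`, and `Percentage` variables used above, and they assume the usual box plot rule that an outlier lies more than 1.5 times the interquartile range beyond a quartile. The data set names `DayStats` and `LowOutliers` are made up for illustration.

```sas
/* Sketch: list the days whose births fall below the lower fence */
/* for their day of the week. DayStats and LowOutliers are       */
/* hypothetical names; the 1.5*IQR rule is an assumption.        */

/* Step 1: compute the quartiles of Percentage for each weekday */
proc means data=birthdays2002 noprint nway;
   class Day;
   var Percentage;
   output out=DayStats q1=Q1 q3=Q3;
run;

/* Step 2: keep observations below Q1 - 1.5*IQR for their weekday */
proc sql;
   create table LowOutliers as
   select b.Day, b.Date, b.Percentage
   from birthdays2002 as b, DayStats as s
   where b.Day = s.Day
     and b.Percentage < s.Q1 - 1.5*(s.Q3 - s.Q1)
   order by b.Date;
quit;

/* Step 3: print the low outliers in calendar order */
proc print data=LowOutliers noobs;
run;
```

Only the lower fence is checked because this post is about days with fewer births than expected. Changing the comparison to `b.Percentage > s.Q3 + 1.5*(s.Q3 - s.Q1)` would list unusually high days instead.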

### US holidays

Several US holidays in 2002 have fewer births than expected, given the day of the week:

- Tuesday, 01JAN (New Year's Day)
- Monday, 27MAY (Memorial Day)
- Thursday, 04JUL (Independence Day)
- Monday, 02SEP (Labor Day)
- Thursday, 28NOV (Thanksgiving Day)
- Wednesday, 25DEC (Christmas Day)

Christmas Day is the day on which the fewest babies were born.

Several "minor" holidays on Mondays also exhibit slightly smaller-than-expected births. These are not visible in the box plot graph, but can be seen in the time series graph: 21JAN (Birthday of Martin Luther King, Jr.), 18FEB (Washington's Birthday, sometimes known as "President's Day"), 14OCT (Columbus Day), and 11NOV (Veterans Day).

### Days near holidays

Families often travel on days near holidays, and that includes doctors and other hospital staff. Several of these days are visible as outliers in the birth data.

- Wednesday, 02JAN (day after New Year's Day)
- Friday, 29NOV (day after Thanksgiving Day)
- Tuesday, 24DEC (Christmas Eve)
- Thursday, 26DEC (day after Christmas Day)

Wednesday, 03JUL (the day prior to Independence Day), also exhibits smaller-than-expected births, as seen in the time series graph.

### Inauspicious days

The following dates are also outliers:

- Monday, 01APR (April Fool's Day)
- Thursday, 31OCT (Halloween Day)

Most parents don't want their child to be teased for being an "April Fool" all his life. It is less clear why a couple would avoid giving birth on Halloween. Superstition? Maybe. Or maybe doctors don't induce deliveries on Halloween so that they can be home for trick-or-treating?

These days might not be preferred for giving birth, but these are both blog-able holidays: I've written Halloween posts and April Fool posts.

Interestingly, for leap years, 29FEB also falls into the "inauspicious day" category. I guess parents avoid that date because the poor child would only get birthday parties every four years? Personally, I think it would be fun to be born on a leap day. And think how impressed people would be when I brag that I completed college before I celebrated my eighth birthday!

