Kentucky Derby: Secretariat was the best of the best!

The first Saturday in May means different things to different people. For me, growing up in New York with a thoroughbred horse-loving family, it meant mint juleps and women wearing hats that monopolized the airspace in the room where we all gathered to watch the “Run for the Roses,” also known as the Kentucky Derby.

The first race of the Triple Crown is the 1 ¼ mile Kentucky Derby. The second leg is the 1 3/16 mile Preakness Stakes, and the final leg is the grueling 1 ½ mile Belmont Stakes. It is very difficult for a 3-year-old thoroughbred to win the Triple Crown since there are so many contenders, coupled with the fact that the races occur over a short period of time (five weeks). Only 11 horses have accomplished this feat since the Kentucky Derby was first run in 1875. In fact, it has been 33 years since a horse, Affirmed, won the Triple Crown.

The first race in the series is The Kentucky Derby, which takes place at Churchill Downs in Louisville, Kentucky. Searching for the historic data in Wikipedia and using the Internet Open capability in JMP, I decided to visualize the yearly time performances from 1940 until 2011 using the new Control Chart Builder in JMP 10.

The Control Chart Builder allows me to drag and drop the time performance data and examine the data and the statistical control limits over this 71-year period. Here is the output for the Kentucky Derby data:

The Kentucky Derby results show that Secretariat's 1973 run came close to being a statistically out-of-control point. He set the record that year for the fastest Kentucky Derby ever, and that record still stands today, 38 years later!
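For readers without JMP, here is a minimal Python sketch of how limits on an individuals control chart are commonly computed. It assumes the conventional moving-range method (center line plus or minus 2.66 times the average moving range), which may not match the Control Chart Builder's settings exactly. The function names and the sample times are hypothetical and for illustration only; they are not the actual Derby results.

```python
from statistics import mean


def individuals_limits(values):
    """Return (center, lcl, ucl) for an individuals chart.

    Assumes the conventional moving-range method with a span of 2.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 is 3 / d2, where d2 = 1.128 for moving ranges of span 2.
    half_width = 2.66 * mr_bar
    return center, center - half_width, center + half_width


def out_of_control(values):
    """Return (index, value) pairs that fall outside the control limits."""
    _, lcl, ucl = individuals_limits(values)
    return [(i, v) for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical winning times in seconds (NOT real race data).
times = [122.0, 121.0, 122.0, 121.0, 122.0, 121.0, 122.0, 110.0, 122.0, 121.0]

center, lcl, ucl = individuals_limits(times)
print(f"center={center:.2f}  LCL={lcl:.2f}  UCL={ucl:.2f}")
print("out-of-control points:", out_of_control(times))
```

A point below the lower limit corresponds to an unusually fast time. Note that a single extreme value also widens the limits, because it inflates the two moving ranges on either side of it.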

The second race in the Triple Crown series is the Preakness Stakes, “The Run for the Black-Eyed Susans,” held two weeks later at Pimlico Race Course in Baltimore, Maryland. Again, I gathered the data from Wikipedia, read it into JMP using the Internet Open capability and then used the JMP 10 Control Chart Builder to visualize the historic times. It is interesting to note that in the 1973 Preakness, the Pimlico track timer malfunctioned, so Secretariat’s winning time for that race has been subject to much controversy. The time displayed below is the time credited by the Daily Racing Form clocker, and it indicates a new track record for that event as well. This record, however, is not officially recognized by the Maryland Jockey Club, which instead chose to split the difference between the race track’s malfunctioning timer and the Daily Racing Form clocker’s measurement.

The final chapter in the Triple Crown is the Belmont Stakes, which takes place at Belmont Park in Elmont, New York, on the first Saturday in June. This race of 1 ½ miles is the longest of all and is known as the “Test of the Champion” as well as the “Run for the Carnations.” Secretariat won that race in 1973 by 1/16th of a mile, or 31 lengths!

Secretariat’s winning time of 2:24 was an incredibly historic feat, as you can see in the control chart. This time, his performance for the Belmont Stakes was indeed a statistically out-of-control event, indicative of special cause variability, and he truly was a special athlete! It took the longest race of the Triple Crown series to finally show his statistical significance, and it was believed that his dominance in longer races was due to his large heart. An autopsy after Secretariat’s death supported this belief: his heart was found to be 2.75 times the size of a normal horse’s heart. This trait is linked to a genetic condition passed via the dam called the “x-factor.”

Enjoy this year’s Kentucky Derby. Perhaps once again we will witness the magic and athleticism that “Big Red” brought the first Saturday of May in 1973. Meanwhile, you can download my JMP files of the historical times for the Kentucky Derby, Preakness Stakes and the Belmont Stakes and try out your own analysis (free SAS login required for access to files).

tags: Control Charts, Data Visualization, JMP 10

7 Comments

  1. Donna Salter
    Posted April 12, 2012 at 4:16 pm | Permalink

    Very nice article Lou. However, you left out one statistic. Who is going to win this year??

    • Louis Valente
      Posted April 12, 2012 at 7:02 pm | Permalink

      Hi Donna...Very good question indeed! My record at successfully handicapping this race has been poor.

  2. Dave Gathmann
    Posted April 16, 2012 at 12:37 pm | Permalink

    Nice, both in content and how you were able to so easily compute it.
    ... now, how about some articles on the Yankees?!?!

  3. Paul F Gorman
    Posted April 23, 2012 at 9:33 pm | Permalink

    Lou -- Great post. Tells quite a story. If Warren Buffett is a six-sigma guy in investing and finance, then Secretariat is thoroughbred racing's representative. Though Secretariat might even be farther out, nine sigma more likely.

    There is one unique fact that you failed to mention, and it might be more telling of how great he truly was. And it is this: the Derby distance is 1 and 1/4 miles, there are five quarter miles to be run in the race -- Secretariat ran each quarter of his race faster than the one before. Race horses simply do not do this. The first quarter is fastest, the last slowest. Secretariat ran his last quarter mile in under 24 seconds. The first time past the stands, Secretariat was just warming up. The performance has never been duplicated, and like DiMaggio's streak, it probably never will.

    Since you mentioned the X-factor: Secretariat passed along his X chromosome to his daughters. These past ten years the three leading sires in the world are sons of daughters of Secretariat. In this year's Derby, the winner is likely to have this link to Secretariat.

    Thanks for an interesting post.

  4. Donna Salter
    Posted May 4, 2012 at 1:52 pm | Permalink

    Let's have some stats on the placement of the Great Mariano Rivera's cutters.....

  5. Posted May 8, 2012 at 4:41 pm | Permalink

    What do the colors specify? Interesting how the Preakness times steadily decrease from 1939 to about 1970. Any known theories as to why? It didn't happen with the other races. I would recalculate the Preakness chart using just the data after 1970.

    • Louis Valente
      Posted May 8, 2012 at 5:03 pm | Permalink

      Andrew,
      Thanks for your feedback and questions.
      The color scheme chosen for these charts is a spectral scheme based on time (ROYGBIV), where Red = slow and Violet = fast. I noticed the pattern you mentioned as well but have not come across any theories as to why it shows up only in the Preakness. The time period analyzed was chosen to cover the longest stretch of years in which the race was run at the same distance. You can download the file from the file exchange and easily redo the control chart with just that subset of data if you like.
