Calculate your survival chance on the Titanic

On the 10th of April, 1912, the RMS Titanic set out on its maiden voyage across the Atlantic Ocean carrying 2,223 people. Late on the 14th of April, it struck an iceberg, and it sank in the early hours of the 15th. There were 1,517 fatalities. Identifying information was not available for all passengers.

The Titanic data set describes the survival status of 1 309 individual passengers on the Titanic. Besides the survival status (0 = No, 1 = Yes), the data set contains the age of 1 046 of those passengers, their names, their gender, the class they travelled in (first, second or third) and the fare they paid for their ticket in pre-1970 British pounds.

62% of the passengers in the data set did not survive the sinking.
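
As a quick check on the figures quoted above, this short Python sketch computes the fatality rate for the whole ship and the number of passengers in the data set without a recorded age. The variable names are illustrative; the numbers are the ones stated in the text.

```python
# Figures quoted in the text above
on_board = 2223          # people aboard for the maiden voyage
fatalities = 1517        # deaths reported for the whole ship
in_data_set = 1309       # passengers described in the data set
with_known_age = 1046    # passengers whose age is recorded

# Share of everyone aboard who died
ship_fatality_rate = fatalities / on_board

# Passengers in the data set whose age is missing
missing_age = in_data_set - with_known_age
missing_age_share = missing_age / in_data_set

print(f"Fatality rate for the whole ship: {ship_fatality_rate:.1%}")
print(f"Passengers without a recorded age: {missing_age} ({missing_age_share:.1%})")
```

The whole-ship rate comes out a little higher than the 62% seen in the data set, which covers only part of the people on board.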

You can learn more about the survival rates by building a decision tree on this data set in JMP. How?

  1. Select Analyze > Modeling > Partition.
  2. Select survival -> Y, Response.
  3. Select Age, Gender, Class and Fare -> X, Factor.
  4. Select OK.
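
To give a feel for what a partition platform does when it splits the data, here is a minimal Python sketch of choosing a first split. It is not JMP's algorithm: it uses Gini impurity, a common textbook criterion, swapped in for illustration. The rows are invented toy data, not records from the Titanic data set, and the names `gini` and `best_split` are hypothetical helpers.

```python
from collections import Counter

# Toy rows INVENTED for illustration only: (gender, travel_class, survived)
ROWS = [
    ("female", "first", 1), ("female", "second", 1), ("female", "third", 1),
    ("female", "third", 0), ("male", "first", 1), ("male", "first", 0),
    ("male", "second", 0), ("male", "third", 0), ("male", "third", 0),
    ("male", "third", 0),
]
FEATURES = {"gender": 0, "class": 1}  # column index of each candidate factor
TARGET = 2                            # column index of the survival flag


def gini(rows):
    """Gini impurity of the survival labels in rows (0 means a pure group)."""
    if not rows:
        return 0.0
    counts = Counter(row[TARGET] for row in rows)
    return 1.0 - sum((n / len(rows)) ** 2 for n in counts.values())


def best_split(rows):
    """Return (feature, level, weighted_gini) for the best one-vs-rest split."""
    best = None
    for name, col in FEATURES.items():
        for level in sorted({row[col] for row in rows}):
            left = [r for r in rows if r[col] == level]
            right = [r for r in rows if r[col] != level]
            if not left or not right:
                continue  # a split must produce two non-empty groups
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (name, level, score)
    return best


feature, level, score = best_split(ROWS)
print(f"Best first split: {feature} == {level!r} (weighted Gini {score:.3f})")
```

A full tree repeats the same search inside each resulting group until a stopping rule is met.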


What do we learn from the decision tree?

At every age, women in every class survived at a higher rate than men in every class. Boys under the age of 10 had a better chance of surviving than older males. This is consistent with priority having been given to women and children during the rescue.

What would be your chance of survival on the Titanic?

First save the prediction formula: click the red triangle next to Partition for Survival and select Save Columns -> Save Prediction Formula.

You can then calculate your survival chance by entering your gender and age in an empty row. Your predicted chance of survival appears in the Survival Tolerant Predictor column.
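
Conceptually, a saved prediction formula assigns a new row to a leaf of the tree and reports the share of survivors in that leaf. The Python sketch below illustrates the idea under loud assumptions: the rows are invented toy data, the two-level tree (gender, then an age cut at 10 for males) is a hypothetical simplification inspired by the findings above, and `leaf` and `survival_chance` are made-up helper names. It does not reproduce the formula JMP saves.

```python
# Toy rows INVENTED for illustration only: (gender, age, survived)
ROWS = [
    ("female", 29, 1), ("female", 8, 1), ("female", 45, 1), ("female", 31, 0),
    ("male", 6, 1), ("male", 9, 0), ("male", 24, 0), ("male", 38, 0),
    ("male", 52, 1), ("male", 61, 0),
]


def leaf(gender, age):
    """Assign a passenger to a leaf of a hypothetical two-level tree."""
    if gender == "female":
        return "female"
    return "male, under 10" if age < 10 else "male, 10 or older"


def survival_chance(gender, age, rows=ROWS):
    """Share of survivors among the rows that fall in the same leaf."""
    target = leaf(gender, age)
    outcomes = [s for g, a, s in rows if leaf(g, a) == target]
    return sum(outcomes) / len(outcomes)


# "Entering your gender and age in an empty row"
p = survival_chance("male", 35)
odds = p / (1 - p)  # odds are p / (1 - p), not the same thing as the chance p
print(f"Leaf: {leaf('male', 35)}, chance {p:.2f}, odds {odds:.2f} to 1")
```

Note the difference between a chance (a probability between 0 and 1) and odds (the ratio of surviving to not surviving); the predictor column reports the former.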

Want to learn more?

Consider one of these SAS training courses: SAS Enterprise Guide: ANOVA, Regression and Logistic Regression; JMP Software: ANOVA and Regression; and Applied Analytics using SAS Enterprise Miner.

tags: data mining, JMP


  1. Brooke Fortson
    Posted April 13, 2012 at 9:33 am

    Mark Bailey, JMP course developer and instructor, also discusses Titanic survival data in this Q&A on the new course JMP Software: Introduction to Categorical Data Analysis.

  2. Robbie Windelen
    Posted April 17, 2012 at 4:08 pm

    I believe the chance of survival should be much higher, as I assume the data set does not account for people who stayed on the ship of their own free will. If you want to calculate the chance of survival, then you should not count the people who died on the ship despite having had the opportunity to save themselves.

    If you did manage to exclude these, I'm very eager to know how.

    Best Regards.

    • Nele Coghe
      Posted May 3, 2012 at 4:25 am

      Hi Robbie,
      Interesting point of view, but I am not sure how to obtain data on the people who stayed on the ship of their own free will.
      Kind regards
