On the 10th of April, 1912, the RMS Titanic set out on its maiden voyage across the Atlantic Ocean carrying 2,223 people. Late on the 14th of April, it hit an iceberg, and it sank in the early hours of the 15th. There were 1,517 fatalities. Identifying information was not available for all passengers.

The Titanic data set describes the survival status of 1 309 individual passengers on the Titanic. Besides the survival status (0 = No, 1 = Yes), the data set contains the age of 1 046 of those passengers, their names, their gender, the class they travelled in (first, second or third) and the fare they paid for their ticket in pre-1970 British pounds.

62% of the passengers in the data set did not survive the sinking.
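As a quick sanity check, the ship-wide fatality rate implied by the figures in the opening paragraph (2,223 on board, 1,517 fatalities) can be set next to the 62% in the data set. This is a minimal sketch in Python rather than JMP, and the variable names are illustrative only.

```python
# Figures quoted in the opening paragraph of this post.
on_board = 2223
fatalities = 1517

# Share of everyone on board who did not survive.
ship_wide_rate = fatalities / on_board
print(f"Ship-wide fatality rate: {ship_wide_rate:.1%}")  # prints 68.2%
```

The two rates are not expected to match, because the data set covers 1 309 individuals rather than everyone on board.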

You can learn more about the survival rates by building a decision tree on this data set in JMP. How?

1. Select Analyze > Modeling > Partition.
2. Select survival -> Y, Response.
3. Select Age, Gender, Class and Fare -> X, Factor.
4. Select OK.
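To get a feel for what a partition platform does when it grows a tree, here is a rough sketch in Python of how a single split could be chosen. It is not JMP's algorithm: JMP applies its own splitting criterion, whereas this sketch swaps in Gini impurity for simplicity. The rows are invented toy data, not the real Titanic data set, and all function names are hypothetical.

```python
# Toy rows, invented for illustration only (NOT the real Titanic data).
# Each row: (gender, age, survived) with survived 1 = yes, 0 = no.
ROWS = [
    ("female", 29, 1),
    ("female", 2, 1),
    ("female", 45, 1),
    ("female", 38, 1),
    ("male", 30, 0),
    ("male", 8, 1),
    ("male", 52, 0),
    ("male", 41, 0),
]


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 means a pure group)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def split_score(left, right):
    """Size-weighted Gini impurity after a split; lower is better."""
    n = len(left) + len(right)
    return (len(left) * gini(left) + len(right) * gini(right)) / n


def candidate_splits(rows):
    """Yield (description, left_labels, right_labels) for each candidate."""
    # One categorical candidate: split on gender.
    yield (
        "gender == female",
        [s for g, a, s in rows if g == "female"],
        [s for g, a, s in rows if g != "female"],
    )
    # Numeric candidates: cut the ages at each observed value.
    for cut in sorted({a for g, a, s in rows}):
        left = [s for g, a, s in rows if a < cut]
        right = [s for g, a, s in rows if a >= cut]
        if left and right:  # skip cuts that leave one side empty
            yield (f"age < {cut}", left, right)


def best_split(rows):
    """Return the candidate split with the lowest weighted impurity."""
    return min(candidate_splits(rows), key=lambda c: split_score(c[1], c[2]))


desc, left, right = best_split(ROWS)
print(desc, sum(left) / len(left), sum(right) / len(right))
```

On these toy rows the gender split scores best. A full tree repeats the same search inside each resulting group, which is roughly what happens each time you split in the Partition platform.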

What do we learn from the decision tree?

At every age, women in every class survived at a higher rate than men in any class. Boys under the age of 10 had better odds of surviving than older men. This is consistent with priority having been given to women and children during the rescue.

What would be your chance of survival on the Titanic?

To find out, first save the prediction formula: click the red triangle next to Partition for Survival and select Save Columns > Save Prediction Formula.

You can then calculate your survival chance by entering your gender and age in an empty row. Your odds appear in the Survival Tolerant Predictor column.

Consider one of these SAS training courses: SAS Enterprise Guide: ANOVA, Regression and Logistic Regression; JMP Software: ANOVA and Regression; and Applied Analytics Using SAS Enterprise Miner.

Systems Engineer

Nele is an experienced SAS user and joined SAS Belgium as an instructor in 2011. Nele is now part of the team that helps customers see the value in analytics for their business. She likes to use analytics to cover world events.

1. Robbie Windelen on

I believe the chance of survival should be much higher, as I assume the data set does not account for people who stayed on the ship of their own free will. If you want to calculate the chance of survival, then you should not count the people who died on the ship despite having had the opportunity to save themselves.

If you did manage to exclude these, I'm very eager to know how.

Best Regards.

• Nele Coghe on

Hi Robbie,
Interesting point of view, but I am not sure how to obtain data on the people who stayed on the ship of their own free will.
Kind regards
Nele