As a fan of theme parks, I have always been fascinated by the operations behind ride wait times: how they’re calculated, how customers react to them, and how operators make decisions based on them. When you visit a theme park, there are usually wait time boards listing all the rides, giving customers an idea of which rides will take the longest to get on. This influences their decisions for the day, including where they start and where they end up, since customers generally want the shortest possible total wait for the rides on their list. Wait times also indirectly influence where customers eat lunch (based on where they are in the park at that time), which merchandise stores they encounter, and what park entertainment they run into. Park operators can use wait times to steer customers toward particular outcomes and to see where the customer experience needs improvement, as nobody wants to stand in line all day…

Source: Getty Images

Within SAS Viya, I created a simulated theme park with attractions of different thrill levels, which resulted in different wait times during the day. “Customers” within the data have attributes that determine whether they will enter the queue for a ride, depending on their personal profile and what they value in a day at the park. A live interactive dashboard in SAS Viya lets operators make decisions based on what they see in the data and compare it to historical data for the same attraction. This can help operators determine when areas need more staffing or where bottlenecks need to be addressed. I accomplished this using SAS Viya, open source tools, and ESRI (Figure 1).

Figure 1: An operational view of the park and its rides, visualized on SAS Viya

Simulating the park

To begin, I needed data. The data had to be created from scratch and mimic a typical day at a theme park. To do this, I wrote a Python script that runs every 5 minutes, creating a new CSV file each time, each with its own unique timestamp. I would run this all day, from the park’s “opening” to its “closing” (about 8:30am to 9pm). This creates historical data for the project, which I append into one file to make a single dataset.
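As a rough sketch of what one run of such a script could look like (this is not the project's actual code), the snippet below writes a single snapshot to a timestamped CSV file. The file name pattern, the column names, and the helper names `park_is_open`, `snapshot_filename`, and `write_snapshot` are all assumptions made for illustration.

```python
import csv
from datetime import datetime, time
from pathlib import Path

# Approximate operating hours taken from the text (8:30am to 9pm).
OPENING = time(8, 30)
CLOSING = time(21, 0)


def park_is_open(now: datetime) -> bool:
    """Return True when a snapshot should be generated at this moment."""
    return OPENING <= now.time() <= CLOSING


def snapshot_filename(now: datetime) -> str:
    """Build a unique, sortable file name from the snapshot time (assumed pattern)."""
    return f"wait_times_{now:%Y%m%d_%H%M%S}.csv"


def write_snapshot(rows, out_dir, now: datetime) -> Path:
    """Write one snapshot to its own timestamped CSV file.

    `rows` is a list of dicts with the (assumed) keys "ride" and "wait_minutes".
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / snapshot_filename(now)
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["timestamp", "ride", "wait_minutes"])
        writer.writeheader()
        for row in rows:
            writer.writerow({"timestamp": now.isoformat(), **row})
    return path
```

A scheduler would call `write_snapshot` every 5 minutes while `park_is_open` returns True. Because the file names sort chronologically, the snapshots are easy to append in order later.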

For the rides, I created a list of 47 attractions (rides or shows) and split them between two parks at my imaginary vacation resort. Each of these attractions has a thrill level (out of 10) associated with it (Figure 2). A simulation of a huge rollercoaster that offers guests a whirlwind of thrills would be in the 8-10 range. A 3D movie experience or stage sing-along would be a 1 or 2. Rides slightly less thrilling than the huge rollercoaster would be in the 6-7 range, and child-friendly rides that are more dynamic than the shows would be given a thrill level of 3-5. These thrill levels determine each ride's wait time. For this project, the higher the thrill value, the longer the wait.

Figure 2: List of rides for the park, Ride Name is column C, each with a corresponding Park, Resort, and Land. Thrill Level associated with each ride is column B
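To make the thrill scale concrete, here is a small sketch of how one row of the ride list and the thrill ranges described above could be represented. The class name `Attraction`, the function `thrill_band`, and the band labels are hypothetical; only the numeric ranges come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attraction:
    """One row of the ride list: name, park, land, and thrill level."""
    name: str
    park: str
    land: str
    thrill_level: int  # 1 (calm show) to 10 (most intense rollercoaster)


def thrill_band(level: int) -> str:
    """Map a thrill level to the range described in the text (labels are assumed)."""
    if not 1 <= level <= 10:
        raise ValueError(f"thrill level must be between 1 and 10, got {level}")
    if level <= 2:
        return "show"                  # 3D movie, stage sing-along
    if level <= 5:
        return "child-friendly ride"   # more dynamic than a show
    if level <= 7:
        return "moderate thrill ride"  # slightly below the big coasters
    return "major thrill ride"         # huge rollercoasters


# Hypothetical example row; the land name and thrill level are made up.
example = Attraction("AfterBurner", "World's Faire", "Example Land", 9)
print(example.name, "->", thrill_band(example.thrill_level))
```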

Wait times are also affected by the time and date. To factor these in, I used Python’s datetime library. I had to code in some dates as well, like American Thanksgiving, which is the 4th Thursday of November and typically a very busy tourist holiday (Figure 3). Other dates were added for typical high and low travel seasons. The high season is June, July, and August, usually the busiest time of year for theme parks because it is warm and school is out. The weeks surrounding Thanksgiving, Christmas, and the 4th of July are also treated as busy, since people get more time off around a statutory holiday. The low season is November, January, and February, as people are in school or at work and the temperature is lower.

Figure 3: Coding in the specific days of the holidays. Thanksgiving and MLK Day fall on a specific weekday of the month rather than a fixed calendar date (like Christmas or the 4th of July), so I had to accommodate that difference.
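The weekday-based holidays can be computed with the standard datetime module. The sketch below is one possible way to do it, not a copy of the code in Figure 3. The helper names and the 3-day radius used for a "holiday week" are assumptions, while the season months come from the text.

```python
from datetime import date, timedelta

HIGH_SEASON_MONTHS = {6, 7, 8}   # June, July, August
LOW_SEASON_MONTHS = {11, 1, 2}   # November, January, February


def nth_weekday(year: int, month: int, weekday: int, n: int) -> date:
    """Return the n-th occurrence of a weekday (Monday=0 ... Sunday=6) in a month."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7  # days until the first such weekday
    return first + timedelta(days=offset + 7 * (n - 1))


def thanksgiving(year: int) -> date:
    """American Thanksgiving: 4th Thursday of November."""
    return nth_weekday(year, 11, 3, 4)


def mlk_day(year: int) -> date:
    """Martin Luther King Jr. Day: 3rd Monday of January."""
    return nth_weekday(year, 1, 0, 3)


def is_holiday_week(day: date, holiday: date, radius_days: int = 3) -> bool:
    """True when `day` is within an assumed +/- 3 days of the holiday."""
    return abs((day - holiday).days) <= radius_days


def season(day: date) -> str:
    """Classify a date as high, low, or regular season by month."""
    if day.month in HIGH_SEASON_MONTHS:
        return "high"
    if day.month in LOW_SEASON_MONTHS:
        return "low"
    return "regular"
```

Fixed-date holidays such as Christmas need no calculation, since `date(year, 12, 25)` is enough.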

Time of day was factored in by creating park events such as firework shows, parades, and mealtimes. Fireworks are scheduled at 8pm and parades at 5pm, both of which lead to shorter wait times because people want to watch the event. Mealtimes were scheduled around 11am-1pm and 5-7pm to have the same effect, as people are eating instead of riding (Figure 4).

Figure 4: This code shows that from 11-1 the Time Factor is decreased due to people eating lunch. It also shows people leaving the rides to watch the parade from 4:30 to 5:30, since they would go early to a parade to get a good spot.
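A time-of-day adjustment like the one in Figure 4 could be sketched as a lookup over event windows. The window times follow the text, but the minute values, the one-hour fireworks window, and the rule for overlapping events (take the largest reduction) are illustrative assumptions rather than the project's real numbers.

```python
from datetime import time

# (start, end, adjustment in minutes, label). Adjustment values are assumptions.
EVENT_WINDOWS = [
    (time(11, 0), time(13, 0), -10, "lunch"),
    (time(16, 30), time(17, 30), -15, "parade"),     # guests arrive early for a spot
    (time(17, 0), time(19, 0), -10, "dinner"),
    (time(20, 0), time(21, 0), -20, "fireworks"),    # assumed to run until closing
]


def time_factor(t: time) -> int:
    """Return the wait time adjustment (in minutes) for a time of day.

    When windows overlap (parade and dinner), the largest reduction wins.
    Outside every window the adjustment is 0.
    """
    adjustments = [
        minutes for start, end, minutes, _label in EVENT_WINDOWS if start <= t < end
    ]
    return min(adjustments, default=0)
```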

Each ride’s thrill level was combined with a numerical adjustment for the time of day and the day of the year, and these values were fed into an algorithm that generated the ride’s wait time, which is then listed in the data. The only other factor was occasional downtime. No ride is perfect and mechanical systems break, so I included ride stoppages in the data.
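The exact algorithm is not shown in this post, so the following is only a sketch of how the pieces described above might fit together. The 8 minutes per thrill point, the random noise, the rounding to 5-minute steps, and the 2% downtime chance are all assumptions. A wait of 0 stands for a ride that is down.

```python
import random


def simulate_wait(thrill_level, time_adjustment, date_adjustment, rng,
                  downtime_probability=0.02):
    """Return a simulated wait in minutes, or 0 when the ride is down.

    thrill_level:     1-10, higher means a longer base wait
    time_adjustment:  minutes added or removed for the time of day
    date_adjustment:  minutes added or removed for the day of the year
    rng:              a random.Random instance, passed in for reproducibility
    """
    if rng.random() < downtime_probability:
        return 0  # ride stoppage, reported as a zero wait

    base = 8 * thrill_level            # assumed: 8 minutes per thrill point
    noise = rng.randint(-5, 5)         # assumed: small random variation
    wait = base + time_adjustment + date_adjustment + noise

    # Round to 5-minute steps like a wait time board, with a 5-minute floor
    # so that 0 always means downtime.
    return max(5, 5 * round(wait / 5))


rng = random.Random(42)
print(simulate_wait(9, time_adjustment=-10, date_adjustment=15, rng=rng))
```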

At this point, our simulation code is finished and we can schedule a job in SAS Viya’s Environment Manager, run it every 5 minutes, and combine all the generated files into one historic dataset. From here we can put this data into memory for visualization.
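Combining the generated files can be done with the standard library alone. This sketch assumes the snapshots share the same columns and follow a hypothetical `wait_times_*.csv` naming pattern. The function name `combine_snapshots` is also made up for illustration.

```python
import csv
from pathlib import Path


def combine_snapshots(snapshot_dir, combined_path, pattern="wait_times_*.csv"):
    """Append every snapshot CSV in a folder into one file with a single header.

    Returns the number of data rows written.
    """
    files = sorted(Path(snapshot_dir).glob(pattern))  # name order = time order
    rows_written = 0
    with open(combined_path, "w", newline="") as out:
        writer = None
        for path in files:
            with path.open(newline="") as f:
                reader = csv.DictReader(f)
                if reader.fieldnames is None:
                    continue  # skip empty files
                if writer is None:
                    # Use the first file's header for the combined dataset.
                    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
                    writer.writeheader()
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Give the combined file a name that does not match the snapshot pattern, so that a later run does not read it back in as a snapshot.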

Customer Attributes

With the historical data created, we can review different scenarios. The first is whether someone will go on a ride given its wait time and the time of day. To run this, I made customer caricatures. For example, the Thompson family from a faraway state is here to get on as many rides as possible, since they love thrills and don’t visit often. They’re more likely to put up with a longer line. Cathleen, on the other hand, lives near the park, has a season pass, and visits weekly, so she is less willing to wait for a ride she has experienced countless times (Figure 6).

Figure 6: Customer attributes are listed. The description is for each customer caricature. RiderType ranges from 0 to 2 depending on how many thrills they want. The season pass, fireworks, parade, and foodie columns flag those attributes. Max intensity has a negative effect if a ride’s thrill level is greater than it. Patience is also factored in as a willingness to wait in line. Favorite ride increases a person’s willingness to wait.

Other customers prefer watching the fireworks, so during that time (8pm) they will not go on a ride unless it fits in with their goals. There are also people who come just for food and during mealtimes, they want that time to themselves.
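The decision rule itself is not spelled out here, so the sketch below shows just one plausible way to turn the attributes above into a ride or skip decision. The field names mirror the Figure 6 columns, but every weight (the 10 minutes per rider type point and the 0.75, 1.5, and 0.5 multipliers) is an assumption. So is the reading of "fits in with their goals" as "is their favorite ride".

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class Customer:
    name: str
    rider_type: int      # 0-2, how many thrills they want
    season_pass: bool
    fireworks: bool      # prefers watching the fireworks
    parade: bool         # prefers watching the parade
    foodie: bool         # keeps mealtimes free for food
    max_intensity: int   # highest thrill level they enjoy
    patience: int        # baseline minutes they will wait (assumed unit)
    favorite_ride: str


def _between(t, start, end):
    return start <= t < end


def will_ride(c, ride, thrill_level, wait_minutes, t):
    """Return True if the customer would join the queue (illustrative rule)."""
    busy = (
        (c.fireworks and _between(t, time(20, 0), time(21, 0)))
        or (c.parade and _between(t, time(16, 30), time(17, 30)))
        or (c.foodie and (_between(t, time(11, 0), time(13, 0))
                          or _between(t, time(17, 0), time(19, 0))))
    )
    if busy and ride != c.favorite_ride:
        return False  # they would rather watch the event or eat

    tolerated = c.patience + 10 * c.rider_type
    if c.season_pass:
        tolerated *= 0.75   # frequent visitors are less willing to wait
    if ride == c.favorite_ride:
        tolerated *= 1.5    # a favorite ride is worth a longer wait
    if thrill_level > c.max_intensity:
        tolerated *= 0.5    # too intense for this customer
    return wait_minutes <= tolerated
```

With made-up values for the two caricatures, a patient thrill-seeking family would accept an 85-minute wait for its favorite ride, while a season pass holder with low patience would skip it.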

Figure 7: A view of the data at 1:30pm on October 30th, showing an 85-minute wait time for the ride “AfterBurner”. The customers in green will wait in that line; the customers in red will not.

In our SAS Visual Analytics dashboard (Figure 7), we can visualize if a customer will ride a ride when confronted with the wait time. This gives us a chance to analyze how we can improve the experience of those who will pass on a ride they may really want to go on.

Operational Dashboard

Now that we have all our data, we can create a dashboard filled with information on the status of the park. In the wait time window (Figure 8), the historical data lets us find the best time to go on an attraction.

Figure 8: The wait time time series for the ride “Ice Breaker”. The green line is one standard deviation below the mean, showing a preferable wait time to ride. The red line is one standard deviation above the mean, showing a less preferable wait time. When the graph hits 0 in the middle of the day, the ride is down and needs to be repaired.
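The two reference lines in Figure 8 can be reproduced from the historical waits with the standard statistics module. In this sketch, leaving zero waits (downtime) out of the calculation and using the population standard deviation are both assumptions. The function names are hypothetical.

```python
from statistics import mean, pstdev


def wait_bands(history):
    """Return the mean and the one-standard-deviation lines for a ride's waits."""
    running = [w for w in history if w > 0]  # assumed: ignore downtime readings
    if not running:
        raise ValueError("no non-zero wait times in history")
    avg = mean(running)
    sd = pstdev(running)
    return {"preferable": avg - sd, "mean": avg, "less_preferable": avg + sd}


def classify_wait(wait, bands):
    """Label a current wait against the historical bands."""
    if wait == 0:
        return "down"
    if wait <= bands["preferable"]:
        return "preferable"
    if wait >= bands["less_preferable"]:
        return "less preferable"
    return "typical"


bands = wait_bands([30, 40, 50, 0, 60, 70])
print(bands, classify_wait(30, bands))
```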

We also have a geographic representation of the park. Mapped over Toronto, this map was created in ESRI’s ArcGIS Pro. We can build shapefiles in ArcGIS and upload them directly into SAS Visual Analytics. Using the geographic data provider, we can define a custom space for the created shapefiles. These shapefiles are then tied to our data through the ID columns and can be visualized.

Each ride is contained within a “Land” (an area that shares a common theme with nearby rides) and each land is within a “Park”. This is visualized by SAS’ geo-hierarchy datatype. With this map we can see the wait times and where people are congested (Figures 9, 10, and 11). This allows the operational team to make decisions to lead people out of that area. For example, they could artificially lower the wait time on an attraction away from that area to draw guests there or create a promotion at a food stand to funnel guests in that direction.
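Outside of SAS Visual Analytics, the same ride, land, and park roll-up can be illustrated in a few lines of Python. This is only a sketch of the idea, not how the geo-hierarchy is implemented. The land names and wait values in the sample rows are invented, and the park assignments are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_wait_by_level(rows, level):
    """Average the current wait per group at one hierarchy level ("park" or "land")."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[level]].append(row["wait_minutes"])
    return {key: mean(values) for key, values in groups.items()}


# Invented sample rows; land names and waits are placeholders.
rows = [
    {"park": "World's Faire", "land": "Land A", "ride": "AfterBurner", "wait_minutes": 85},
    {"park": "World's Faire", "land": "Land A", "ride": "Ride X", "wait_minutes": 35},
    {"park": "World's Faire", "land": "Land B", "ride": "Ride Y", "wait_minutes": 20},
    {"park": "Adventure Isle", "land": "Land C", "ride": "Ice Breaker", "wait_minutes": 40},
]

print(average_wait_by_level(rows, "land"))
print(average_wait_by_level(rows, "park"))
```

A land whose average is far above its neighbors is a candidate for the kind of intervention described above.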

Figure 9: The geo-region of each of the two parks “World’s Faire” and “Adventure Isle”. These are mapped over the city of Toronto
Figure 10: The geo-region of the lands within “World’s Faire”
Figure 11: The geo-region of each ride, colored to reflect the wait time at that moment.

This map is also kept up to date by running the Python data generation script in the background, giving us a “current” view of the park. That script would use the SAS SWAT package to upload each new snapshot directly to the in-memory database.
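A minimal sketch of that upload step might look like the following, assuming an already open SWAT connection and, to the best of my understanding of the SWAT API, its `upload_file` method with a `casout` option. The table name, host, port, and credentials are placeholders, and `upload_snapshot` is a hypothetical helper.

```python
def upload_snapshot(conn, csv_path, table_name="park_wait_times"):
    """Load the newest snapshot into an in-memory table, replacing the old copy.

    `conn` is expected to be a swat.CAS connection. The table name is an
    assumption made for this example.
    """
    return conn.upload_file(csv_path, casout={"name": table_name, "replace": True})


# Hypothetical usage; host, port, and credentials are placeholders:
#
#   import swat
#   conn = swat.CAS("cas.example.com", 5570, "username", "password")
#   upload_snapshot(conn, "wait_times_latest.csv")
#   conn.close()
```

Replacing the table on every run keeps the dashboard's "current" view limited to the latest snapshot, while the combined historical dataset stays in its own table.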

From here, the operators can run the wait times against one of SAS’ Machine Learning algorithms to predict wait times for future dates, allowing the operations team to make decisions ahead of time.

In Summary


This project combined SAS Viya with Python code, giving us an operational view of an active theme park. We can combine large volumes of CSV files into a single dataset and upload it directly into the environment. Customer attributes can be considered and visualized, giving us an opportunity to hyper-personalize a customer’s day. Lastly, we can use SAS Visual Analytics dashboards to visualize the operation and make business decisions based on it. Open source tools and SAS Viya can be used together to create custom dashboards and data queries, allowing users to make live business decisions based on the results.


About Author

Danny Sprukulis

Senior Associate Systems Engineer

Danny Sprukulis is a Senior Associate Systems Engineer who has been working at SAS since 2020. At SAS, Danny has been working with SAS Viya, SAS Visual Analytics and Machine Learning, with a focus on Asset Management, Geospatial and Marketing Analytics data. Danny primarily works with data but graduated with an MBA from the Rotman School of Management at the University of Toronto.
