Every day, millions of New Yorkers rely on the MTA subway – a system so vast and complex that even small delays can ripple across the city. Trains arrive, depart and stall in an intricate rhythm that’s constantly shifting.

Capturing that rhythm in real time and translating it into something humans can see and understand is a challenge tailor-made for analytics.

During a summer internship with SAS, that challenge inspired me to build a real-time digital twin of the New York City MTA Subway system using SAS tools.

The goal was to explore how SAS technologies could turn raw, live transit data into a dynamic, interactive dashboard that tracks trains in motion and surfaces key performance insights in real time.

Defining the subway system

Creating the digital twin began with mapping the subway system’s physical world into a digital model. Each train, station and track was defined as an asset with attributes such as location, status and route. Hierarchies connected these assets to reflect how they operate together on each subway line.
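As a rough illustration of the asset-and-hierarchy idea, here is a minimal Python sketch. It is not the project's actual model: the `Asset` class, the `children_of` helper and every ID and coordinate below are hypothetical placeholders, chosen only to show how attributes and parent links could fit together.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Asset:
    """One node in the twin: a line, station, track or train (hypothetical schema)."""
    asset_id: str
    asset_type: str                      # "line", "station", "track" or "train"
    attributes: dict = field(default_factory=dict)
    parent_id: Optional[str] = None      # link to the owning asset, if any


def children_of(assets: list, parent_id: str) -> list:
    """Return the assets directly beneath the given parent in the hierarchy."""
    return [a for a in assets if a.parent_id == parent_id]


# Placeholder data: IDs and coordinates are invented for illustration.
assets = [
    Asset("line_A", "line", {"route": "A"}),
    Asset("station_A01", "station", {"lat": 40.0, "lon": -73.0}, parent_id="line_A"),
    Asset("station_A02", "station", {"lat": 40.01, "lon": -73.0}, parent_id="line_A"),
    Asset("track_A01_A02", "track",
          {"from": "station_A01", "to": "station_A02"}, parent_id="line_A"),
    Asset("train_001", "train",
          {"status": "IN_TRANSIT", "route": "A"}, parent_id="track_A01_A02"),
]

for asset in children_of(assets, "line_A"):
    print(asset.asset_type, asset.asset_id)
```

Keeping the hierarchy as a simple parent link means a train can be "moved" from a track to a station by changing one field, which is one plausible way to reflect how assets operate together on a line.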

Static information came from the NYCT-GTFS Python library, which provided key details such as stop times, station IDs and route structures. Code developed in SAS® Viya® Workbench and SAS® Studio transformed this text information into JSON configuration files. SAS AutoMLForIoT then converted these definitions into a working SAS® Event Stream Processing (ESP) project – effectively bringing the subway’s digital skeleton to life.
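To give a feel for the text-to-JSON step, the sketch below reads GTFS-style stop records and writes a JSON station definition. It is written in Python for illustration rather than in SAS, and it makes two loud assumptions: the sample rows are invented placeholders, and the output layout (`assetType`, `assets`, `location`) is hypothetical and is not the schema that SAS AutoMLForIoT expects. Only the column names (`stop_id`, `stop_name`, `stop_lat`, `stop_lon`, `parent_station`) come from the standard GTFS `stops.txt` format.

```python
import csv
import io
import json

# A tiny stand-in for a GTFS stops.txt file; the rows are placeholders, not real MTA data.
SAMPLE_STOPS = """stop_id,stop_name,stop_lat,stop_lon,parent_station
S01,Example Station One,40.0000,-73.0000,
S01N,Example Station One,40.0000,-73.0000,S01
S02,Example Station Two,40.0100,-73.0000,
"""


def stops_to_config(stops_csv: str) -> dict:
    """Build a station config from GTFS stops text, keeping only parent stations."""
    stations = []
    for row in csv.DictReader(io.StringIO(stops_csv)):
        if row["parent_station"]:
            continue  # skip platform-level child stops
        stations.append({
            "id": row["stop_id"],
            "name": row["stop_name"],
            "location": {"lat": float(row["stop_lat"]), "lon": float(row["stop_lon"])},
        })
    # Hypothetical layout, used here only to show the shape of a config file.
    return {"assetType": "station", "assets": stations}


config = stops_to_config(SAMPLE_STOPS)
config_json = json.dumps(config, indent=2)
print(config_json)
```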

Powered by live data

With the system defined, the next step was to connect it to live MTA train status updates, refreshed every 30 seconds. A Python script helped track individual trains, detect when their status changed and determine whether each one was in transit, stopped, delayed or at the end of its route.
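The change-detection logic can be sketched along the following lines. This is an illustration under stated assumptions, not the script used in the project: the record fields (`at_stop`, `stop_id`, `last_stop_id`, `seconds_behind`) are hypothetical, and the 300-second delay cutoff is an invented value, since the definition of "delayed" is not given here.

```python
DELAY_THRESHOLD_S = 300  # assumed cutoff; how "delayed" was defined is not stated


def classify(update: dict) -> str:
    """Map one hypothetical feed record to a twin status."""
    if update["at_stop"] and update["stop_id"] == update["last_stop_id"]:
        return "COMPLETED"
    if update["seconds_behind"] > DELAY_THRESHOLD_S:
        return "DELAYED"
    return "STOPPED" if update["at_stop"] else "IN_TRANSIT"


def detect_changes(previous: dict, updates: list) -> list:
    """Return (train_id, old_status, new_status) for trains whose status changed.

    `previous` maps train_id to the last known status and is updated in place.
    """
    changes = []
    for update in updates:
        train_id = update["train_id"]
        new_status = classify(update)
        old_status = previous.get(train_id)
        if new_status != old_status:
            changes.append((train_id, old_status, new_status))
            previous[train_id] = new_status
    return changes


# Two simulated polls of the feed (placeholder data).
state = {}
poll_1 = [{"train_id": "T1", "at_stop": False, "stop_id": "S01",
           "last_stop_id": "S09", "seconds_behind": 0}]
poll_2 = [{"train_id": "T1", "at_stop": True, "stop_id": "S02",
           "last_stop_id": "S09", "seconds_behind": 0}]

first = detect_changes(state, poll_1)    # train seen for the first time
second = detect_changes(state, poll_2)   # train has pulled into a station
repeat = detect_changes(state, poll_2)   # same snapshot again: nothing new
print(first, second, repeat)
```

Emitting only the changes, rather than the full snapshot on every poll, keeps the downstream stream small when most trains have not changed state between refreshes.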

This streaming data was processed in real time and output to three CAS tables – one each for trains, stations and tracks. These tables, accessed through SAS Studio, stored the most current state of the subway system and formed the foundation for analysis and visualization.
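The "most current state" pattern can be pictured with plain Python dictionaries standing in for the three tables. This is only a conceptual sketch: it does not use CAS or ESP, and the event layout (`table`, `id`, `fields`) is a hypothetical one chosen for the example.

```python
def make_tables() -> dict:
    """Create empty stand-ins for the trains, stations and tracks tables."""
    return {"trains": {}, "stations": {}, "tracks": {}}


def apply_event(tables: dict, event: dict) -> None:
    """Upsert one event: merge its fields into the row with the same key."""
    table = tables[event["table"]]
    current = table.get(event["id"], {})
    table[event["id"]] = {**current, **event["fields"]}


tables = make_tables()
events = [  # placeholder events
    {"table": "trains", "id": "T1", "fields": {"status": "IN_TRANSIT", "route": "A"}},
    {"table": "stations", "id": "S02", "fields": {"trains_at_platform": 0}},
    {"table": "trains", "id": "T1", "fields": {"status": "STOPPED", "stop_id": "S02"}},
]
for event in events:
    apply_event(tables, event)

print(tables["trains"]["T1"])
```

Because each row is keyed by asset ID, a later event overwrites the earlier status while preserving fields it does not mention, so each table always holds one row per asset with its latest known values.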

Visualizing movement in real time

To make the subway’s pulse visible, SAS® Visual Analytics was used to design an interactive dashboard. The display highlighted system-wide key performance indicators, such as:

  • Trains currently in transit
  • Average time between stations
  • Platform dwell times
  • Percentage of delayed trains

An interactive subway map provided a live snapshot of train locations and routes, while filters allowed users to focus on individual lines. Behind the scenes, SAS code added calculated fields – such as delay flags and movement indicators – so users could instantly understand what was happening without sifting through raw data streams.
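For a sense of how calculated fields feed the indicators listed above, here is a small Python sketch. The dashboard itself used SAS code and SAS Visual Analytics; this example only mirrors the idea. The row layout, the `is_delayed` and `is_moving` flag names and the sample values are all assumptions made for illustration.

```python
from statistics import mean


def add_flags(rows: list) -> list:
    """Add hypothetical delay and movement flags to each train row."""
    return [
        {**r,
         "is_delayed": r["status"] == "DELAYED",
         "is_moving": r["status"] == "IN_TRANSIT"}
        for r in rows
    ]


def kpis(rows: list) -> dict:
    """Compute a few system-wide indicators from the current train rows."""
    flagged = add_flags(rows)
    total = len(flagged)
    dwell = [r["dwell_s"] for r in flagged if r["dwell_s"] is not None]
    return {
        "in_transit": sum(r["is_moving"] for r in flagged),
        "pct_delayed": 100.0 * sum(r["is_delayed"] for r in flagged) / total if total else 0.0,
        "avg_dwell_s": mean(dwell) if dwell else None,
    }


# Placeholder snapshot of the trains table; dwell_s is None while a train is moving.
trains = [
    {"train_id": "T1", "status": "IN_TRANSIT", "dwell_s": None},
    {"train_id": "T2", "status": "IN_TRANSIT", "dwell_s": None},
    {"train_id": "T3", "status": "STOPPED", "dwell_s": 30},
    {"train_id": "T4", "status": "DELAYED", "dwell_s": 50},
]

summary = kpis(trains)
print(summary)
```

Precomputing boolean flags like these means the dashboard layer only has to count and average, which is one reason calculated fields can make raw streams easier to read at a glance.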

Lessons in real-time analytics

This project demonstrated how SAS tools can work together – from data ingestion to visualization – to manage a continuous flow of information. It offered hands-on experience with streaming data, digital twin modeling, event stream processing and the integration of Python with SAS for real-time responsiveness.

SAS Viya proved essential in tying it all together. Its components – ESP Studio, CAS and Visual Analytics – created a seamless environment where raw data could evolve into a working solution in minutes, not hours.

Looking ahead

This project focused on real-time monitoring, but it lays the groundwork for further exploration. Possible next steps include using historical data to train predictive models for train delays or expanding the digital twin to cover buses and commuter rail. The flexibility of SAS technologies makes these extensions not only possible but practical.

As a student, working with SAS on a project of this scale was both challenging and rewarding. It showed me how powerful the platform can be for solving real-world problems. SAS tools gave me the ability to model a complex public system, connect it to live data and create insights that update in real time.

See what else you can do with SAS for digital twins

About Author

Hanwen Zhang

Contributor

I'm Hanwen Zhang, a senior studying Computer and Data Science at NYU. I worked on the MTA Real-Time Digital Twin project at SAS as a Solutions Advisor Analytical Intern.
