Taxi GPS data shows hurricane's effect on NYC traffic

10/21/2014 5:26:00 AM

Yeh Center
Brian Donovan and Dan Work

CEE M.S. student Brian Donovan, left, and CEE Assistant Professor Dan Work stand in front of a visualization of GPS data from taxicabs showing how Hurricane Sandy affected traffic in New York City. View a 3-minute video of this visualization.


When Hurricane Sandy struck the East Coast in late October 2012, the “superstorm” disrupted traffic in New York City for more than five days, but the evacuation proceeded relatively efficiently with only minor delays, according to transportation researchers at the University of Illinois. The largest Atlantic hurricane on record, Sandy offered a chance for Illinois researchers to try out a new computational method they developed that promises to help municipalities quantify the resilience of their transportation systems to extreme events using only GPS data from taxis.

Dan Work, an assistant professor in the Department of Civil and Environmental Engineering (CEE), and Brian Donovan, an M.S. student in CEE’s Sustainable and Resilient Infrastructure Systems program, analyzed GPS data from nearly 700 million taxi trips, representing four years of taxi travel in New York City, to determine the city’s normal traffic pattern and study the variations during extreme events like the hurricane and snowstorms. The data, routinely recorded by taxi meters, shows travel times and the metered distance for various trips around the city at different times of the day and night. The researchers’ method works by computing the historical distribution of pace, or normalized travel times, between various regions of a city and measuring the pace deviations during an unusual event.
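The idea of comparing an event against a historical pace distribution can be illustrated with a short sketch. This is not the researchers' code. It assumes that pace means travel time divided by metered distance, that trips are grouped by origin region, destination region, and hour of the week, and that deviation is expressed as a simple z-score. The function names and the trip tuple layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def pace(travel_seconds, distance_miles):
    """Pace as seconds per mile (assumed definition of 'normalized travel time')."""
    return travel_seconds / distance_miles


def hour_of_week(ts):
    """Map a timestamp to one of 168 weekly slots (Monday 00:00 is slot 0)."""
    return ts.weekday() * 24 + ts.hour


def build_baseline(trips):
    """Summarize historical pace per (origin, destination, hour-of-week).

    trips: iterable of (pickup_time, origin, destination, seconds, miles).
    Returns a dict mapping each key to (mean pace, sample standard deviation).
    """
    samples = defaultdict(list)
    for ts, origin, dest, secs, miles in trips:
        if miles > 0:  # skip records with no metered distance
            samples[(origin, dest, hour_of_week(ts))].append(pace(secs, miles))
    # A standard deviation needs at least two samples per key.
    return {k: (mean(v), stdev(v)) for k, v in samples.items() if len(v) >= 2}


def pace_deviation(baseline, trip):
    """Number of standard deviations a trip's pace sits above the historical mean."""
    ts, origin, dest, secs, miles = trip
    mu, sigma = baseline[(origin, dest, hour_of_week(ts))]
    if sigma == 0:
        return 0.0  # no historical spread, so no meaningful deviation
    return (pace(secs, miles) - mu) / sigma
```

A large positive deviation means a trip between two regions was much slower than is typical for that hour of the week. Grouping by hour of the week is one simple way to keep ordinary rush-hour congestion from being flagged as unusual.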

“The first step was to figure out from the data what is normal,” Work said. “There is a heartbeat pattern to the city every single day. In the middle of the night when traffic is light, you can get from one side of the city to another very quickly, and every morning during rush hour the roads are congested. The data shows us the typical heartbeat, and then we look for the arrhythmia.”

A resilient transportation system is one that can weather an extreme event with only minimal damage or service disruption and bounce back to normal relatively quickly, Work said. When cities know how their traffic systems respond to extreme events, they can examine ways to improve them. For example, an unexpected effect of Hurricane Sandy was that the longest traffic delays occurred as people returned to the city to resume their normal activities immediately after the storm, Work said.

“That was the one surprising piece to us,” Work said. “A lot of literature on disasters has been very much focused on how to get people out of the city quickly and safely. It makes sense. But the re-entry process is also important, because you don’t want your first responders stuck in gridlock.”

There is still work to be done to translate this research into improved infrastructure resilience, Work said, but now there is a way to quantify the progress at a city scale.

“Importantly, this project shows us that the period immediately following the disaster should be the focus of additional research, with the ultimate goal of enhancing post-disaster transportation management and policy,” Work said.

The researchers obtained the taxi data through a Freedom of Information Law request to the New York City Taxi and Limousine Commission, which already collects it routinely. Because the data already exists, the method has an advantage over traditional traffic monitoring, which relies on sensors in the roadway or video cameras that can be expensive to deploy throughout a city, Work said.

“Although the taxi data isn’t primed for traffic monitoring purposes because it is so coarse, with the right processing, you can still see things about the city-scale performance that you would expect to observe from a dense network of traditional traffic sensors,” Work said.

“One thing that I think is kind of cool about this project,” Donovan said, “is that taxis are designed to just get people from point A to point B, but this is a second use for them. The taxis themselves act as sensors to tell you what’s going on in the city.”

With 700 million records, the size of the data set creates its own set of challenges. One of Donovan’s significant contributions to the project involved optimizing the efficiency of the calculations to speed up the analysis, Work said.

“One of the major challenges when you’re dealing with a large data set like this is that you don’t want the program to run for 24 hours. In a disaster, that’s too long to wait; you need an answer immediately. So you have to design the algorithms appropriately,” Donovan said.

Donovan, who earned his bachelor’s degree in computer science, said he was drawn to the M.S. program at Illinois because the Sustainable and Resilient Infrastructure Systems program offered the opportunity to work on multidisciplinary projects like this one. The combination of computer science and transportation systems knowledge is the key to the success of a project like this, Work said.

“Our background in transportation engineering helped us choose the dataset, clean it, and determine the performance metrics to study,” Work said. “At the same time, we needed the right computational tools to be able to process this much data and turn it into actionable information.”

A paper on this work, “Using coarse GPS data to quantify city-scale transportation system resilience to extreme events,” will be published in the Conference Proceedings of the Transportation Research Board in January 2015. A preprint is available online. The researchers have also made the data set available to the public, and the source code is available on GitHub.

Work is also a research assistant professor in the University of Illinois’ Coordinated Science Laboratory and a faculty fellow in the National Center for Supercomputing Applications (NCSA), also based at Illinois. Donovan is a graduate research assistant in both CEE and NCSA.

This material is based upon work supported by the National Science Foundation under Grant No. CNS-1308842. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.