Time Series IoT applications in Railroads

Guest blog post by Ajit Jaokar

Authors: Vinay Mehendiratta, PhD, Director of Research and Analytics at Eka Software

and Ajit Jaokar, Data Science for IoT course  


This blog post is part of a series of blogs exploring Time Series data and IoT.

The content and approach are part of the Data Science for Internet of Things practitioners course.  

Please contact [email protected] for more details.

Only for this month, we have a special part-payment pricing for the course (which begins in November).

We plan to develop these ideas further, including an IoT toolkit in the R programming language for IoT datasets. You can sign up for more posts from us HERE


Over the last fifteen years, railroads in the US, Europe and other countries have been using RFID devices on their locomotives and railcars. Typically, this information is stored in traditional (i.e. mostly relational) databases. Each RFID scan identifies the railcar number or locomotive number. The railcar number is then mapped to the existing railcar and train schedule. The timestamps on the scanned data also give us the sequence of cars on that train. Scans of the RFID tags on locomotives give us the number of locomotives and the total horsepower assigned to the train. They also tell us whether a locomotive is coupled at the front or at the rear of the train.
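As an illustration of how scan timestamps yield the car sequence, here is a minimal Python sketch. The `Scan` record, its field names, and the sample tag IDs are hypothetical and are not taken from any railroad's actual schema. The sketch simply sorts the reads at one scanner by time.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Scan:
    """One RFID read (hypothetical record layout)."""
    scanner_id: str
    tag_id: str
    scanned_at: datetime


def car_sequence(scans, scanner_id):
    """Return tag IDs in the order they passed the given scanner."""
    at_scanner = [s for s in scans if s.scanner_id == scanner_id]
    return [s.tag_id for s in sorted(at_scanner, key=lambda s: s.scanned_at)]


# Made-up sample reads, deliberately out of order.
scans = [
    Scan("S1", "CAR-102", datetime(2015, 10, 1, 8, 0, 12)),
    Scan("S1", "LOCO-7", datetime(2015, 10, 1, 8, 0, 5)),
    Scan("S1", "CAR-101", datetime(2015, 10, 1, 8, 0, 9)),
    Scan("S2", "LOCO-7", datetime(2015, 10, 1, 9, 30, 0)),
]

sequence = car_sequence(scans, "S1")
print(sequence)
```

In this sample the locomotive tag is read first, which is how a scan sequence can indicate that the locomotive is at the front of the train.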

The scanned data requires cleansing. Often, the reading from a railcar's RFID tag is missing at a certain scanner. In this case, the missing time of arrival is estimated from the readings at the scanners before and after the problematic one.
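One simple way to make such an estimate is linear interpolation by distance. The sketch below assumes the train runs at constant speed between the two neighbouring scanners and that each scanner's milepost is known. Both are assumptions made for illustration, and the function and parameter names are hypothetical.

```python
from datetime import datetime


def estimate_arrival(prev_time, next_time, prev_milepost, next_milepost,
                     missing_milepost):
    """Interpolate the arrival time at a scanner that missed its reading.

    Assumes constant speed between the previous and next scanner.
    """
    span = next_milepost - prev_milepost
    if span == 0:
        raise ValueError("neighbouring scanners must be at different mileposts")
    fraction = (missing_milepost - prev_milepost) / span
    return prev_time + (next_time - prev_time) * fraction


# Made-up example: read at milepost 10 at 08:00 and at milepost 50 at 09:00;
# the scanner at milepost 20 missed the car.
estimate = estimate_arrival(
    datetime(2015, 10, 1, 8, 0),
    datetime(2015, 10, 1, 9, 0),
    prev_milepost=10.0,
    next_milepost=50.0,
    missing_milepost=20.0,
)
print(estimate)
```

If the train is known to stop or slow down between the two scanners, a constant-speed estimate will be biased, so the interpolated values are best flagged as estimates in the stored data.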

Major railroads have also defined their territory using links, where a link is the directional connection between two nodes. Railroads have put RFID scanners at major links.

RFID scans give information on the railcar sequence in a train, the locomotive consist, and the track in real time. Railroads store this real-time and historical data for analysis.

Figure 1: Use cases of Rail Time Series Data



Figure 1 above shows use cases of time series data in the railroad industry. We believe that all of these use cases are applicable to freight railroads. They can also be applied to passenger railroads with some changes. They involve the use of analytics and RFID.

Uses of Real-Time Time Series Data

Here are some ways that time series data is, or can be, used by railroads in real time.

  1. Dispatching: Scanner data has been used for dispatching decisions for many years. It is used to display the latest location of trains. Dispatchers use this information, together with track type, train type and timetable information, to determine the priority that should be assigned to various trains.
  2. Information for Passengers: Passengers can use train arrival and departure estimates for planning their journey.
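The "latest location of trains" mentioned under Dispatching can be derived from a stream of scans by keeping only the most recent reading per train. The following is a minimal sketch, assuming each scan has already been mapped to a train ID. The tuple layout and the sample IDs are hypothetical.

```python
from datetime import datetime


def latest_positions(scans):
    """Map each train ID to its most recent (scanner_id, scanned_at) reading.

    `scans` is an iterable of (train_id, scanner_id, scanned_at) tuples
    in any order.
    """
    latest = {}
    for train_id, scanner_id, scanned_at in scans:
        current = latest.get(train_id)
        if current is None or scanned_at > current[1]:
            latest[train_id] = (scanner_id, scanned_at)
    return latest


# Made-up sample stream.
stream = [
    ("Q101", "S1", datetime(2015, 10, 1, 8, 0)),
    ("M202", "S4", datetime(2015, 10, 1, 8, 5)),
    ("Q101", "S2", datetime(2015, 10, 1, 8, 40)),
]

positions = latest_positions(stream)
for train_id, (scanner_id, scanned_at) in sorted(positions.items()):
    print(train_id, scanner_id, scanned_at.isoformat())
```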


Uses of Historical Time Series Data

Here are some ways that historical time series data is, or can be, used by railroads.

  • Schedule Adherence (identify trains that are consistently delayed): We can identify trains that are on schedule, delayed or early. We can also identify trains that consistently occupy tracks for longer than the schedule permits. These trains should be considered for a schedule change, and they are candidates for root cause analysis.

  • Better Planning: We would be able to determine whether planned ‘sectional running times’ are accurate or need to be checked. Sectional running times are generally determined based on experience; they are estimates at the network level and don’t consider local infrastructure (signal, track type). Sectional running times are used in developing train schedules and maintenance schedules at the network and local level.

  • Infrastructure Improvement - Track Utilization: We can identify the sections of track where trains have the highest occupancy. This would let us identify tracks that are being operated near or above track capacity. The assumption here is that utilization above track capacity results in delays. We can identify the set of trains, tracks, times of day and days of the week when occupancy is high and low. This would give us insight into train movement and perhaps suggest changes to the train schedule. We might be able to determine whether trains are held up at stations and yards or on the mainline. An in-depth and careful analysis can help us determine whether attention needs to be paid to yard operations or to mainline operations.

  • Simulation Studies: RFID scan data gives us the actual time of arrival and departure for every car (and hence every train). Modelers create hypothetical trains to feed into simulation studies. This information (actual train arrival and departure time at every scanner, train consist, locomotive consist) is used in infrastructure expansion projects.

  • Maintenance Planning: Historical occupancy of tracks would enable us to identify time windows when maintenance should be scheduled in the future. Railroads use inspection cars to inspect and record track condition regularly. Some railroads face the challenge of getting the right geo coordinates for a segment of track. Careful analysis of this geo and time series data can help measure track health and deterioration. Satellite imagery data is also becoming available more frequently. A combination of these two sources can help to inspect tracks, schedule maintenance, predict track failures, and move maintenance gangs.

  • Statistical Analysis of Railroad Behavior
  1. We can map train behavior to the train definition (train type, schedule, train speed, train length) and the track definition (signal type, track class, grade, curve, authority type) and identify patterns.
  2. Passenger trains affect the operations of freight trains. Scanner data can be used to determine the delay imposed on freight trains.
  3. Time series information on railcars can be used to identify misrouted or lost cars.
  4. Locomotive consist information and performance derived from time series data can be used together to determine the historically best locomotive consist (for example make and horsepower) for every track segment.
  5. Locomotives are a costly asset for any railroad. Time series data can be used to determine locomotive utilization.

  • Demand Forecasting: Demand for empty railroad cars is known as an indicator of a country’s economy. While demand for railroad cars varies with car type and macro-economic factors, it is worth the effort to gain insight from the historical data. The number of cars by car type can be estimated and forecast for every major origin-destination pair. The number of train starts and train ends at every origin and destination can be used to forecast the number of trains for a future month. The forecast number of trains would help a railroad determine the number of crews and locomotives it needs. It would also help the railroad determine the load that the tracks would carry. The forecast number of trains can also be used in infrastructure studies.


  • Safety: Safety is the most important feature of railroad culture. Track maintenance and track wear and tear (track utilization) are both related to safety. Time series data on railcars, together with signal type, track type, train type, accident type and train schedule, can be analyzed to identify potential relationships (if any) between the relevant factors.


  • Train Performance Calculations: What is the unopposed running speed on a track with a given grade, curve, locomotive consist, car type, wind direction and speed? The resistance contributions of such factors were determined by Davis [1] in 1926. Could time series data help us calibrate the coefficients of Davis’s equation for railcars of newer design?

  • Planning and Optimization: All findings above can be used to develop smarter optimization models for train schedule, maintenance planning, locomotive planning, crew scheduling, and railcar assignment.
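
As one example from the list above, the Schedule Adherence use case could be sketched in Python as follows. The sketch assumes that a delay in minutes (actual minus scheduled arrival) has already been computed for each run of each train. The threshold of 10 minutes and the 80% share are arbitrary illustrative values rather than industry standards, and all names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def consistently_delayed(records, threshold_min=10.0, min_share=0.8):
    """Flag trains that are late on most of their runs.

    `records` is an iterable of (train_id, delay_minutes) pairs, one per run.
    A train is flagged when at least `min_share` of its runs are delayed by
    more than `threshold_min` minutes. Returns {train_id: mean delay}.
    """
    delays = defaultdict(list)
    for train_id, delay in records:
        delays[train_id].append(delay)

    flagged = {}
    for train_id, values in delays.items():
        late_share = sum(1 for d in values if d > threshold_min) / len(values)
        if late_share >= min_share:
            flagged[train_id] = mean(values)
    return flagged


# Made-up delays in minutes; negative values mean an early arrival.
records = [
    ("Q101", 15), ("Q101", 21), ("Q101", 12),
    ("M202", 3), ("M202", 25), ("M202", -4),
]

flagged = consistently_delayed(records)
print(flagged)
```

Using the share of late runs rather than only the mean delay keeps a single very late run from flagging a train that is otherwise on time, as with the hypothetical train M202 in the sample data.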



In this article, we have highlighted some use cases of time series data for railroads. There are many more factors that could be considered, especially in the choice of technology for implementing these time series algorithms. In subsequent sections, we will show how some of these use cases could be implemented in the R programming language.

To know more about the Data Science for Internet of Things practitioners course, please contact [email protected]. You can sign up for more posts from us HERE


  1. Davis, W. J., Jr.: The tractive resistance of electric locomotives and cars, General Electric Review, vol. 29, October 1926.
