Deliverable 1.3 Big Data Methodologies, Tools and Infrastructures

Big Data opens up new opportunities to define “Intelligent” mobility and transportation solutions. The transportation industry is a leader in creating the so-called Internet of Everything. Each day, vast volumes of data are generated by sensors in passenger counting systems, vehicle locator systems, and ticketing and fare collection systems, to name just a few.

The goal is to create value from this data by providing a comprehensive picture of what is happening. Business analytics, Big Data tools and predictive analytics can help transportation agencies improve operations, reduce costs and, ideally, better serve travelers.

The technical challenge is that much of this Big Data is non-standard data (e.g., social, geospatial or sensor-generated data) that is not an easy fit for traditional, structured, relational data warehouses or databases.

An additional challenge is that, with so much real-time structured and unstructured data captured from a variety of sources, it is difficult to determine which data is most valuable. Terabytes of data are collected, adding complexity to the underlying IT infrastructure.

These terabytes of data require immense amounts of storage, spread across silo after silo in transportation operators' data centers. To analyze Big Data, an appropriate data infrastructure needs to be in place to:

  • store and maintain data
  • analyze data
  • present results in a clear visual way

Several Big Data platforms have been proposed recently, both open source and proprietary. To meet the demands and challenges of the transportation domain, an optimal stack of Big Data technologies needs to be selected and designed based on the application requirements.

This is not an easy task.

This report, which is a follow-up to Deliverable 1.1, offers an in-depth introduction to relevant technologies for Big Data Analytics and Big Data Management. It also looks at how these technologies are applied to build a Big Data Platform suitable for the transport sector. We present in detail how application-specific benchmarking can be used to evaluate which Big Data technologies are best suited to the domain. We conclude the report with an applied example of using data analytics for urban mobility.

This document offers the reader technical insight into existing Big Data technologies at various levels: software management, data platform, and application. To evaluate which specific software components in the Big Data stack are most suitable for transport applications with high-volume and high-velocity requirements, a benchmarking approach is presented.
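
As a rough illustration of what application-specific benchmarking means in practice, the sketch below times two candidate ingestion routines on the same synthetic vehicle-location workload and reports records processed per second. It is a minimal sketch, not the benchmarking method of this report: the record layout, the function names (make_workload, ingest_list, ingest_dict, benchmark) and the in-memory "stores" are all hypothetical stand-ins for real Big Data components.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical record layout: (vehicle_id, latitude, longitude).
Record = Tuple[str, float, float]


def make_workload(n: int) -> List[Record]:
    """Build a synthetic, deterministic workload of n vehicle-location records."""
    return [
        (f"bus-{i % 50}", 52.0 + (i % 100) * 0.001, 13.0 + (i % 100) * 0.001)
        for i in range(n)
    ]


def ingest_list(records: List[Record]) -> int:
    """Candidate A (assumed): append every record to a flat list."""
    store: List[Record] = []
    for record in records:
        store.append(record)
    return len(store)


def ingest_dict(records: List[Record]) -> int:
    """Candidate B (assumed): group positions by vehicle id."""
    store: Dict[str, List[Tuple[float, float]]] = {}
    for vehicle_id, lat, lon in records:
        store.setdefault(vehicle_id, []).append((lat, lon))
    return sum(len(positions) for positions in store.values())


def benchmark(
    candidates: Dict[str, Callable[[List[Record]], int]],
    workload: List[Record],
    repeats: int = 3,
) -> Dict[str, float]:
    """Return the best observed throughput (records/second) per candidate."""
    results: Dict[str, float] = {}
    for name, ingest in candidates.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            processed = ingest(workload)
            elapsed = time.perf_counter() - start
            # Every candidate must process the full workload to be comparable.
            if processed != len(workload):
                raise ValueError(f"{name} processed {processed} records")
            best = min(best, elapsed)
        results[name] = len(workload) / best if best > 0 else float("inf")
    return results


if __name__ == "__main__":
    workload = make_workload(100_000)
    scores = benchmark({"list": ingest_list, "dict": ingest_dict}, workload)
    for name, throughput in sorted(scores.items()):
        print(f"{name}: {throughput:,.0f} records/s")
```

The point of the sketch is the shape of the comparison: the same domain-specific workload is run against every candidate, and the metric (here throughput) is chosen to reflect the application's volume and velocity requirements.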

Data analytics in transportation offers many future applications and opportunities.

The main challenge is to use significantly improved technologies and methods to gather and understand the data, so that business decisions are informed by better insights.