Introducing KPublicTransport
One of the larger missing pieces for KDE Itinerary is access to dynamic data for your current mode of transport. That means delays or other service interruptions, gate or platform changes, or finding a specific connection for, e.g., an unbound train booking. At least for ground-based public transport we have been making some progress on this in the last few months, resulting in a new library, KPublicTransport.
Data Model
The information we are talking about here is essentially how to get from A to B, and whether a given journey from A to B is on schedule. Stop and route schedule information is often found in the same context, but since we don’t need that for KDE Itinerary it isn’t considered here.
A bit less abstract and following how Navitia names things, this gives us the following objects to work with:
- Location: A place identified by a name, geo coordinates and/or a data source specific identifier (such as UIC station codes). While generic places are usually supported, we are mostly interested in a specific sub-type, stop areas or stop points. That is, places where a public transport line stops, such as train stations or bus stops.
- Lines and Routes: A line is a set of stops connected by a public transport service (e.g. “Circle line”), a route is a specific service on that line, typically identified by the direction.
- Departures and Arrivals: That is, information about when and where a specific service arrives or leaves from a stop.
- Journeys: A journey consists of multiple sections that, combined, describe how to get from location A to location B using public transport services, possibly also containing walks to and from a stop, or transfer instructions.
The properties these objects can have vary depending on the backend providing the information, ranging from basic things like human-readable names, times or geo coordinates to things like the CO2 impact of a journey or facilities found at a station or inside a train.
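As a rough illustration of how these objects relate to each other, here is a minimal Python sketch. It is not the KPublicTransport API; all class and field names are made up for this example, and anything a backend might not provide is modelled as an optional field.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Location:
    """A place; `identifiers` maps a scheme name (e.g. "uic") to a code."""
    name: str
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    identifiers: Dict[str, str] = field(default_factory=dict)


@dataclass
class Line:
    """A set of stops connected by a service, e.g. "Circle line"."""
    name: str
    mode: Optional[str] = None  # e.g. "bus" or "train", if the backend says so


@dataclass
class Route:
    """A specific service on a line, typically identified by its direction."""
    line: Line
    direction: Optional[str] = None


@dataclass
class Departure:
    """When and where a service leaves from a stop."""
    stop: Location
    route: Route
    scheduled_time: datetime
    expected_time: Optional[datetime] = None  # only set with realtime data
    platform: Optional[str] = None

    def delay(self) -> Optional[timedelta]:
        """Return the delay, or None if no realtime data is available."""
        if self.expected_time is None:
            return None
        return self.expected_time - self.scheduled_time


@dataclass
class JourneySection:
    """One leg of a journey; `route` stays None for walks and transfers."""
    origin: Location
    destination: Location
    departure_time: datetime
    arrival_time: datetime
    route: Optional[Route] = None


@dataclass
class Journey:
    """An ordered list of sections leading from A to B."""
    sections: List[JourneySection]

    def duration(self) -> timedelta:
        return self.sections[-1].arrival_time - self.sections[0].departure_time
```

An arrival would look just like `Departure` with the roles reversed, so it is left out here to keep the sketch short.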
To obtain this information, we need the following query operations:
- Given two locations and a timeframe, find a set of possible journeys.
- Given a stop and a timeframe, query departures or arrivals.
- Disambiguate or auto-complete locations. That’s essentially querying locations by name. This is needed if all we have as a destination is e.g. “Berlin”, which could refer to dozens of train stations.
Depending on the backend, those queries may support additional parameters, such as preferred modes of transportation or your walking speed.
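The three query types could be expressed as simple request objects, roughly like in the following Python sketch. The names are hypothetical and do not mirror any real backend API. The toy `StaticBackend` only implements the location query, using a naive substring match on a hard-coded list of stop names, to show the “Berlin” disambiguation case.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class JourneyRequest:
    """Find journeys between two locations within a timeframe."""
    origin: str
    destination: str
    earliest: datetime
    latest: datetime
    # Optional extra parameter; not every backend will honour it.
    modes: Optional[Tuple[str, ...]] = None


@dataclass(frozen=True)
class DepartureRequest:
    """Query departures (or arrivals) at a stop within a timeframe."""
    stop: str
    earliest: datetime
    latest: datetime
    arrivals: bool = False


@dataclass(frozen=True)
class LocationRequest:
    """Disambiguate or auto-complete a location by name."""
    name: str


class StaticBackend:
    """Toy backend answering location queries from a fixed list of stops."""

    def __init__(self, stops: List[str]):
        self._stops = stops

    def query_locations(self, request: LocationRequest) -> List[str]:
        # Case-insensitive substring match; real services rank results.
        needle = request.name.casefold()
        return [stop for stop in self._stops if needle in stop.casefold()]


backend = StaticBackend(["Berlin Hbf", "Berlin Ostbahnhof", "Hamburg Hbf"])
matches = backend.query_locations(LocationRequest("berlin"))
print(matches)
```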
This model is largely based on the most complex case, local public transport. Long-distance railway services tend to be closer to flights (using per-day unique service identifiers), but can be represented by this as well. Flights are out of scope here.
To simplify the implementation, this also closely follows what Navitia and other vendor-specific APIs provide, rather than the needs of KDE Itinerary. For example, we’d be much more interested in querying delays and disruptions for a given journey section, rather than for a given stop. This can however be achieved using the above building blocks.
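One way to get from a per-stop query to a per-section answer is to query the departures at the section’s first stop and pick the one matching the booked service. The Python sketch below assumes that line name, direction and scheduled time are enough to identify a service, which may not hold for every backend. The types and the function name are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass
class Departure:
    line: str
    direction: str
    scheduled_time: datetime
    expected_time: Optional[datetime] = None  # None without realtime data


def delay_for_section(
    line: str,
    direction: str,
    scheduled_time: datetime,
    departures: Iterable[Departure],
) -> Optional[timedelta]:
    """Find the section's service among a stop's departures.

    Returns its delay, or None if the service is not in the list or the
    backend has no realtime data for it.
    """
    for dep in departures:
        key = (dep.line, dep.direction, dep.scheduled_time)
        if key == (line, direction, scheduled_time):
            if dep.expected_time is None:
                return None
            return dep.expected_time - dep.scheduled_time
    return None


# Made-up result of a departure query for the section's first stop.
board = [
    Departure("Circle line", "clockwise",
              datetime(2019, 1, 1, 12, 0), datetime(2019, 1, 1, 12, 7)),
    Departure("Circle line", "anticlockwise", datetime(2019, 1, 1, 12, 0)),
]
delay = delay_for_section("Circle line", "clockwise",
                          datetime(2019, 1, 1, 12, 0), board)
```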
Online vs Offline
All this isn’t exactly new though; the KDE4-era public transport plasmoid had this already. However, in its later stages that focused on GTFS as a data source. GTFS is a Google-defined standard format for providing the entire base schedule of a public transport network. This has the advantage of being fully offline capable once you have transferred the data set. While this somewhat works on desktop, the disk space and bandwidth cost might already be too high for mobile devices. Things get even worse when looking at GTFS Realtime, which is designed for providing live data on an entire public transport network, in the extreme case containing high-frequency position updates for all vehicles in the network.
GTFS was designed to feed this information into applications like Google Maps, not for direct consumption by every single user. From a user point of view we would like to only deal with the data for the journey we are interested in to conserve disk space and bandwidth usage, even if that means giving up the possibility for offline support. Offline support however cannot work with realtime data anyway, and that’s one of the most valuable features here.
So, what we would need is a network service that aggregates base schedule and realtime information provided by public transport networks via GTFS or any other format, and that allows us to query for the data we are interested in. And that’s exactly what Navitia does.
Online Backends
Unfortunately reality is slightly more complicated, as it’s not enough to just support Navitia as a backend. While Navitia has extensive coverage, some important providers are missing and are not providing data that could be fed into Navitia. SNCF is one such case that is relatively easy to support, as they are running their own Navitia instance. Others, such as Deutsche Bahn, have their own APIs.
So we need a system that can use several Navitia instances as well as vendor-specific APIs as backends. While the concepts map fairly well to the services I have seen so far, aggregating results from different backends brings in new challenges such as aligning different naming and identification of lines or stops.
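To give an idea of the alignment problem, here is a deliberately naive Python sketch that merges stop names returned by several backends by comparing a normalized form of the name. The normalization rules (strip diacritics, ignore case, hyphens and extra whitespace) are an assumption for this example, not what any particular implementation does. A real solution would likely also compare identifiers and geo coordinates.

```python
import unicodedata
from typing import Dict, List


def normalize_stop_name(name: str) -> str:
    """Reduce a stop name to a form that is comparable across backends."""
    # Split accented characters into base character plus combining mark,
    # then drop the combining marks ("ö" becomes "o").
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Ignore case, treat hyphens as spaces, collapse repeated whitespace.
    return " ".join(stripped.casefold().replace("-", " ").split())


def merge_stops(results: List[List[str]]) -> List[str]:
    """Merge per-backend stop lists, keeping the first spelling seen."""
    merged: Dict[str, str] = {}
    for backend_result in results:
        for name in backend_result:
            merged.setdefault(normalize_stop_name(name), name)
    return list(merged.values())


# Made-up results from two backends describing the same two stations.
stops = merge_stops([
    ["Köln Hbf", "Paris Gare de Lyon"],
    ["KOLN  HBF", "Paris-Gare-de-Lyon"],
])
```

This would still treat “Köln” and “Koeln” as different stops, which is exactly the kind of mismatch that makes aggregation harder than it looks.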
Implementation
The code used to be part of KDE Itinerary until this week, but has now been moved to its own repository and library, KPublicTransport. Having this as a standalone framework makes sense as there are other use-cases for it beyond the irregular travel scenario covered by KDE Itinerary, such as helping with your daily commute.
Next to the implementation, this also contains three example applications to trigger the three types of queries mentioned above, two of which are shown below. There’s also some initial API documentation.


What’s next?
While KPublicTransport is at a point where prototypes can be built with it, there are of course still many things to do. Besides completing and extending the implementation, provider coverage outside of Europe is something to look into too, as the current backends and test cases are all largely Europe-centric. If you come across GTFS datasets not integrated into Navitia’s dataset yet, pointing the Navitia team to those is an easy way to help. Integration into KDE Itinerary is another big subject; I’ll write a separate post about that.