It’s time again for another update on what has recently happened around KDE Itinerary. Together with the last two-month summary, this also covers the changes to the extraction engine and the KMail integration that will be part of the 19.04 application releases.

New Features

The biggest visible additions have again been in the area of realtime public transport data. In particular it’s now possible to pick alternative connections for unbound train or bus bookings.

Alternative bus connection options in KDE Itinerary.

Thanks to Aleix’s work, KDE Itinerary is now also available as a Flatpak for x86 and ARM. That provides a convenient way to test KDE Itinerary, even if it’s currently still missing the dependencies for barcode decoding.

Nico meanwhile further improved the barcode scanning mode: double-tapping the boarding pass barcode now not only increases the screen brightness to maximum but also inhibits the lock screen on Android, so you are not that person who stalls the boarding queue while fighting with your phone.

Infrastructure Work

For two larger background work items I have already written separate posts:

  • KPublicTransport, a new framework to query online services for public transport information. A large part of the work in the past two months centered around aggregating and merging results from different sources there.
  • Android file opening support in Qt, so we don’t need special cases for that.

There are a few new things in the data extraction system in the KItinerary library too, which will also improve what you see in the KMail plugin with 19.04, for example.

Custom extractors can now also trigger on properties of iCal attachments. This allows us to pick a suitable extractor based on the backend software being used by an airline or travel agency, making the extraction more generic and easier to maintain.
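To illustrate the idea of triggering on iCal properties, here is a minimal Python sketch. It is not KItinerary's actual extractor metadata format or API: the extractor names, the trigger patterns and the simplified property parser are all hypothetical, and only show how a match on something like the PRODID property could select an extractor by backend software rather than by sender.

```python
import re

# Hypothetical extractor registry: each entry triggers on one iCal property.
# Names and patterns are made up for illustration, they are not KItinerary's
# real extractor metadata.
EXTRACTORS = [
    {"name": "backend-a", "property": "PRODID", "match": r"ExampleBookingSuite"},
    {"name": "backend-b", "property": "ORGANIZER", "match": r"@travel\.example$"},
]


def parse_ical_properties(text):
    """Collect 'NAME[;params]:value' lines into a dict (first value wins).

    Deliberately simplified: folded continuation lines are skipped and
    nested components are not distinguished.
    """
    props = {}
    for line in text.splitlines():
        if ":" not in line or line.startswith((" ", "\t")):
            continue
        name, value = line.split(":", 1)
        name = name.split(";", 1)[0].upper()
        props.setdefault(name, value.strip())
    return props


def select_extractors(ical_text):
    """Return the names of all extractors whose trigger matches the attachment."""
    props = parse_ical_properties(ical_text)
    return [
        e["name"]
        for e in EXTRACTORS
        if re.search(e["match"], props.get(e["property"], ""))
    ]


SAMPLE = """BEGIN:VCALENDAR
PRODID:-//Example Corp//ExampleBookingSuite 4.2//EN
BEGIN:VEVENT
SUMMARY:Flight XY 123
END:VEVENT
END:VCALENDAR"""

print(select_extractors(SAMPLE))
```

Because the trigger looks at what generated the attachment, the same extractor would fire for any airline or agency whose emails come out of that backend.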

We now have a system for more cleverly merging data from different sources. So far, later information always replaced previous information; now we take the “better” information in cases where we can decide what is “better”. For example, for two QDateTime values referring to the same time, the one having timezone information available is “better”. The same applies to two ticket elements of which one has a download URL to a barcode while the other one has the barcode content available inline: the inline content wins.
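The two rules above can be sketched roughly as follows. This is a Python illustration rather than the actual KItinerary C++ code: the dictionary layout and the field names `departure`, `barcode_content` and `barcode_url` are assumptions made for the example, and "the same time" is approximated by comparing the wall-clock values.

```python
from datetime import datetime, timedelta, timezone


def better_datetime(old, new):
    """Prefer the timezone-aware value if both show the same wall-clock time."""
    if old is None:
        return new
    if new is None:
        return old
    same_wall_clock = old.replace(tzinfo=None) == new.replace(tzinfo=None)
    if same_wall_clock and old.tzinfo is not None and new.tzinfo is None:
        return old
    return new  # otherwise fall back to "later information wins"


def better_ticket(old, new):
    """Prefer inline barcode content over a mere download URL."""
    if old.get("barcode_content") and not new.get("barcode_content"):
        return old
    return new


def merge(old, new):
    """Merge two reservation records field by field (hypothetical layout)."""
    return {
        "departure": better_datetime(old.get("departure"), new.get("departure")),
        "ticket": better_ticket(old.get("ticket", {}), new.get("ticket", {})),
    }


first = {
    "departure": datetime(2019, 3, 1, 10, 30, tzinfo=timezone(timedelta(hours=1))),
    "ticket": {"barcode_content": "M1DOE/JOHN"},
}
second = {
    "departure": datetime(2019, 3, 1, 10, 30),  # no timezone information
    "ticket": {"barcode_url": "https://example.org/barcode.png"},
}

merged = merge(first, second)
print(merged["departure"].tzinfo is not None, merged["ticket"])
```

With plain "last one wins" merging, the second record would have discarded both the timezone and the inline barcode.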

Airport detection, that is, identifying which airport a given string refers to, also got a larger rework. Airport names are not always unique, and the string we are given as a suspected airport name does not always contain exactly that, so this is inherently fuzzy. Here are two examples of the new additional disambiguation steps to improve the detection:

  • “Stuttgart Airport” in Germany and “Stuttgart Municipal Airport” in the US. Cases like this are very hard to distinguish by their name, so we now consider the flight length to pick the most plausible candidate.
  • “Osaka International” and “Kansai International” both serve the city of Osaka, and are actually easy to distinguish when addressed by their official names. Airport name strings we find in reservation emails often also contain the city the airport serves though, so we get things like “Osaka Kansai International”. We currently don’t have a way to properly disambiguate such cases, but we are now taking non-disputed information from multiple candidates into account. In this example both airports are less than 50 km apart, and while we might not know exactly where you are traveling, we do know the country and timezone.
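Both disambiguation steps can be sketched in a few lines of Python. This is a simplified illustration and not the actual KItinerary implementation: the speed threshold, the data layout and the approximate coordinates are assumptions made for the example.

```python
from math import asin, cos, radians, sin, sqrt


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) pairs (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def plausible_by_flight_length(candidates, other_end, duration_hours, max_speed_kmh=1000.0):
    """Drop candidates that would require an implausible average speed.

    The 1000 km/h limit is an assumed threshold for this sketch.
    """
    return [
        c for c in candidates
        if distance_km(c["pos"], other_end) / duration_hours <= max_speed_kmh
    ]


def undisputed(candidates, keys=("country", "timezone")):
    """Return only the properties on which all candidates agree."""
    result = {}
    for key in keys:
        values = {c[key] for c in candidates}
        if len(values) == 1:
            result[key] = values.pop()
    return result


# Approximate coordinates, for illustration only.
stuttgart = [
    {"name": "STR", "pos": (48.69, 9.22), "country": "DE", "timezone": "Europe/Berlin"},
    {"name": "SGT", "pos": (34.60, -91.57), "country": "US", "timezone": "America/Chicago"},
]
berlin = (52.56, 13.29)

osaka = [
    {"name": "ITM", "pos": (34.785, 135.438), "country": "JP", "timezone": "Asia/Tokyo"},
    {"name": "KIX", "pos": (34.434, 135.233), "country": "JP", "timezone": "Asia/Tokyo"},
]

# A 75 minute flight to Berlin rules out the airport in the US.
print([c["name"] for c in plausible_by_flight_length(stuttgart, berlin, 1.25)])
# We cannot tell the Osaka airports apart, but country and timezone are certain.
print(undisputed(osaka))
```

Even when the exact airport stays unknown, the undisputed timezone is enough to, for example, interpret local departure and arrival times correctly.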

We have also added a command line tool to perform the extraction, which is mainly intended as an easy way to prototype integration with additional mail clients, for example.

Fixes & Improvements

There are lots of smaller changes that are noteworthy too, of course:

  • Timeline elements with a full day scope (hotels, weather forecast, etc) are now ordered correctly in case of timezone changes.
  • One-dimensional barcodes that are too wide for a portrait mode screen are now rotated.
  • A number of custom extractors were added or extended.
  • We no longer keep outdated delay or change/disruption information around when a reservation changes.
  • The display of rental car reservations in the KMail plug-in is now slightly more useful.

Contribute

As usual, a big thanks to everyone who donated test data; this continues to be essential for improving the data extraction. More samples don’t only allow us to add extractors for more providers, they also allow us to spot similarities between different providers resulting from the use of the same backend software, and thus enable us to create extractors targeting the backend software directly.

The extractor we created for BCD Travel, for example, turned out to process a previously unsupported variant of Lufthansa data as well. Similarly, the Amadeus extractor was able to handle certain KLM emails. Both merely needed minor adjustments to their trigger criteria, and should now also be able to detect data from other providers using these systems.

And to illustrate how even the most unlikely documents can end up being useful: We found the API endpoint for a local transport network now supported by KPublicTransport in the header line of a seven year old PDF website printout.

If you want to help in ways other than donating test samples, see our Phabricator workboard for what’s on the todo list, for coordinating work and for collecting ideas. For questions and suggestions, please feel free to join us on the KDE PIM mailing list or in the #kontact channel on Matrix or Freenode.