OSM Hack Weekend Karlsruhe 2023

Last weekend I attended an OSM hack weekend, hosted by Geofabrik in Karlsruhe, to progress some OSM-related topics around KDE Itinerary and to better connect KDE and OSM.

Platform section highlighting

The most visible result is that our train station map can now also highlight relevant platform sections, provided platform sections are mapped in OSM. In Germany that is still rare, with Karlsruhe being one of the few exceptions and thus providing the perfect opportunity for real-life testing.

Figure: Train station map in KDE Itinerary highlighting relevant platform sections (Karlsruhe Hbf, platform 3, sections E and F).

This currently only uses schedule data. For maximum usefulness we still have to make use of the seat reservation and coach layout data when those are available in Itinerary.

Raw data tiles

Another noticeable result is the performance optimization of the map geometry reassembly. That’s the process of putting the content of the raw data tiles back together for display, which is now 12x faster.

That’s basically down to multiple occurrences of the following two issues:

std::vector is very picky when it comes to choosing to move rather than copy its content, e.g. when growing. Content being movable isn’t enough; the move operations also need to promise, using noexcept, that they don’t throw exceptions. In particular, this is not necessarily done by implicitly generated move operations, so even just explicitly declared but defaulted move operators/constructors can make a big difference.
Don’t do insertions (or removals) on a sorted vector in a loop. Collect all those changes and apply them in bulk instead (using the erase-remove idiom, or by appending new elements and sorting once at the end). The algorithmic complexity is O(n²) for the former and O(n log n) for the latter. For the amount of data we are working with here, that is a relevant difference.
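
To make both points a bit more tangible, here is a minimal sketch. It is not the actual KDE code: the element types ThrowingMove and NoexceptMove, the copy counter and all helper function names are made up for illustration. The first half counts how often std::vector copies its elements while growing, depending on whether the move constructor is noexcept. The second half contrasts per-element insertion into a sorted vector with the bulk variants described above.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

// Counts copy constructions so the effect of noexcept becomes observable.
int g_copies = 0;

// Hypothetical element type whose move constructor does not promise noexcept.
struct ThrowingMove {
    int id;
    explicit ThrowingMove(int i) : id(i) {}
    ThrowingMove(const ThrowingMove &other) : id(other.id) { ++g_copies; }
    ThrowingMove(ThrowingMove &&other) : id(other.id) {}
};

// Same type, but the move constructor is declared noexcept.
struct NoexceptMove {
    int id;
    explicit NoexceptMove(int i) : id(i) {}
    NoexceptMove(const NoexceptMove &other) : id(other.id) { ++g_copies; }
    NoexceptMove(NoexceptMove &&other) noexcept : id(other.id) {}
};

static_assert(!std::is_nothrow_move_constructible<ThrowingMove>::value, "may throw on move");
static_assert(std::is_nothrow_move_constructible<NoexceptMove>::value, "never throws on move");

// Returns how many element copies happened while a vector of T grew to n elements.
template <typename T>
int copiesWhileGrowing(int n) {
    g_copies = 0;
    std::vector<T> v;
    for (int i = 0; i < n; ++i) {
        v.emplace_back(i); // reallocates from time to time, relocating old elements
    }
    return g_copies;
}

// Quadratic overall: every single insert shifts the tail of the vector.
void insertOneByOne(std::vector<int> &sorted, const std::vector<int> &added) {
    for (int value : added) {
        sorted.insert(std::upper_bound(sorted.begin(), sorted.end(), value), value);
    }
}

// O(n log n) overall: append everything first, then sort once at the end.
void insertInBulk(std::vector<int> &sorted, const std::vector<int> &added) {
    sorted.insert(sorted.end(), added.begin(), added.end());
    std::sort(sorted.begin(), sorted.end());
}

// Erase-remove idiom: one pass over the vector, one erase at the end.
// removedSorted has to be sorted, as it is searched with std::binary_search.
void removeInBulk(std::vector<int> &sorted, const std::vector<int> &removedSorted) {
    auto isRemoved = [&removedSorted](int value) {
        return std::binary_search(removedSorted.begin(), removedSorted.end(), value);
    };
    sorted.erase(std::remove_if(sorted.begin(), sorted.end(), isRemoved), sorted.end());
}

// Small self-check tying the pieces together.
void checkExamples() {
    assert(copiesWhileGrowing<ThrowingMove>(1000) > 0);  // falls back to copying
    assert(copiesWhileGrowing<NoexceptMove>(1000) == 0); // moves instead

    std::vector<int> a{1, 4, 9};
    std::vector<int> b{1, 4, 9};
    insertOneByOne(a, {7, 2, 4});
    insertInBulk(b, {7, 2, 4});
    assert(a == b); // same result, very different cost for large inputs
}
```

Calling checkExamples() from a main() function runs both comparisons. With only a handful of elements the difference is of course not measurable; it only starts to matter at the data sizes the reassembly code deals with.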

Geometry reassembly became easy to measure and optimize in isolation as we now have a standalone command-line raw data tile reassembly tool. That’s part of the longer-term goal to evolve KDE’s historically somewhat Marble-specific raw data tile infrastructure into something more in line with OSM standards and thus more useful for a wider range of applications.

And there is interest in that, but especially the exact selection/clipping and reassembly procedures need work (and documentation).

Data questions

I also managed to get input on two data/data modelling questions we ran into with the train station map:

How does the meaning of the level tag propagate to geometry in a multi-polygon relation, e.g. if inner polygons are themselves used again to represent different elements? It turns out it is not supposed to do that automatically; inner geometry needs to be tagged separately. Fortunately, that required just a few fixes in the data rather than adding complex logic to the floor level separation code.
The uic_ref tag on German railway stations seems to consistently contain the very similar looking IBNR number rather than the UIC station number it is supposed to contain. I had hoped I had missed something here, but it looks like this will indeed need a mass modification fix after all.

Tools and workflows

I am still quite new to the OSM world, so hanging out with experienced OSM people is also always a good opportunity to learn about their tools and workflows.

While I had been using JOSM for online data editing before, I hadn’t realized you could also save modified extracts locally, and even in a format we can already load. That is a massive help for testing both code and data changes, and for creating dedicated test cases.

This wasn’t entirely obvious to discover in the Flatpak version of JOSM, as that has no host filesystem access and no proper file access portal integration. Switching on native file dialogs (Edit > Preferences > Display > Use native file choosers) and enabling host file system access in the Flatpak KCM works around that.

Infrastructure coordination

While our source code and data might be free, the infrastructure to distribute them costs real money. For map data in KDE applications, that means the raster tile servers and geocoding services from OSM, and the elevation and raw data tile servers from KDE.

OSM serves close to 20,000 raster tiles per second on average (about 70 of those for Marble or Marble-based applications), while KDE’s more special-purpose tiles are requested a bit less than 3 times a second.

At that scale it’s important to ensure all applications are well-behaved, i.e. that they properly identify themselves, minimize requests and use the most efficient way available to retrieve the data they need.

Most KDE applications using map data have received some form of compliance fixes for this in the past weeks. If your application didn’t, please get in touch.

But even the most efficient use doesn’t eliminate the need for powerful infrastructure, and this is something you can support with your donations to organizations like KDE and OSM.

Outlook

It’s not long until the next in-person meeting with OSM people: I’ll be speaking at the FOSSGIS conference in Berlin in just ten days from now, about how we use OSM data in KDE Itinerary (in German).