Getting OSM Indoor Map Data for KDE Itinerary

Last week I wrote about train station and airport maps for KDE Itinerary. One important challenge for deploying this is how to get the necessary OpenStreetMap data to our users; a prototype that requires a local OSM database doesn’t help with that. There are currently two likely contenders, explained below.

Determining Relevant Areas

Before we obtain the map data we have to solve another problem first: which area do we actually want to display? In typical map applications the area that is presented is usually not constrained; you can scroll in any direction for as long as you want. That’s not what we need here though, as we are only interested in a single (large) building.

Constraining the map display to that area has a number of advantages, such as having well-defined memory bounds. Even a big station mapped in great detail fits in a few hundred kB in OSM’s o5m binary format. That avoids the need for any kind of tile or level-of-detail management you’d usually need at global scope.

It also means working with “raw” OSM data is feasible (which we need to enable all the features we want): when the area is sufficiently constrained, we don’t need to reduce the level of detail of the data.

For now we have a reasonably well working heuristic that takes care of this.

Marble Vector Tiles

Since the data for the entire world is about 60GB, we obviously need something that breaks this down into much smaller chunks. Fortunately, one such system already exists, and even within KDE’s infrastructure: Marble’s OSM vector tile server.

These tiles are provided in OSM’s o5m binary format and don’t contain any application-specific pre-filtering, making them extremely versatile and therefore perfect for our use-case. On the highest available zoom level (which contains 2¹⁷ subdivisions per dimension), we typically need 9-12 tiles for a large station, so this also provides a reasonable trade-off between overhead and download volume.
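To get a feeling for how a station's bounding box maps to tiles, here is a minimal Python sketch. It assumes the tiles follow the common "slippy map" (Web Mercator) x/y numbering, which is an assumption on my side and not something verified against the Marble tile server here. The function names are made up for illustration.

```python
import math

ZOOM = 17  # highest zoom level mentioned above: 2**17 tiles per dimension


def tile_for(lat_deg, lon_deg, zoom=ZOOM):
    """Return the (x, y) tile index containing a coordinate.

    Assumes the usual slippy-map tiling scheme (Web Mercator projection).
    """
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    # Clamp to the valid range, e.g. for a longitude of exactly 180 degrees.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(min_lat, min_lon, max_lat, max_lon, zoom=ZOOM):
    """List all tiles intersecting a bounding box."""
    x0, y1 = tile_for(min_lat, min_lon, zoom)  # south-west corner: larger y
    x1, y0 = tile_for(max_lat, max_lon, zoom)  # north-east corner: smaller y
    return [(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)]
```

Calling `tiles_for_bbox()` with the bounding box of a building gives the list of tiles that would have to be downloaded for it; how many that is depends on the size of the box and on how it happens to align with the tile grid.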

There are unfortunately two major challenges with this.

Automatic Updates

The currently available tiles are slightly outdated, as it seems we are lacking an automatic and continuous update process. I suspect that a full re-generation of the entire world at a high frequency is going to be too costly, so this will probably need some development work to consume OSM’s differential update files.

Doing this would however not only help us but also all other consumers of those files, such as Marble itself.

Geometry Reassembly

A side-effect of using tiled data is that the geometry in there is split along the tile boundaries. When used as-is, that leads to ugly and confusing visual effects as well as duplicated text labels. To some extent we are by now able to re-assemble the split geometry, but it’s still far from perfect, and it needs more heuristics than one would want there.
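To illustrate the simplest part of that problem, the following Python sketch joins line fragments of one and the same OSM way that were cut at tile boundaries. This is not the code used in KDE Itinerary, just a toy version under strong assumptions: the fragments are already grouped by way, and the cut points have exactly identical coordinates on both sides of the tile boundary. Polygons, which are the harder case, are not covered. The function name is hypothetical.

```python
def merge_fragments(fragments):
    """Greedily join polylines that share an end point.

    fragments: list of point lists [(lon, lat), ...], all belonging to
    the same OSM way. Returns a list of merged polylines.
    """
    remaining = [list(f) for f in fragments if f]
    merged = []
    while remaining:
        line = remaining.pop()
        extended = True
        while extended:
            extended = False
            for i, other in enumerate(remaining):
                if line[-1] == other[0]:
                    line = line + other[1:]          # append as-is
                elif line[-1] == other[-1]:
                    line = line + other[-2::-1]      # append reversed
                elif line[0] == other[-1]:
                    line = other[:-1] + line         # prepend as-is
                elif line[0] == other[0]:
                    line = other[:0:-1] + line       # prepend reversed
                else:
                    continue
                del remaining[i]
                extended = True
                break
        merged.append(line)
    return merged
```

Fragments that don't connect to anything else are returned unchanged, so the result is never worse than the input, it just might still contain splits.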

You can observe similar issues in Marble itself when using its vector OSM map. It might be possible to aid this with some changes in the tile generator, which would then also benefit all other consumers in re-assembling the geometry.

Dedicated O5M Files

Should the above approach turn out to not be feasible or to take too long to implement and deploy, what could we do instead? The o5m file format works great: it’s compact and nevertheless allows mmap’ed zero-copy use of string data, so that’s something to keep. But instead of generating hundreds of millions of tiles, we could just generate individual files per airport/train station. That’s in the tens of thousands, several orders of magnitude less.

This would also need some development work, as we need a way to determine the bounding boxes for all relevant areas, and then an efficient way to cut out those areas from the full dataset. Doing this for a single area with the OSM command line tools takes about 20-30 minutes; doing this for multiple areas in one go would presumably scale significantly better.
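The reason for expecting better scaling is that the expensive part is reading the full dataset, and that only has to happen once if all areas are checked during the same pass. A minimal Python sketch of that idea, looking at nodes only and with a made-up input format (this is not how any of the existing OSM tools work internally):

```python
from collections import defaultdict


def assign_nodes(nodes, areas):
    """Sort nodes into bounding boxes in a single pass over the data.

    nodes: iterable of (node_id, lat, lon) tuples.
    areas: dict mapping a name to (min_lat, min_lon, max_lat, max_lon).
    Returns a dict mapping area names to lists of node ids.
    """
    result = defaultdict(list)
    for node_id, lat, lon in nodes:
        # Linear scan over all areas; fine for a sketch, too slow for
        # tens of thousands of areas without a spatial index.
        for name, (min_lat, min_lon, max_lat, max_lon) in areas.items():
            if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon:
                result[name].append(node_id)
    return dict(result)
```

A real implementation would also have to handle ways and relations that reference the selected nodes, and would need something smarter than the inner loop over all areas.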

The big downside of this is that it limits us to a fixed set of locations, and we end up with a special-purpose solution just for KDE Itinerary. So this is only the backup plan for now.

Outlook

If you have ideas for features or use-cases for this, or want to help, check out the corresponding workboard on GitLab. I’ll try to write up some details about the declarative styling system next.