Nürnberg Sprint and KDE Itinerary Browser Integration

Getting everyone interested or involved in a specific area into a room for a few days with no distractions is a very effective way to get things done, which is why we have been doing sprints in KDE for many years. Another possible outcome, however, is that we end up with some completely unexpected results as well. Here is one such example.

Sprint

Last weekend we had the combined KDE Connect, Streamlined Onboarding and KWin sprints in Nürnberg. SUSE kindly provided us with rooms in their office (as well as coffee and the occasional pizza), and KDE e.V. had made sure we didn’t have to sleep on a park bench somewhere.

I really like the recent trend of “sprint pooling”, that is, combining a few more or less independent topics at the same time and location. This not only reduces travel overhead, it also helps to avoid siloed teams and fosters cross-team collaboration. While it’s not a full replacement for doing this at a larger scale at the Randa Meetings, it’s a massive improvement over isolated sprints.

It also made me attend three sprints I probably wouldn’t have attended individually, as I’m not deeply involved enough in any of the topics. That turned out to be very much worth it, however.

KDE Itinerary Browser Integration

While there were many important discussions, achievements and decisions, as other reports on Planet KDE show, I’d like to talk about one in particular here: the experiments on extracting data relevant for KDE Itinerary via a browser extension. You might notice that this isn’t in scope for any of the official sprint topics, but that’s exactly what happens when you bring together people from different areas, in this case Kai from Plasma Browser Integration and myself from KDE Itinerary.

So far KDE Itinerary gathers most of its data in the email client. Kai’s idea was to also consider the browser for this. That makes sense, as it’s where you do most of the actual booking operations. However, unlike emails, websites can be very dynamic, which makes them hard to capture and analyze, and therefore hard to judge how viable the idea actually is.

Having someone who knows how to develop browser extensions and someone who knows the structured data patterns we need to look for sit next to each other for a day enables quite some progress. As a start we (and by we I mean Kai) wrote a small browser extension that looks for the schema.org and IATA BCBP patterns we can extract generically. If those were found in sufficient numbers, we would actually have a viable way to get relevant data, without needing a completely unscalable web scraping approach.
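To give an idea of what looking for schema.org patterns can mean in practice, here is a minimal sketch in Python. It is not the code of the extension, which runs inside the browser; it only checks a saved page for JSON-LD annotations, one of the ways schema.org data can be embedded in a website. The list of types, the class name and the function names are assumptions made for this illustration, and neither microdata annotations nor IATA BCBP codes are covered.

```python
import json
from html.parser import HTMLParser

# Hypothetical selection of schema.org types that could matter for travel data.
RELEVANT_TYPES = {
    "Hotel", "LodgingReservation", "FlightReservation", "TrainReservation",
    "Restaurant", "FoodEstablishmentReservation", "Event", "EventReservation",
}


class JsonLdCollector(HTMLParser):
    """Collects the raw text of all <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append("".join(self._buffer))


def iter_types(node):
    """Yields every @type value found anywhere in a parsed JSON-LD structure."""
    if isinstance(node, dict):
        declared = node.get("@type")
        if isinstance(declared, str):
            yield declared
        elif isinstance(declared, list):
            yield from (t for t in declared if isinstance(t, str))
        for value in node.values():
            yield from iter_types(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_types(item)


def find_relevant_types(html):
    """Returns the set of relevant schema.org types annotated in the page."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    found = set()
    for block in collector.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # ignore malformed annotations instead of failing
        found.update(t for t in iter_types(data) if t in RELEVANT_TYPES)
    return found
```

Calling find_relevant_types() on the HTML of a page returns the matching types, or an empty set if the page carries no usable annotations. Counting how often that set is non-empty across the sites you visit is roughly the kind of number we were after.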

My initial expectation was that we’d need to run this extension on a couple of machines for a few weeks until we had some initial numbers. It turned out I was very, very wrong. The extension started to report results almost immediately. It looks like the majority of hotel chain websites and hotel booking sites have schema.org annotations, as do many restaurant and event booking sites and a relevant number of individual hotel and restaurant sites. So this approach is definitely worth pursuing.

Of course the actual development work only just starts now, and there is still a lot of work ahead of us to get this to a point where it provides value, but we have come up with an approach and validated it in a tiny fraction of the time it would have taken any one of us individually.

Contribute

If you find the idea of a Free Software digital travel assistant designed for privacy appealing, but so far have shied away from helping out because you are more familiar with web technologies than C++, the browser integration is a great way to get in :)

Another way to help is by enabling KDE e.V. (and its sister organizations like the Krita Foundation) to support such activities: financially, by donating or by becoming a supporting member or a corporate sponsor, or logistically, for example by providing a suitable venue for such events.