Automatic C++ comparison operators

C++ comparison operators are usually fairly straightforward to implement. Writing them by hand can, however, be quite error-prone when there are many member variables to consider. Missing a single one still compiles and mostly works fine, apart from some hard-to-debug corner cases, such as misbehaving or crashing algorithms and containers, or data loss. Can we do better?

Background

In KItinerary we have a number of classes representing schema.org types. Those are fairly straightforward value types with a number of properties, to be consumed both statically (by C++ code) and dynamically (by QML, Grantlee and JSON-LD de/serialization).

As implementing getters, setters, Q_PROPERTY statements, member variables, etc. by hand for a hundred or so properties would result in an unreasonable amount of boilerplate code, this is all done by a macro (example).

So far so good, but now we’d like to add comparison operators for those types. Specifically we needed operator== for optimizing away some memory allocations in case of non-changing write operations (similar to the common pattern of not emitting change signals in setter methods on non-changes). But the thoughts below are of course also applicable to any other comparison function, not just equality.
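To illustrate the kind of optimization meant here, the following is a minimal sketch, not KItinerary's actual code. It assumes a copy-on-write value type built on std::shared_ptr as a stand-in for Qt's implicit sharing; the names Place, PlaceData, setName and sharesDataWith are hypothetical. The setter compares first and only detaches (and thus allocates) when the value really changes.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical shared payload of a value type.
struct PlaceData {
    std::string name;
};

// Hypothetical copy-on-write value type; std::shared_ptr stands in for
// Qt's implicit sharing here.
class Place {
public:
    Place() : d(std::make_shared<PlaceData>()) {}

    const std::string &name() const { return d->name; }

    void setName(const std::string &value)
    {
        if (d->name == value) {
            return; // non-changing write: no detach, no allocation
        }
        detach();
        d->name = value;
    }

    // Only for demonstration: do both objects still share one payload?
    bool sharesDataWith(const Place &other) const { return d == other.d; }

private:
    // Copy the payload if it is shared with another instance.
    void detach()
    {
        if (d.use_count() > 1) {
            d = std::make_shared<PlaceData>(*d);
        }
    }

    std::shared_ptr<PlaceData> d;
};

inline void exampleUsage()
{
    Place a;
    a.setName("Berlin");
    Place b = a;              // shares the payload with a
    b.setName("Berlin");      // same value, nothing happens
    assert(b.sharesDataWith(a));
    b.setName("Paris");       // real change, b gets its own copy
    assert(!b.sharesDataWith(a));
}
```

For a single string the comparison is trivial; the interesting case is a property whose type is itself one of the generated value types, which is where operator== for those types becomes necessary.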

Implementation

Ideally we’d find an alternative to writing the comparison functions by hand that would either be impossible to break or would at least not fail silently (e.g. by producing compiler errors).

QMetaProperty

One idea would be to implement the comparison functions entirely generically by leveraging the property iteration support in QMetaObject. That is we iterate over all properties with the STORED flag and compare their values. This gets the job done, but has two drawbacks:

Comparison happens via QVariant, which means we have to register comparison operators for all types with the meta type system. That might actually be nice to have anyway, but it is limited to less-than and equal.
All property values are passed through QVariant, which in some situations can have a relevant performance impact, in particular when using types that are too large for inline storage inside QVariant (causing allocations).

More preprocessor magic

Another idea could be to use more elaborate preprocessor constructs that allow for iteration over all properties. The Boost.Preprocessor library has the building blocks for this. From experience in an old pre-C++11 project that attempted to catch SQL errors at compile time, this works, but it leads to neither a nice syntax nor easily maintainable code.
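To give an idea of what preprocessor-based property iteration looks like, here is a small sketch using the plain X-macro idiom rather than Boost.Preprocessor. The Ticket type, its properties and the macro names are made up for illustration. A single property list drives both the member declarations and the comparison, so a property cannot be forgotten in one place only.

```cpp
#include <cassert>
#include <string>

// Single source of truth: every property is listed exactly once as X(type, name).
// Note that types containing commas would need extra care with this idiom.
#define TICKET_PROPERTIES(X) \
    X(std::string, name) \
    X(int, seatNumber) \
    X(double, price)

struct Ticket {
// Expand the list into member declarations.
#define DECLARE_MEMBER(Type, Name) Type Name{};
    TICKET_PROPERTIES(DECLARE_MEMBER)
#undef DECLARE_MEMBER
};

inline bool operator==(const Ticket &lhs, const Ticket &rhs)
{
// Expand the same list into one comparison per property.
#define COMPARE_MEMBER(Type, Name) if (!(lhs.Name == rhs.Name)) { return false; }
    TICKET_PROPERTIES(COMPARE_MEMBER)
#undef COMPARE_MEMBER
    return true;
}

inline void exampleUsage()
{
    Ticket a;
    a.name = "Window seat";
    a.seatNumber = 12;
    a.price = 9.5;
    Ticket b = a;
    assert(a == b);
    b.price = 10.0;
    assert(!(a == b));
}
```

The price for this is that the class definition is no longer ordinary C++ but a macro list, which hints at the syntax and maintainability concerns mentioned above.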

A clever overloading trick from Verdigris

The solution I ended up using is inspired by Woboq’s Verdigris. The basic idea is that each property macro generates only a part of the comparison function: the comparison for its own property and a call to the comparison function for the previous property. The chaining is done by overloading on a template type that essentially describes a numerical value by inheritance. A small constexpr helper function allows us to determine the index of a property at compile time. Olivier describes this in more detail in his blog post about Verdigris implementation details.

The resulting code for KItinerary can be found here. It’s worth noting that while this might look inefficient due to the many function calls, this is all inline code in a single translation unit, so in an optimized build it ends up looking essentially like the hand-written comparison function would.
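The following is a stripped-down sketch of the overload chaining idea in plain C++11, not the KItinerary or Verdigris code. All names (detail::num, detail::tag, property_counter, property_equals, the MAKE_* macros and the GeoPoint type) are assumptions made for this example, and unlike the real macros these do not generate members or accessors. Each MAKE_PROPERTY line looks up the highest index registered so far, registers the next one, and adds one link to the comparison chain.

```cpp
#include <cassert>
#include <string>

namespace detail {
// num<N> derives from num<N - 1>. Passing num<>{} (the largest index) to an
// overload set picks the overload with the highest index declared so far.
// The default of 32 limits this sketch to 32 properties per class.
template <int N = 32> struct num : num<N - 1> { static constexpr int value = N; };
template <> struct num<0> { static constexpr int value = 0; };

// Tag type keeping the counters of different classes apart.
template <typename T> struct tag {};
}

// Start of the chain: index 0 and a comparison that is always true.
#define MAKE_CLASS(Class) \
    constexpr detail::num<0> property_counter(detail::num<0>, detail::tag<Class>) { return {}; } \
    inline bool property_equals(detail::num<0>, const Class &, const Class &) { return true; }

// One link of the chain: compare this property, then delegate to the previous index.
#define MAKE_PROPERTY(Class, Name) \
    constexpr int Class##_##Name##_index = \
        decltype(property_counter(detail::num<>{}, detail::tag<Class>{}))::value + 1; \
    constexpr detail::num<Class##_##Name##_index> \
    property_counter(detail::num<Class##_##Name##_index>, detail::tag<Class>) { return {}; } \
    inline bool property_equals(detail::num<Class##_##Name##_index>, const Class &lhs, const Class &rhs) \
    { \
        return lhs.Name == rhs.Name \
            && property_equals(detail::num<Class##_##Name##_index - 1>{}, lhs, rhs); \
    }

// Entry point: start the chain at the last registered property.
#define MAKE_OPERATOR(Class) \
    inline bool operator==(const Class &lhs, const Class &rhs) \
    { \
        return property_equals(decltype(property_counter(detail::num<>{}, detail::tag<Class>{})){}, lhs, rhs); \
    }

struct GeoPoint {
    double latitude = 0.0;
    double longitude = 0.0;
    std::string name;
};

MAKE_CLASS(GeoPoint)
MAKE_PROPERTY(GeoPoint, latitude)
MAKE_PROPERTY(GeoPoint, longitude)
MAKE_PROPERTY(GeoPoint, name)
MAKE_OPERATOR(GeoPoint)

static_assert(GeoPoint_name_index == 3, "properties are numbered in registration order");

inline void exampleUsage()
{
    GeoPoint a;
    a.latitude = 52.5;
    a.longitude = 13.4;
    a.name = "Berlin";
    GeoPoint b = a;
    assert(a == b);
    b.name = "Potsdam";
    assert(!(a == b));
}
```

The trick relies on overload resolution preferring the conversion to the nearest base class, and on the fact that in non-template code only overloads declared before the point of use are considered. That is why the order of the macro invocations determines the property indices.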

Language support?

Would it make sense to have C++ support something like bool operator==(const T&) const = default, that is, let the compiler generate the implementation for us, as it can for a number of other member functions? Proposals for such a language extension exist, and C++20 has since adopted defaulted comparison operators.

There’s a bigger conceptual problem though, one that you run into immediately once the comparison operator is implemented: the semantics of comparing things. Here are a few examples:

Floating point numbers: how close together counts as “equal” depends on what those numbers represent. In my case of geo coordinates, anything within a few meters is certainly close enough. And there is the little detail that NAN does not compare equal to itself, which isn’t what KItinerary needed either, as NAN is its indicator for “value not set”.
QString: here the default equal comparison doesn’t distinguish between null and empty strings. That’s probably very often what you’d expect, unless you put special semantics on that distinction, such as in the NAN case above.
Date/time values: two QDateTime instances compare equal if they refer to the same point in time, not if they represent exactly the same information (e.g. time specified with a UTC offset vs. a full IANA timezone). For KItinerary, timezones are crucial information, so the default semantics don’t cut it there.
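One way to get this kind of per-type control is to route every property comparison through an overloadable helper instead of calling operator== directly. The sketch below is an assumption about how that could look, not KItinerary's code; the name propertyEquals is made up. It handles the NAN case from the list above by treating two unset (NaN) values as equal, and falls back to operator== for all other types.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <string>

// Default: use the type's own operator==.
template <typename T>
inline bool propertyEquals(const T &lhs, const T &rhs)
{
    return lhs == rhs;
}

// Overload for double, assuming NaN means "value not set":
// two unset values are equal, an unset and a set value are not.
inline bool propertyEquals(double lhs, double rhs)
{
    if (std::isnan(lhs) || std::isnan(rhs)) {
        return std::isnan(lhs) && std::isnan(rhs);
    }
    return lhs == rhs;
}

inline void exampleUsage()
{
    const double unset = std::numeric_limits<double>::quiet_NaN();
    assert(!(unset == unset));            // built-in comparison
    assert(propertyEquals(unset, unset)); // "not set" equals "not set"
    assert(!propertyEquals(unset, 1.0));
    assert(propertyEquals(std::string("a"), std::string("a")));
}
```

Further overloads could be added in the same way for other types with special semantics, for instance a tolerance-based comparison for coordinates or a stricter comparison for date/time values.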

An all-or-nothing approach for compiler-generated comparison operator implementations means it’s not a viable option in case one needs more control over the semantics. That of course does not mean it is useless, but it does mean the alternative implementation techniques remain valid either way.