Automatic C++ comparison operators
C++ comparison operators are usually fairly straightforward to implement. Writing them by hand can, however, be quite error-prone if there are many member variables to consider. Missing a single one of them will still compile and mostly work fine, apart from some hard-to-debug corner cases, such as misbehaving or crashing algorithms and containers, or data loss. Can we do better?
Background
In KItinerary we have a number of classes representing schema.org types. Those are fairly straightforward value types with a number of properties, to be consumed both statically (by C++ code) and dynamically (by QML, Grantlee and JSON-LD de/serialization).
As implementing getters, setters, Q_PROPERTY statements, member variables, etc. for about a hundred or so properties by hand would result in an unreasonable amount of boilerplate code, this is all done by a macro (example).
So far so good, but now we’d like to add comparison operators for those types. Specifically we needed operator== for optimizing away some memory allocations in case of non-changing write operations (similar to the common pattern of not emitting change signals in setter methods on non-changes). But the thoughts below are of course also applicable to any other comparison function, not just equality.
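The setter pattern mentioned above can be sketched in plain C++ as follows. This is only an illustration: the Label class and its onTextChanged callback are hypothetical stand-ins for a Qt type with a change signal, not KItinerary code.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical value holder, not a KItinerary class.
class Label {
public:
    // Callback standing in for a Qt change signal (an assumption of this sketch).
    std::function<void()> onTextChanged;

    const std::string &text() const { return m_text; }

    void setText(std::string text)
    {
        // A non-changing write returns early: no assignment, no notification.
        if (m_text == text) {
            return;
        }
        m_text = std::move(text);
        if (onTextChanged) {
            onTextChanged();
        }
    }

private:
    std::string m_text;
};
```

The early return is only as good as the equality comparison it relies on, which is why a correct operator== matters for this kind of optimization.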
Implementation
Ideally we’d find an alternative to writing the comparison functions by hand that would either be impossible to break or would at least not fail silently (e.g. by producing compiler errors).
QMetaProperty
One idea would be to implement the comparison functions entirely generically by leveraging the property iteration support in QMetaObject. That is, we iterate over all properties with the STORED flag and compare their values. This gets the job done, but has two drawbacks:
- Comparison happens via QVariant, which means we have to register comparison operators for all types with the meta type system. That might actually be nice to have anyway, but it is limited to less-than and equal.
- All property values are passed through QVariant, which in some situations can have a relevant performance impact, in particular when using types that are too large for inline storage inside QVariant (causing allocations).
More preprocessor magic
Another idea could be to use more elaborate preprocessor constructs that allow for iteration over all properties. The Boost.Preprocessor library has the building blocks for this. From experience in an old pre-C++11 project attempting to catch SQL errors at compile time, this works but leads to neither a nice syntax nor easily maintainable code.
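To give an idea of what preprocessor-driven iteration over properties looks like, here is a minimal sketch using a plain X-macro rather than Boost.Preprocessor. It is a simpler stand-in for the approach described above, and the Airport type and all macro names are made up for illustration.

```cpp
#include <string>

// Hypothetical property list: one X(Type, Name) entry per property.
#define AIRPORT_PROPERTIES(X) \
    X(std::string, name) \
    X(double, latitude) \
    X(double, longitude)

struct Airport {
    // Expand the list once to declare the member variables.
#define DECLARE_MEMBER(Type, Name) Type Name{};
    AIRPORT_PROPERTIES(DECLARE_MEMBER)
#undef DECLARE_MEMBER

    // Expand it a second time to compare them, so both stay in sync.
    bool operator==(const Airport &other) const
    {
#define COMPARE_MEMBER(Type, Name) if (!(Name == other.Name)) { return false; }
        AIRPORT_PROPERTIES(COMPARE_MEMBER)
#undef COMPARE_MEMBER
        return true;
    }
};
```

A property added to the list is picked up by the comparison automatically, but all properties now have to be declared in one central list, and the helper macros do not make the code easier to read.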
A clever overloading trick from Verdigris
The solution I ended up using is inspired by Woboq’s Verdigris. The basic idea is that each property macro generates only one part of the comparison function: the comparison for its own property and a call to the comparison function for the previous property. The chaining is done by overloading on a template type that essentially encodes a numerical value by inheritance. A little constexpr helper function allows us to determine the index of a property at compile time. Olivier describes this in more detail in his blog post about Verdigris implementation details.
The resulting code for KItinerary can be found here. It’s worth noting that while this might look inefficient due to the many function calls, this is all inline code in a single translation unit, so in an optimized build this ends up looking essentially like the hand-written comparison function would.
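The chaining can be illustrated with a stripped-down, Qt-free sketch. This is not the KItinerary code. The names num, max_num, SKETCH_CLASS, SKETCH_PROPERTY and the Place type are all hypothetical, and the limit of 16 properties is arbitrary. Like Verdigris, it relies on the compiler seeing only the overloads declared so far when a name is looked up inside the class body.

```cpp
#include <string>

namespace detail {
// num<N> derives from num<N - 1>, so a num<16> argument converts to any lower
// index, and overload resolution prefers the highest index declared so far.
template <int N> struct num : num<N - 1> {
    static constexpr int value = N;
};
template <> struct num<0> {
    static constexpr int value = 0;
};
// Upper bound on the number of properties per class (arbitrary for this sketch).
using max_num = num<16>;
}

// Class preamble: operator== starts the chain at the last property, and the
// num<0> overload of equals() terminates it.
#define SKETCH_CLASS(Class) \
public: \
    bool operator==(const Class &other) const \
    { \
        return equals(detail::max_num{}, *this, other); \
    } \
private: \
    static detail::num<0> lastProperty(detail::num<0>); \
    static bool equals(detail::num<0>, const Class &, const Class &) { return true; }

// Per property: take the next free index, compare this member, then call the
// overload of equals() belonging to the previous index.
#define SKETCH_PROPERTY(Class, Type, Name) \
public: \
    Type Name{}; \
private: \
    using Name##_index = detail::num<decltype(lastProperty(detail::max_num{}))::value + 1>; \
    static Name##_index lastProperty(Name##_index); \
    static bool equals(Name##_index, const Class &lhs, const Class &rhs) \
    { \
        return lhs.Name == rhs.Name \
            && equals(detail::num<Name##_index::value - 1>{}, lhs, rhs); \
    }

class Place {
    SKETCH_CLASS(Place)
    SKETCH_PROPERTY(Place, std::string, name)
    SKETCH_PROPERTY(Place, double, latitude)
    SKETCH_PROPERTY(Place, double, longitude)
};
```

Adding another SKETCH_PROPERTY line extends the comparison without touching operator==, so a property cannot be forgotten. A real implementation would compare each member through a customizable helper instead of a plain ==.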
Language support?
Would it make sense to have C++ support something like bool operator==(const T&) const = default, that is, let the compiler generate the implementation for us, as it can for a number of other member functions? Proposals for such a language extension exist.
There’s a bigger conceptual problem though, one that you run into immediately once the comparison operator is implemented: the semantics of comparing things. Here are a few examples:
- Floating point numbers: how close together counts as “equal” depends on what those numbers represent; in my case of geo coordinates, anything within a few meters is certainly more than enough. And there is the little detail that NAN does not compare equal to itself, which isn’t what KItinerary needed either, as NAN is its indicator for “value not set”.
- QString: here the default equal comparison doesn’t distinguish between null and empty strings. That’s probably very often what you’d expect, unless you put special semantics on that distinction, such as in the NAN case above.
- Date/time values: two QDateTime instances compare equal if they refer to the same point in time, not if they represent exactly the same information (e.g. time specified with a UTC offset vs a full IANA timezone). For KItinerary timezones are crucial information, so the default semantics don’t cut it there.
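As an illustration of the floating point case, a comparison helper with the semantics described above could look like the sketch below. The function name coordinateEquals and the tolerance value are assumptions made for this example, not what KItinerary actually uses.

```cpp
#include <cmath>

// Hypothetical helper: equality for coordinates where NaN means "value not set".
// The default tolerance is an arbitrary assumption (about one meter of latitude).
inline bool coordinateEquals(double lhs, double rhs, double tolerance = 1e-5)
{
    const bool lhsUnset = std::isnan(lhs);
    const bool rhsUnset = std::isnan(rhs);
    if (lhsUnset || rhsUnset) {
        // Two unset values are equal; an unset value never equals a set one.
        return lhsUnset && rhsUnset;
    }
    return std::fabs(lhs - rhs) <= tolerance;
}
```

A generated member-wise == has no way of knowing about either rule, which is the core of the problem with an all-or-nothing default.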
An all-or-nothing approach for compiler-generated comparison operator implementations means it’s not a viable option in case one needs more control over the semantics. That of course does not mean it is useless, but it does mean the alternative implementation techniques remain valid either way.