Value Struct - wrapping values for strong typing

Alessandro Balzano

2019-11-15 20:31

One of my favorite C++ tricks is a simple compile-time device: value structs. A value struct is a struct that wraps a single field whose type is a primitive one (int, float, ...).

Everything seems the same...

Let's start with a very simple example: converting between radians and degrees.

#include <iostream>

float to_rad(float deg) {
    return deg * 3.14 / 180;
}

float to_deg(float rad) {
    return rad * 180 / 3.14;
}

You may wonder: what's wrong with this code? Nothing, the code is fine. The problem is... nothing stops you from passing a value that represents degrees to a function that expects radians.

int main() {

    float half_pi = 1.57;

    // `half_pi` is 90 degrees in radians
    float ninety_deg = to_deg(half_pi);
    std::cout << ninety_deg << std::endl;

    // 90... degrees or radians? the compiler doesn't care!
    float what = to_deg(ninety_deg);
    std::cout << what << std::endl;

    return 0;
}

Unfortunately, this issue is more widespread than you think, and I've seen it happen several times.

A variant of this problem shows up when a function fun(int caller_id, int message_id) has two parameters with the same type, and someone swaps them (fun(int message_id, int caller_id)). If the client code is not updated to reflect this change, the program may not work correctly anymore. The compiler, however, will not warn about this change!
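As a rough sketch of where this is heading, each of the two ids could get its own wrapper type. The names CallerId and MessageId and the body of fun are made up for illustration; only the parameter names come from the example above. constexpr is used just so the checks can run at compile time.

```cpp
#include <type_traits>

// Hypothetical wrapper types: the names are invented for this sketch.
struct CallerId  { int value; };
struct MessageId { int value; };

// Stand-in for fun(int caller_id, int message_id). The body is arbitrary:
// it packs both ids into one number so the result is easy to inspect.
constexpr int fun(CallerId caller, MessageId message) {
    return caller.value * 1000 + message.value;
}

// The wrappers are unrelated types: neither converts to the other, so a
// call with swapped arguments, fun(MessageId{42}, CallerId{7}), is rejected.
static_assert(!std::is_convertible<MessageId, CallerId>::value,
              "MessageId must not convert to CallerId");
static_assert(!std::is_convertible<CallerId, MessageId>::value,
              "CallerId must not convert to MessageId");

// A call in the intended order compiles and yields 7 * 1000 + 42.
static_assert(fun(CallerId{7}, MessageId{42}) == 7042, "ids kept in order");
```

If someone later changes the signature to fun(MessageId, CallerId), every call site that was not updated stops compiling instead of silently misbehaving.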

Let's go back to the original problem: what if to_deg and to_rad accept and return a special type, instead of a raw float?

Wrapping values

Let's create two ad-hoc structures for our Degrees and Radians. Each structure wraps a single float, which can be accessed directly.

struct Degrees {
    float value;
    Degrees(float v): value(v){}
};

struct Radians {
    float value;
    Radians(float v): value(v){}
};

These two structs look exactly the same, but they are different types, and the compiler will treat them as such. No mixing allowed, this time!

Radians to_rad(Degrees d){
    return d.value * 3.14 / 180;
}

Degrees to_deg(Radians r) {
    return r.value * 180 / 3.14;
}

int main() {
    Radians half_pi = Radians(1.57);

    Degrees ninety_degrees = to_deg(half_pi);
    std::cout << ninety_degrees.value << std::endl;

    // Degrees what = to_deg(ninety_degrees);       // COMPILE ERROR!
    // std::cout << what.value << std::endl;

    return 0;
}
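One caveat with the constructors above: they are not marked explicit, so a raw float is still converted silently and a call like to_deg(90.0f) compiles. A possible variant, sketched here as a suggestion rather than as the code from the gist, marks the constructors explicit. constexpr is optional and is used only so the checks can run at compile time.

```cpp
#include <type_traits>

// Variant of the wrappers with explicit constructors (an assumption of this
// sketch, not the original code): a raw float is never wrapped silently.
struct Degrees {
    float value;
    constexpr explicit Degrees(float v) : value(v) {}
};

struct Radians {
    float value;
    constexpr explicit Radians(float v) : value(v) {}
};

constexpr Degrees to_deg(Radians r) {
    // The result has to be wrapped by hand: with explicit constructors,
    // `return r.value * 180.0f / 3.14f;` alone would not compile.
    return Degrees(r.value * 180.0f / 3.14f);
}

// A raw float can still be wrapped on purpose...
static_assert(std::is_constructible<Radians, float>::value,
              "Radians(1.57f) is allowed");
// ...but it is not converted implicitly, so to_deg(1.57f) is rejected.
static_assert(!std::is_convertible<float, Radians>::value,
              "no silent float -> Radians conversion");
// As before, Degrees cannot be passed where Radians is expected.
static_assert(!std::is_convertible<Degrees, Radians>::value,
              "no Degrees -> Radians conversion");
```

The price is a little more typing at each call site (to_deg(Radians(1.57f)) instead of to_deg(1.57f)), in exchange for every wrapping step being visible in the code.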

Under the hood, nothing has changed

You may now think: "wait, are we going to create a lot of temporary objects? It's going to be slow!". Well, with optimizations enabled, gcc and clang see through this pattern and treat Degrees and Radians as the float they wrap - as if those structures were never defined.

Let's compare the assembly code generated by gcc 9.1.0 (compile flags: -O2). In this listing, we are going to examine the difference between the first, float-based version (left side) and the struct-based version (right side).

Note: the lines following the labels .LFE1544 and .LFE1551 declare to_deg(float) and to_deg(Radians). The labels .LFB1545 and .LFB1552 mark the start of the code that is executed - the "real code".

.LFE1544:                                       |       .LFE1551:
        .size   _Z6to_radf, .-_Z6to_radf        |               .size   _Z6to_rad7Degrees, .-_Z6to_rad7Degrees
        .p2align 4                                              .p2align 4
        .globl  _Z6to_degf                      |               .globl  _Z6to_deg7Radians
        .type   _Z6to_degf, @function           |               .type   _Z6to_deg7Radians, @function
_Z6to_degf:                                     |       _Z6to_deg7Radians:
.LFB1545:                                       |       .LFB1552:
        .cfi_startproc                                          .cfi_startproc
        mulss   .LC2(%rip), %xmm0                               mulss   .LC2(%rip), %xmm0
        cvtss2sd        %xmm0, %xmm0                            cvtss2sd        %xmm0, %xmm0
        divsd   .LC0(%rip), %xmm0                               divsd   .LC0(%rip), %xmm0
        cvtsd2ss        %xmm0, %xmm0                            cvtsd2ss        %xmm0, %xmm0
        ret                                                     ret
        .cfi_endproc                                            .cfi_endproc

As we can see, the difference is just in the description of the function: the real code is the same. Both versions perform the same multiplications and divisions, and read the same registers.
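Part of the reason the two listings match can be checked from C++ itself. The sketch below asserts a few layout properties of the Degrees wrapper. These hold on mainstream compilers such as gcc and clang, but treat the size and alignment checks as an assumption about the platform rather than a promise of the language.

```cpp
#include <type_traits>

// Same wrapper as in the article.
struct Degrees {
    float value;
    Degrees(float v) : value(v) {}
};

// The wrapper adds no storage or extra alignment on top of the float.
static_assert(sizeof(Degrees) == sizeof(float),
              "Degrees should be as large as a float");
static_assert(alignof(Degrees) == alignof(float),
              "Degrees should be aligned like a float");

// Copying a Degrees is a plain copy of its bytes: the user-written
// constructor takes a float, so the copy operations stay trivial.
static_assert(std::is_trivially_copyable<Degrees>::value,
              "Degrees should be trivially copyable");
static_assert(std::is_standard_layout<Degrees>::value,
              "Degrees should have standard layout");
```

If one of these checks fails on some toolchain, the wrapper may no longer be free there, and the assembly is worth inspecting again.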

Conclusion

In this blog post, we looked at value structs, which are structures that wrap a single field, and at how we can use them to type-check our code and avoid mixing raw values. We also looked at the assembly code, and noticed that the wrappers we introduced do not affect the performance of our code: they are compiled as if we had used the raw values.

If you feel that this article helped you, feel free to share it! If you have questions, ask on Twitter, or offer me a coffee to let me keep writing these notes!

References:

https://gist.github.com/alfateam123/042d37c2f944089e424bc01c7c569b1b - the source code used in this article.
https://www.youtube.com/watch?v=1fwbG5TyI18 - CppCon 2018: Arno Lepisk "Avoiding Disasters with Strongly Typed C++"