__attribute__((packed)) on Windows is ignored (with MinGW)!

tl;dr: by default, MinGW on win32 behaves as if it ignores __attribute__((packed)); you need to add -mno-ms-bitfields to make it work as expected.

Some days ago, I was tasked with porting a simple networked application to Windows 7. There is a small issue, though: to serialize its messages, the code fills a packed structure, which is then memcpy-ed into a buffer and thrown into the TCP stack. Don't blame me, I trusted bad advice; I was young and stupid!

We may get away with it (it works, somehow!). The big problem is that everything falls apart if the structure layout is not what we expect it to be! In the best case, the application segfaults because it tried to access memory it should not access, or some internal check fires because the values are so wrong that they are simply out of range.

Let's say I'm passing this packed structure around...

typedef struct _mypackedstruct {
  uint32_t magic_number;
  uint8_t  version;
  uint32_t interesting_property;
  uint64_t creation_time_sec;
  uint64_t creation_time_usec;
  uint32_t first_info_no;
  uint32_t second_info_no;
} __attribute__((packed)) MyPackedStructure;

... and the code receives and parses the serialized structure, as in the following pseudo-C++ example

// a very simplified sample of the code that highlighted the problem
MyPackedStructure destination;
char* buffer = new char[sizeof(destination)];

// read some data from a previously created socket
// and push them into a buffer
socket.recv_message(buffer);

// push the unparsed data into destination
// to perform an automatic "deserialization".
// Don't try this at home. Please, use serialization libraries.
memcpy(&destination, buffer, sizeof(destination));

// perform validity check on the structure's content.
assert(is_packet_ok(destination));

On various Linux distros, everything works fine. Once I built the project on Windows, the is_packet_ok assert fired. Why?

After looking at the code and trying to understand the problem, I had the idea of checking the size of the structure. On Windows (using MinGW), sizeof(MyPackedStructure) returns 40, while on Linux (using three different compilers: gcc 4.6.3, gcc 4.9.2, clang/llvm 3.8) the same expression returns 33. Seven bytes sure make a big difference! I want you to focus your attention on that 40. It's easy to see that 40 mod 8 = 0: the structure was aligned to 8 bytes. Why was the packing attribute ignored?

A simple search on the Wired showed me that this strange behavior has been known since April 2012, but the ticket is still marked as new. In the comments, you'll find a workaround: set -mno-ms-bitfields.

What is this -mno-ms-bitfields option? The "6.36.5 i386 Variable Attributes" section of the GCC manual says:

"If packed is used on a structure, or if bit-fields are used, it may be that the Microsoft ABI lays out the structure differently than the way GCC normally does. Particularly when moving packed data between functions compiled with GCC and the native Microsoft compiler (either via function call or as data in a file), it may be necessary to access either format."

[snip]

"2. Every data object has an alignment requirement. The alignment requirement for all data except structures, unions, and arrays is either the size of the object or the current packing size (specified with either the aligned attribute or the pack pragma), whichever is less. For structures, unions, and arrays, the alignment requirement is the largest alignment requirement of its members. Every object is allocated an offset so that: offset % alignment_requirement == 0"

What does it mean? It means that, by default, MinGW uses the Microsoft algorithm to compute the layout of packed structures, and that algorithm doesn't work the way we'd like.

The expected behavior (as described in the same link!) is:

packed
The packed attribute specifies that a variable or structure field should have the smallest possible alignment—one byte for a variable, and one bit for a field, unless you specify a larger value with the aligned attribute.

To explain it, I will write a simple program that creates a packed structure and initializes it with known values. The program will be later inspected with GDB in order to check the actual memory layout of the structure.

// file: main.cpp
#include <cstdint>
#include <iostream>

// MyPackedStructure is the same structure we defined at the start of the blog post
int main(){
  // print the size of the structure
  std::cout << "sizeof(MyPackedStructure) = " << sizeof(MyPackedStructure) << std::endl;

  // prepare a structure variable...
  MyPackedStructure obj;
  obj.magic_number         = 0xa5a61ff5;
  obj.version              = 2;
  obj.interesting_property = 0x33;
  obj.creation_time_sec    = 1462224873;
  obj.creation_time_usec   = 4141457;
  obj.first_info_no        = 5;
  obj.second_info_no       = 2;

  // ... and put a breakpoint here to inspect its memory!
  return 0;
}

I built and ran the code on ArchLinux (x86, GCC 5.3.0, GDB 7.11) and...

[~/projects/experimental/sizeof_packed_struct]> g++ -o test main.cpp -g # let's build it with the latest GCC on ArchLinux, x86
[winter@timeofeve] [/dev/pts/3] [master]
[~/projects/experimental/sizeof_packed_struct]> gdb ./test
Reading symbols from ./test...done.
(gdb) break main.cpp:29
Breakpoint 1 at 0x804875c: file main.cpp, line 29.
(gdb) r
Starting program: /home/winter/projects/experimental/sizeof_packed_struct/test
sizeof(MyPackedStructure) = 33

Breakpoint 1, main () at main.cpp:29
29        return 0;
(gdb) x /33x &obj  # let's dump 33 bytes of memory from the address of `obj`
0xbffff5af:     0xf5     0x1f    0xa6    0xa5    0x02   0x33    0x00    0x00
0xbffff5b7:     0x00     0xe9    0xc7    0x27    0x57   0x00    0x00    0x00
0xbffff5bf:     0x00     0x91    0x31    0x3f    0x00   0x00    0x00    0x00
0xbffff5c7:     0x00     0x05    0x00    0x00    0x00   0x02    0x00    0x00
0xbffff5cf:     0x00

... the memory is laid out as we'd expect: we can see the magic number in the first 4 bytes, and version occupies only the fifth byte. interesting_property is unaligned: the field is 4 bytes long, but its first 3 bytes are in the first row (0xbffff5af), and the last one lies in the second row (0xbffff5b7). Please note that, on ArchLinux, MyPackedStructure is 33 bytes long.

If we run it on Windows 7 (x86_64, mingw 4.9.3, gdb 7.6.1, PowerShell x86), we get...

PS C:\Users\WinterHarrison\Desktop> g++ main.cpp -o test -g
PS C:\Users\WinterHarrison\Desktop> gdb .\test
Reading symbols from C:\Users\WinterHarrison\Desktop\test.exe...done.
(gdb) break main.cpp:29
Breakpoint 1 at 0x401498: file main.cpp, line 29.
(gdb) r
Starting program: C:\Users\WinterHarrison\Desktop/.\test.exe
[New Thread 164.0x5c8]
sizeof(MyPackedStructure) = 40

Breakpoint 1, _fu0___ZSt4cout () at main.cpp:29
29        return 0;
(gdb) x /40x &obj
0x28ff08:        0xf5    0x1f    0xa6    0xa5    0x02   (0xff    0x28    0x00)
0x28ff10:        0x33    0x00    0x00    0x00   (0x53    0x40    0x0e    0x64)
0x28ff18:        0xe9    0xc7    0x27    0x57    0x00    0x00    0x00    0x00
0x28ff20:        0x91    0x31    0x3f    0x00    0x00    0x00    0x00    0x00
0x28ff28:        0x05    0x00    0x00    0x00    0x02    0x00    0x00    0x00

... that MyPackedStructure is now 40 bytes long. The fact that my Arch install is 32-bit doesn't matter here, as I'm using fixed-size integers.

Please also note that there are no unaligned fields: interesting_property now starts at the first byte of the second row (0x28ff10).

This is the real cause of the problem I described at the start of this post: memcpy writes important (and correctly packed!) data into the padding bytes, which no field maps to, and everything after them ends up shifted. When you read from the structure, you get meaningless junk. A simple example: after the memcpy in the pseudocode at the top of the article, we'd expect destination.interesting_property to be 0x33, but on Windows it is not. The 0x33 lands where the dump above shows 0xff (first row, sixth byte), and that byte is padding that the client code cannot reach through any field. Well, I could reach it with some pointer arithmetic, but isn't the whole point of using a packed structure to avoid pointer arithmetic?

Let's now apply the suggested workaround: tell GCC "whatever, I don't care about this MS bitfields algorithm, please do as you do on Linux" using the -mno-ms-bitfields compiler flag.

PS C:\Users\WinterHarrison\Desktop> g++ main.cpp -o test -g -mno-ms-bitfields
PS C:\Users\WinterHarrison\Desktop> gdb .\test
Reading symbols from C:\Users\WinterHarrison\Desktop\test.exe...done.
(gdb) break main.cpp:29
Breakpoint 1 at 0x401498: file main.cpp, line 29.
(gdb) r
Starting program: C:\Users\WinterHarrison\Desktop/.\test.exe
[New Thread 832.0xaf8]
sizeof(MyPackedStructure) = 33

Breakpoint 1, _fu0___ZSt4cout () at main.cpp:29
29        return 0;
(gdb) x /33x &obj
0x28ff0f:       0xf5    0x1f    0xa6    0xa5    0x02    0x33    0x00    0x00
0x28ff17:       0x00    0xe9    0xc7    0x27    0x57    0x00    0x00    0x00
0x28ff1f:       0x00    0x91    0x31    0x3f    0x00    0x00    0x00    0x00
0x28ff27:       0x00    0x05    0x00    0x00    0x00    0x02    0x00    0x00
0x28ff2f:       0x00

Both the size and the memory dump look like the Linux size and dump. This is exactly what we wanted! Everything works now! Mission accomplished, let's call it a day!


So, what did I learn from this experience?

  • Use -mno-ms-bitfields if you want to use packed structures on Windows (using MinGW)
  • Serialization libraries (cereal, MessagePack, Protocol Buffers, ...) exist for a reason: to avoid writing similar blog posts and to ensure that messages are serialized and deserialized in the same way on every platform
  • Don't think that your code will run the same everywhere: always have a test suite (and don't make it a pain to build and run on a new target platform!)
  • Read your compiler's documentation, especially if you're using language extensions

That's all for today. Let me know if you found these notes useful - if you like them, please consider offering me a Ko-fi to let me keep writing!