Jun 8 at 19:29
This is an older version of the article. Click here to see the current one.

Rethinking writing files with memory mapping and C++

This post introduces the motivation behind the decodless C++ open source libraries. I hope someone finds them useful.

I’ll begin with an example. I do 3D graphics and triangle meshes are canonical in the field. There are so many file formats but a few just stick around because they’re so amazingly simple and easy to use. I’ll call out *.obj as a very well known one. It’s text based, which is great to directly play with and edit, but it’s relatively slow to parse — lots of atof() calls to read floating point numbers from text. Lets make our own binry format for a simple triangle mesh.

Example file format

As with most file formats, it’ll start with a header. For simplicity, this is just going to be a plain old data struct and we’ll assume a similar architecture machine both reads and writes the file. E.g. it must be little endian, but that’s very common.

struct Header {
    uint64_t triangle_indices_offset;  // bytes from start of file
    uint64_t triangle_indices_count;  // elements of uint32_t
    uint64_t vertex_positions_offset;  // bytes from start of file
    uint64_t vertex_positions_count;  // elements of float[3]

Nice and simple. Not even a magic file identifier at the top. Actually, I can see a nice way to factor the above a bit…

struct Span {
    uint64_t byte_offset;
    uint64_t element_count;

struct Header {
    Span triangle_indices;
    Span vertex_positions;

The existing way

Hmm, lets quickly search to see how we can write a file in a cross platform way:

  1. C fopen()/fwrite() from stdio.h
  2. C++ ofstream from fstream
  3. That’s it??

These are basically the same - writing a stream of data. We have to allocate some memory first, fill that memory, then copy that memory to disk using fwrite() or ofstream::write(). Lets write some code to write our file.

void writeHeader(std::ofstream& file, Header header);
Span writeIndices(std::ofstream& file, uint32_t* indices, uint64_t count);
Span writePositions(std::ofstream& file, float* positions, uint64_t count);

void test()
    uint32_t indices[] = { ... };
    float    positions[] = { ... };

    std::ofstream file("cube.mesh");

    Header header; 
    writeHeader(file, header);
    header.triangle_indices = writeIndices(file, indices, std::size(indices));
    header.vertex_positions = writeIndices(file, positions, std::size(positions) / 3);

    // crap. we wrote header before initializing the members
    //writeHeader(file, header); // no, the header should probably be first

    // Go back and re-write the header
    // TODO: clean up this mess!
    // maybe change the first writeHeader() to reserveHeader()?
    writeHeader(file, header);

Oh no. How can we fix the above? Yeah, yeah, we’ve all hit this problem before. You just™…

  1. Figure out how much data you’re going to use first
  2. Write the header at the end of the file
  3. Leave a placeholders and come back and fill them in afterwards

Lets take a moment to stop and think about more complex examples with these approaches. Currently there is just one mesh in the file. What if we want multiple meshes. That means that option (1.) would involve recursively walking the data structure you’re going to write twice. That’s pretty complicated to code and easy to go badly wrong if the predicted size is incorrect. (2.) is appealing because it’s simple but it’s also unexpected to have headers last. It also can’t solve the case where references to data form a graph and not a tree. (3.) is nicer for the code reading the file, although there’s some housekeeping to first reserve blocks of the file and return to write them later.

That’s it then. Problem solved. Unless… one more thing to investigate…

Writing files with memory mapping

This is pretty out there. Specialized applications have been doing this for a long time now. Maybe we can make it easier?

Memory mapping is a comlicated topic especially for someone that just wants to write a basic file. If there wasn’t a library (spoilers!) it’s probably not worth learning. But since you’re here already, what are the benefits?

You still have to write data to RAM and it still gets copied to disk, but…

  1. There is no std::ofstream overhead
  2. Nicer programming interface - write to a pointer instead of a stream

These both allow you to “map” the bytes of a file on disk to a virtual address range in your program’s memory. But what does this actually give you?

How does it work? RAM is given to processes in “pages” of bytes. Pages are mapped into processes’ virtual address space - this is what you get when you call malloc() or new. File memory mapping allows you to map these pages to a file on disk. The OS and CPU automatically handle keeping the data in sync. To the programmer it looks like they’re writing to regular memory but at the end, the OS copies modified pages to disk. Pages may be “paged out” automatically so you don’t need to have enough RAM for the whole file. Moreover, you can read from the mapped memory too!

So what’s the catch? Well, a few and I have solutions to all.

  1. Memory mapping is complicated to understand and use. As of writing there is no standard library for memory mapping, but thre are equivalent interfaces. Linux has mmap() and windows has CreateFileMappingA(). A cross platform library with a simple API could hide the comlexity.
  2. You typically need to create the file with a fixed size first, then memory map it. This means you’d need to know exactly how many bytes you want to write up-front, which IMO is a deal breaker. There are ways to grow memory mappings but that’s even more complicated. Again, just something that can be handled by a library.
  3. The stream APIs automatically increment a write pointer. When using memory mapping we’d need to replace this feature. Actually, linear memory allocators do exactly this, and there’s an amazing side effect that mapped memory is then natively readable. Lets have a library for that too.

Growable memory mapping

Enter decodeless::mappedfile for easy and growable memory mapping. Lets take a look at what the example code can look like now:

Header* writeHeader(decodeless::resizable_file& file, void*& writePtr);
Span writeIndices(decodeless::resizable_file& file, void*& writePtr, uint32_t* indices, uint64_t count);
Span writePositions(decodeless::resizable_file& file, void*& writePtr, float* positions, uint64_t count);

void test()
    uint32_t indices[] = { ... };
    float    positions[] = { ... };

    decodeless::resizable_file file("cube.mesh", 9999 /* max. size */);
    void* writePtr = file.data();

    Header* header writeHeader(file, writePtr);
    header->triangle_indices = writeIndices(file, writePtr, indices, std::size(indices));
    header->vertex_positions = writeIndices(file, writePtr, positions, std::size(positions) / 3);

Allocating data in a mapped file

Enter decodeless::allocator, a basic linear allocator library that supports growing a backing decodeless::resizable_file. These classes are combined into decodeless::writer for convenience:

// With the pmr allocator, these functions don't need to know
// anything about decodeless or where they are writing their data.
Header* writeHeader(std::pmr::polymorphic_allocator alloc);
Span writeIndices(std::pmr::polymorphic_allocator alloc, uint32_t* indices, uint64_t count);
Span writePositions(std::pmr::polymorphic_allocator alloc, float* positions, uint64_t count);

void test()
    uint32_t indices[] = { ... };
    float    positions[] = { ... };

    decodeless::pmr_writer file("cube.mesh", 9999 /* max. size */);

    Header* header writeHeader(file.allocator());
    header->triangle_indices = writeIndices(file.allocator(), indices, std::size(indices));
    header->vertex_positions = writeIndices(file.allocator(), positions, std::size(positions) / 3);

Convenience utilities

There’s one final library to show off and that’s a relative pointer decodeless::offset_ptr. This can replace Span above. The only difference is that byte_offset is now an offset from the relative pointer’s own address in the file (or memory) and saves having to pass around a pointer to the start of the file. There are also some utility allocation methods that can replace our write*() functions for an even shorter demo.

struct vec3f { float x, y, z; }

Header {
    decodeless::offset_ptr<uint32_t> triangle_indices;
    decodeless::offset_ptr<vec3f> vertex_positions;

void test()
    uint32_t indices[] = { ... };
    vec3f    positions[] = { ... };

    decodeless::file_writer file("cube.mesh", 9999 /* max. size */);

    Header* header = file.create<Header>();
    header->triangle_indices = file.createArray(indices);
    header->vertex_positions = file.createArray(positions);

For the pièce de résistance, see how convenient reading the above file can be!

void test_read()
    decodeless::file file("cube.mesh");
    const Header* header reinterpret_cast<const Header*>(file.data());
    uint32_t firstIndex = header->triangle_indices[0];
    vec3f    firstIndexPosition = header->vertex_positions[firstIndex];


TLDR: check out the decodeless library collection: github, twitter