(2023-04-13) On serialization of binary chunks
----------------------------------------------
Today is Clean Thursday, and my post will be much shorter and cleaner than
any of the recent ones.

If you need a really safe way to store or transfer arbitrary binary data
chunks, just use djb's netstrings ([1]) which are constructed like this:

[length]:[content],

Here, the trailing comma is mandatory and the [length] is the byte count of
[content] in decimal, with leading zeros prohibited (a zero length is written
as a single 0, so an empty string is encoded as the three characters '0:,').
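
As a minimal sketch of the construction rule in Python (the function name
encode_netstring is made up for illustration and is not part of the spec):

```python
def encode_netstring(payload: bytes) -> bytes:
    """Wrap arbitrary bytes as [length]:[content], per the rule above."""
    # str(len(...)) never emits leading zeros and yields "0" for empty input,
    # which is exactly what the format requires.
    return str(len(payload)).encode("ascii") + b":" + payload + b","


print(encode_netstring(b"hello world!"))  # b'12:hello world!,'
print(encode_netstring(b""))              # b'0:,'
```

The length counts bytes, not characters, so text has to be encoded (for
example to UTF-8) before it is wrapped.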

The decoding algorithm is dead simple as well:

1. Read everything until the first : byte as a decimal number L. If it is not
a valid non-negative decimal integer (or has a leading zero without being
exactly 0), report an error and halt.
2. Allocate a buffer B which is L bytes long. If allocation is impossible,
report an error and halt.
3. Read the next L bytes after the first : byte into the buffer B. If reading
L bytes is impossible, report an error and halt.
4. Read the next byte, i.e. the (L+1)-th byte after the first : byte. If it's
not equal to the ',' byte, report an error and halt.
5. Return the buffer B as the result. End of algorithm.
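
The five steps above can be sketched in Python roughly as follows. This is
only an illustration under a few assumptions: the input is a buffered binary
stream whose read(n) returns fewer than n bytes only at end of input, and the
names read_netstring, NetstringError and MAX_LENGTH_DIGITS (a cap on the
length prefix, which the steps above don't mention) are invented here.

```python
import io


class NetstringError(ValueError):
    """Raised when the input is not a well-formed netstring."""


MAX_LENGTH_DIGITS = 10  # assumption: refuse absurdly long length prefixes


def read_netstring(stream) -> bytes:
    # Step 1: collect decimal digits up to the first ':' byte.
    digits = b""
    while True:
        byte = stream.read(1)
        if byte == b":":
            break
        if len(byte) != 1 or byte not in b"0123456789":
            raise NetstringError("bad length prefix")
        digits += byte
        if len(digits) > MAX_LENGTH_DIGITS:
            raise NetstringError("length prefix too long")
    if not digits or (len(digits) > 1 and digits[:1] == b"0"):
        raise NetstringError("empty or zero-padded length")
    length = int(digits.decode("ascii"))

    # Steps 2 and 3: read() allocates the buffer and fills it with L bytes.
    data = stream.read(length)
    if len(data) != length:
        raise NetstringError("input ended before L bytes were read")

    # Step 4: the byte right after the content must be a comma.
    if stream.read(1) != b",":
        raise NetstringError("missing trailing comma")

    # Step 5: return the buffer.
    return data


stream = io.BytesIO(b"5:hello,0:,")
print(read_netstring(stream))  # b'hello'
print(read_netstring(stream))  # b''
```

Because the function consumes exactly one netstring, calling it repeatedly on
the same stream reads consecutive chunks one after another.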

Why is the trailing comma needed if we already specify how many bytes to
read? I asked myself this question some time ago. A comma in the
expected place after the chunk is an additional (although not fully
surefire) sign that no bytes were lost from the chunk during the transfer,
no extra bytes were added, and the actual chunk length is exactly as
declared.
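
To see the comma doing its job, here is a small Python sketch. The helper
frame_is_intact is hypothetical and only checks the first netstring in a
buffer; it is not a full decoder.

```python
def frame_is_intact(buf: bytes) -> bool:
    """Check that a ',' sits exactly where the declared length says it should."""
    head, sep, rest = buf.partition(b":")
    if sep != b":" or not head.isdigit():
        return False
    length = int(head.decode("ascii"))
    # Slicing (not indexing) yields b"" instead of raising on short input.
    return rest[length:length + 1] == b","


print(frame_is_intact(b"5:hello,5:world,"))   # True: comma where expected
print(frame_is_intact(b"5:hell,5:world,"))    # False: one byte was lost
print(frame_is_intact(b"5:helllo,5:world,"))  # False: one byte was added
print(frame_is_intact(b"5:hello,,5:world,"))  # True: a stray ',' slips past
```

The last line shows why the check is not fully surefire: an extra byte that
happens to be a comma in just the right spot passes for this chunk, and the
damage only shows up when the next chunk fails to parse.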

I really can't praise this format highly enough. I recommend using it as a
top-level wrapper for whatever message format you employ in your scripting,
desktop, mobile, server or embedded applications. Even if your messages are
in plaintext, they will remain in plaintext, but now much safer. Netstrings
introduce so little overhead while adding so much value that they really
help to make the world a slightly better place. Just use them wherever you
can.

And no, I'm not confusing marshalling with serialization. And neither should
you.

--- Luxferre ---

[1]: https://cr.yp.to/proto/netstrings.txt