(2023-09-25) How I started learning OCaml... and paused it
----------------------------------------------------------
Two posts ago, I said that I would start looking at three native-compiled
(that is, no VM) programming languages as potential C replacement
candidates. So, OCaml became my first guinea pig of choice. And, to tell you
the truth, this is the language I really had the highest hopes for. And I
really can recommend it to everyone. No matter what they say, it's really
easy to pick up and start writing real-life code. It's also relatively easy
to compile fully static binaries using it. You can see the Makefile of my
OSmol server ([1]) for an example of how to do this. Yes, OCaml is so easy
that I was able to write a simple and robust Gopher server (that's powering
this very Gopherhole right now!) within <45 SLOC and about 2 days of digging
through courses and manuals. And the binary of this server doesn't require
any runtime dependencies on my VPS, as it's fully self-contained and
statically linked with musl libc. I couldn't be happier with the results of
my preliminary tests...

...Except one "minor" issue. Can you guess the final binary size?

979184 bytes.

I'm not kidding. This is with musl; with glibc, it's well over 1.6 megs. And
I installed the Flambda optimizer and used all kinds of optimization tricks
on the musl-gcc and/or zig cc sides as well. If anything, I was expecting
around 45-50K, as this is the usual size of a static musl-linked binary that
is that simple. Although I *kinda* guess what's going on there, it's obvious
that the OCaml compiler does absolutely nothing to remove unused code
present in the standard library and included ".cmxa"s from the final binary.
I looked around on teh interwebz and found nothing significant on the topic
except some "post-link optimization frameworks" that don't actually change
the bigger picture much in terms of executable size, and the mere existence
of such frameworks shows that the current OCaml implementation just wasn't
designed with static linking and _low-level_ dead code elimination in mind.
For a Gopher server, that's kinda OK, but for my entire spectrum of intended
language purposes, that's just unacceptable. I want my programs to be able
to run in RAM- and disk-limited environments, where even a phone with 256 MB
RAM is not the worst-case scenario, so a megabyte per single process image
(not even a whole process!) is too costly, regardless of how awesome the
language itself is.
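
For a sense of scale, here is roughly what a server "that simple" can look
like. This is only a minimal sketch and NOT the actual OSmol source: the
content root, the port, the default "gophermap" selector, the error line and
the build command in the comment are all my assumptions, and it serves
regular files only (no directory listings, no search).

```ocaml
(* Minimal Gopher server sketch (hypothetical, not the OSmol code).
   Assumed build command for a static binary with a musl toolchain:
     ocamlfind ocamlopt -package unix -linkpkg -ccopt -static \
       gopher.ml -o gopher *)

let root = "/var/gopher"   (* assumed content root *)
let port = 7070            (* assumed port; the standard one is 70 *)

(* Assumed error reply: one type-3 (error) item, then the "." terminator *)
let error_line = "3Not found\t\terror.host\t1\r\n.\r\n"

(* input_line eats the LF; drop the CR that precedes it, if any *)
let chomp s =
  let n = String.length s in
  if n > 0 && s.[n - 1] = '\r' then String.sub s 0 (n - 1) else s

(* Refuse selectors that try to climb out of the content root *)
let safe sel = not (List.mem ".." (String.split_on_char '/' sel))

(* Map a raw request line to a file path, or None if it is rejected *)
let resolve line =
  let sel = chomp line in
  if not (safe sel) then None
  else
    let sel = if sel = "" || sel = "/" then "/gophermap" else sel in
    let sel = if sel.[0] = '/' then sel else "/" ^ sel in
    Some (root ^ sel)

(* One request per connection: read the selector, send the file, done *)
let handle ic oc =
  (match resolve (input_line ic) with
   | Some path when Sys.file_exists path && not (Sys.is_directory path) ->
       let f = open_in_bin path in
       let len = in_channel_length f in
       output_string oc (really_input_string f len);
       close_in f
   | Some _ | None -> output_string oc error_line
   | exception End_of_file -> ());
  flush oc

let () =
  (* Only start listening when explicitly asked to: ./gopher serve *)
  if Array.length Sys.argv > 1 && Sys.argv.(1) = "serve" then
    Unix.establish_server handle (Unix.ADDR_INET (Unix.inet_addr_any, port))
```

Unix.establish_server forks a child per connection and closes the channels
when the handler returns, so the whole protocol logic boils down to
"resolve" and "handle". Nothing in there looks like it should cost a
megabyte.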

That's why I decided to put learning OCaml on hold and move on to my next
candidate. It will take some time as well, so I guess I'll share my thoughts
on it in two weeks or so.

--- Luxferre ---

[1]: https://git.sr.ht/~luxferre/OSmol