Hello again my fellow gopherites!

I've been saying for a while that I'd write a phlog post on
programming languages and garbage collection and such, and this is
that post :)

Programming languages are things that attract a lot of strong
opinions and as such I was kind of reluctant to phlog about this,
but the subject has come up enough times during screwtape's
fantastic anonradio show that I'd like to elaborate on a few
things, as I can go into a little bit more detail in a phlog
post than I otherwise could in sdf com chat.

,=============================,
| First, a bit of background: |
`============================='

I've been programming for a while. I wrote my first lines of code
at a very young age (like 7 or 8) on a commodore 64. Like many who
started in that era, I have fond memories of typing listings out
of the fantastic c64 manuals and magazines that were around at the
time. I vividly remember the rush of excitement I felt when that
famous balloon sprite appeared on the screen, which also kicked off
a new hobby of drawing sprites on grid paper, adding a bunch of
numbers, entering them as DATA lines in basic, and creating custom
sprites. It was a lot of fun. Eventually I also got my first
introduction to simple assembly programming on the c=64 when I
realized the source code for a lot of games was just a single
'SYS' instruction. It made no sense until one day I accidentally
entered the monitor mode of the final cartridge and started poking
around. Eventually my mom brought home a decommissioned 386 PC from
work, which had Pascal on it (Just pascal, not turbo pascal), and
a new world of amazing potential opened. Eventually we got a 486,
and I upgraded to turbo pascal, and soon enough I got my very own
PC, a pentium 75 MHz, and started using c++. While
professionally I have used various languages (php, python, c++,
perl, etc,...), for my personal projects I always preferred c++,
mostly because it strikes a balance between having the features
I like and being widely supported on enough platforms. Not so
much because I like it syntactically. In fact, the ever changing
and evolving idioms are the stuff of nightmares, really,... :)

As screwtape mentioned in his last show, I did spend some quality
time with smalltalk, mostly because of the Croquet project
(later OpenCobalt), which was a very interesting and exciting
concept to me at the time. If you're not familiar, the idea was to
have a sort of interlinked multi-user programmable
3D-environments. You could drop 3d objects into the world, program
their behavior with smalltalk, in real time, and have other people
walk through your world, and worlds could be inter-connected by
portals you could walk through. It is actually remarkably similar
to vrchat today, if vrchat were user-scriptable in real-time. You
could live-code an object, drop it in the world, and everyone
would instantly see your hand-coded thing. Unfortunately, the
people behind it were looking to monetize it and started marketing
it more as some sort of corporate collaboration environment thing.
They made a vnc viewer window thingy where multiple users could
share desktop control in some sort of virtual meeting.
Incidentally, this was way before zoom and teamviewer and the
likes were a thing. But even though that was a neat use-case, it
was way under-selling the capabilities of the platform. But alas
the platform also had a lot of bugs, and it was difficult to get
going. Soon enough development started getting a little neglected,
and now it's just yet another ambitious project that died on the
smalltalk vine.

,============================================================,
| My language preferences and why they are the way they are: |
`============================================================'

Anyhow... my language preferences: I tend to prefer verbose,
strict and statically typed languages. The more problems that can
be caught at compile-time, the better. That's not to say that
run-time error handling and checking doesn't have its place.
I'll elaborate in the sections below.

On strictness and static typing:

Strict static typing to me is a language feature that helps me
catch mistakes as early as possible (at compile-time) as opposed
to when it's too late (at run-time). If you've got some sort of
long-running software, say a daemon, that people rely on always
being available, a hard runtime crash can be a big problem, as it
would take the entire thing down. But, at the same time, if I were
to ignore all errors to keep the application running no matter
what, one could easily enter a state where things are kind of
broken, but nobody notices, because the application is still up,
and in the worst case scenario you end up losing or corrupting
data. That's arguably worse than it actually going down hard. Not
to mention it's harder to figure out what exactly broke this way.
If an application fails fast and hard, a backtrace ought to show
exactly where things went wrong. This makes things a lot nicer
for debugging. Obviously both hard and soft failures described
above are bad, and best avoided at all costs. Hence the preference
for compile-time checks over runtime-checks. That's not to say
there isn't value in all 3, depending on the situation, and as
such it's not uncommon for me to write my c++ programs in such a
way that if the software is compiled in Release mode, the main
function has a try-catch which logs errors and tries to keep
things up, whereas a debug build fails hard and fast.
Being able to change that behavior with the preprocessor is a
feature I quite like about C++ -- not that there aren't perils
with the approach, as now you're technically producing 2
behaviorally different programs from the same source code.
However, language
wise, you could do worse, as there are many languages (like most
of the interpreted ones) where runtime checks are the only option.
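
As a rough illustration, here's a minimal sketch of that
pattern, assuming the build system defines NDEBUG for Release
builds (most do); run_daemon is just a hypothetical stand-in
for the real work:

  #include <iostream>
  #include <stdexcept>

  // Stand-in for the real long-running work; here it just
  // throws so both code paths have something to react to.
  void run_daemon()
  {
      throw std::runtime_error("something broke");
  }

  int main()
  {
  #ifdef NDEBUG
      // Release build: catch at the top level, log, and try
      // to keep things up.
      try
      {
          run_daemon();
      }
      catch (const std::exception& e)
      {
          std::cerr << "error: " << e.what() << std::endl;
          // log it, maybe restart the work loop, etc,...
      }
  #else
      // Debug build: no safety net. Fail hard and fast so the
      // backtrace shows exactly where things went wrong.
      run_daemon();
  #endif
      return 0;
  }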

There is also a maintainability and readability aspect to strict
typing. If it is mandatory to declare all variables with a type,
this informs the code reader of the intent of the programmer. You
can visually see what this thing is not just from its variable
name, but also from its type, which tells you STRUCTURALLY what
this thing is, not just conceptually by name (assuming the
programmer even gave the variable a descriptive name to begin
with). As such I strongly dislike languages with inferred types,
and I'm not terribly happy that modern c++ introduced the 'auto'
keyword, and I tend to use it very sparingly, if at all.
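
A contrived little example of the difference, purely for
illustration:

  #include <map>
  #include <string>

  int main()
  {
      std::map<std::string, int> scores;
      scores["bob"] = 3;

      // The explicit type tells the reader the structure:
      // an iterator into a map of string keys to int values.
      std::map<std::string, int>::iterator it =
          scores.find("bob");

      // 'auto' compiles to exactly the same thing, but now the
      // reader has to go look up what 'scores' is to know what
      // 'it2' structurally is.
      auto it2 = scores.find("bob");

      (void)it;
      (void)it2;
      return 0;
  }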

On readability and maintainability:

People often joke about APL, how it is near unreadable, and looks
like a bunch of hieroglyphics. What actually makes APL so
difficult to parse at sight is not necessarily its use of funky
symbols, imho, but rather its density.
If a single character carries a lot of meaning, especially wrt
program flow, it becomes very hard, if not impossible, to
visually see how a program might behave when executed. This is
also incidentally
why indentation makes code more readable, as execution branches
become visually distinct that way. Code density is the enemy of
readability and maintainability. As such, I like very verbose
languages that make execution branch blocks very visually
distinct.
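
To illustrate with a deliberately silly c++ sketch (the same
logic twice, once dense, once verbose):

  #include <iostream>

  // Dense: correct, but the control flow is invisible.
  int classify_dense(int n) { return n<0?-1:n==0?0:n%2?1:2; }

  // The same logic spelled out. Every branch is its own
  // visually distinct block, so you can see at a glance which
  // paths the execution can take.
  int classify_verbose(int n)
  {
      if (n < 0)
      {
          return -1;
      }
      else if (n == 0)
      {
          return 0;
      }
      else if (n % 2 != 0)
      {
          return 1;
      }
      return 2;
  }

  int main()
  {
      std::cout << classify_dense(7) << " "
                << classify_verbose(7) << std::endl;
      return 0;
  }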

On garbage collection:

I am mostly indifferent about gc. It is another tradeoff. I like
the C++ STL smart pointer objects quite a lot, as they not only
help manage memory, but also inform the reader of the code on the
ownership of an object, and in turn, the conceptual and structural
relationship between objects. Once you grasp when and how to use
shared pointers, unique pointers, and weak pointers, you don't
need gc at all. But there is a learning curve associated with these
objects and idioms. Similarly, while a gc might help the casual
programmer get going faster, at some point you will have to delve
into the guts of your garbage collector when you hit performance
issues while dealing with your massively parallel problem du-jour,
for instance, and you'll have to learn about its different gc
algorithms, how to configure them, how and when to manually
trigger a gc, etc,... you will have to learn all the details
and innards of the magic black box that is your gc. Either way it
is a learning curve, either way it is complexity. It's just moved
elsewhere and it's obfuscated. That doesn't negate that gc is
certainly useful and convenient for the general use case.
Generally I don't mind having it, provided it's got enough control
mechanisms to deal with the more uncommon situations.
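
As a rough sketch of what I mean by ownership being visible in
the types (just a contrived little tree, nothing more):

  #include <memory>
  #include <string>
  #include <vector>

  struct Node
  {
      std::string name;
      // Shared ownership: a child stays alive for as long as
      // at least one owner still points at it.
      std::vector<std::shared_ptr<Node>> children;
      // Weak back-reference: a child does not own its parent,
      // so the parent/child cycle can't keep both alive.
      std::weak_ptr<Node> parent;
  };

  int main()
  {
      std::shared_ptr<Node> root = std::make_shared<Node>();
      root->name = "root";

      std::shared_ptr<Node> child = std::make_shared<Node>();
      child->name = "child";
      child->parent = root;            // non-owning
      root->children.push_back(child); // owning

      // Unique ownership: exactly one owner, freed when it
      // goes out of scope, and the reader knows all of that
      // from the type alone.
      std::unique_ptr<Node> scratch = std::make_unique<Node>();
      scratch->name = "scratch";

      return 0;
  }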

On security:

Ever since the rust community was unleashed upon the interwebs, a
lot of people have unleashed their often strong, sometimes
uninformed, opinions about memory safety. While memory safety is
obviously important, it's also not the MOST important. Most
modern operating systems have various protection mechanisms,
such as ASLR, stack overflow protection, etc,... such that
actual practical exploitation of these bugs is very difficult,
if not impossible.
Nonetheless, bundling C++ in the same bucket with C on this is
a bit unfair. While the stl smart pointers don't make such bugs
impossible, they make them much more rare (you have to really
be trying hard to be dumb^H^H^H^Hunsafe), and between static
code analysis and ASAN, there are a lot of tools that will
catch these problems before they become a problem - you have
to use them of course. At the end of the day, any language
that is flexible
enough to allow you raw memory access, is going to have some of
these problems. And as always, you are free to trade off some
flexibility for security, and vice-versa. Personally I think c++
strikes a good middle-ground in this. Not that c++ is a prime
example of a good language.
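
For what it's worth, here's the kind of thing those tools
catch. A hypothetical bug.cpp; the exact invocation depends on
your compiler, but the flags below are the usual gcc/clang
ones:

  // Build with something along the lines of:
  //   g++ -g -fsanitize=address bug.cpp -o bug
  // and ASAN will flag the use-after-free below when it runs,
  // pointing at both the bad access and where the memory was
  // allocated and freed.
  #include <iostream>

  int main()
  {
      int* p = new int(42);
      delete p;
      std::cout << *p << std::endl; // use-after-free
      return 0;
  }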

On OOP vs functional:

I guess I am also mostly indifferent on this, although I find it
useful to have both mechanisms available, and as such I think a
good language should offer facilities for programming in both
styles. Sometimes one is more suitable than the other. I tend to
lean more towards OOP, as it enforces a particular way to
structure code, that is easy to recognize, especially when reading
other people's software. I've had to work on some purely
functional software other people wrote, and you either have a lot
of functions calling other functions, or you have a lot of nested
closures. In the former, things don't look all that different from
the heavily criticized old basic spaghetti code. In the latter,
deeply nested closures also make it hard if not impossible to
figure out what's going on without a lot of reverse engineering.
That's not to say there's not clean ways to organize and write
functional code. I observe a lot of people with a math-y
background instead of a software background tend to favor
functional over OOP. If for these people a rat's nest
of ((()))()))))((()())))())) is more readable to them, more power
to them! ;)
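
An exaggerated little c++ sketch of the nesting problem (the
logic is trivial on purpose):

  #include <algorithm>
  #include <iostream>
  #include <vector>

  int main()
  {
      std::vector<int> v = {3, 1, 4, 1, 5};

      // Closures inside closures: harmless at this size, but
      // add a few more levels and you're reverse engineering
      // brackets instead of reading intent.
      std::for_each(v.begin(), v.end(), [](int x)
      {
          auto square = [](int y) { return y * y; };
          auto shout  = [&](int y)
          {
              std::cout << square(y) << "!" << std::endl;
          };
          shout(x);
      });

      return 0;
  }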

,===============================,
| Examples of languages I enjoy |
`==============================='

Translating all this to actual languages:

Because I like strictly typed, verbose languages, I've always been
quite fond of the Wirthian languages. After Niklaus Wirth invented
Pascal, he came up with a series of successor languages that,
while similar, are better designed and solve some issues Pascal
had.
Unfortunately, as pascal got adopted by the commercial world, it
started evolving on its own, leading to the creation of object
pascal as used in Delphi/Lazarus today, and Wirth's other
languages never quite saw the adoption they deserve.

Here's all the post-pascal Wirth languages:

Modula, Modula-2, Oberon, Oberon-2, Oberon-07, Active Oberon,
and the relatively new (not by Wirth) Oberon+

,=================================================,
| Language purity versus practicality & enjoyment |
`================================================='

I like all the aforementioned Wirth languages, and I'm
especially fond of Active Oberon. However, I do not write as
much code in these languages as I probably should. The reason
being that there is very little infrastructure around these
languages.
Compiler availability is an issue, especially if you're on an
atypical platform. This makes your code harder to compile, harder
to package, and ultimately less accessible for an end-user. All
those problems have nothing to do with the language itself, but
are more of a direct result of the language popularity, and yet,
they are problems nonetheless. As such, there is a trade-off
between practicality and language aesthetics. The Wirthian
languages are simple, nice and clean, mostly because each of these
languages learns from mistakes of the previous one, and rather
than growing the language organically, an entirely new language
was made. At the very opposite of that, you have C++, with a ton
of organic growth. But because it has remained mostly backwards
compatible with itself, and even C, you have a huge ecosystem,
lots of tooling, lots of use, and a terrible ever growing monster
of a language that keeps getting harder for new users to learn
properly. C++ is the opposite of a nice clean lean language. But
it is also a very practical language. You can get a C++ program
(and a C program as well of course) to compile on most platforms,
even very obscure ones, because the tooling is there.

When I bring up C++, someone usually starts going on about Rust.
I'm not a fan, if only because of the inferred typing, the
abbreviated names ('fn' over 'function', really? why?), and its
tooling and ecosystem (cargo is a blight; for the love of all
that is good, write OS packages, and don't use language-specific
package managers, or at the very least make them optional in
your build system instead of mandatory).

Anyhow,... all the opinions above quickly lead to language wars
and endless useless debates. They are useless, because at the end
of the day, a programming language is just that. A language. The
tool you write your computing poetry in. As a computing-poet, I
enjoy a good critical look at what pen I'm going to be writing
with, but while some pens are quantifiably better than others,
people don't typically go and berate poets for picking an
inferior rickety old pen. It doesn't matter what pen the poet
picks, the poet will still make spelling mistakes. ;) What matters
is, how well can the poet express their art. That brings us to
other reasons why I enjoy smalltalk and Oberon.

If you've made it through my litany of language preferences,
you'll observe that smalltalk actually violates a few of those
preferences, and yet I've always enjoyed working with it. While
smalltalk does violate some of my preferences, it is nonetheless
quite a clean consistent language. However the bulk of my
enjoyment came from my interaction with squeak. The idea of having
a sort-of operating system where everything you use, see, mess
with, is implemented in the language you're writing, and
change-able in real-time is a lot of fun! Similarly, if you've
ever tried A2, the oberon bluebottle OS, you will have had a
similar experience. Oberon specifically is quite nice. It has
fantastic demo applications, such as an ssh client, a raytracing
graphics demo, an image effects demo, etc,... and a terminal...
the terminal has, not a bash shell (this is not a unix-like OS
after all), but rather a shell where every command is an oberon
module. Each module's source is on the system you're using and can
be modified. So like with smalltalk squeak, you're in an
environment which you can actively modify on the fly. I find these
things really enjoyable. So at the end of the day, I think
enjoyment trumps most things when it comes to languages. Even if
they might not be entirely practical or check all your boxes.

,==========,
| Hardware |
`=========='

One argument I see pop up a lot in debates between c/c++ and
rust people is that C and C++ are popular for a reason,
which is that they are closer to how the hardware actually
functions. While I prefer C++ over Rust, and am not a fan of Rust
in general, this argument is quite wrong of course. Perhaps an
argument could be made that a pointer, referring to an address in
memory, is a thing a computer understands, except for the fact
that, unless you're talking directly to the hardware without an
OS, that memory address is not a real physical address at all.
It's a
virtual address that the OS has mapped for your program. And
that virtual address space likely has even more abstractions on
top of it as well. And if you've looked at modern x86-64 asm
instructions, you'll see that there is a lot of fun high-level
stuff, including functions for string handling and what not, that
look nothing like C or C++. Heck, even the instructions we feed
a typical cisc CPU these days might not execute verbatim. There's
an entire black-box voodoo layer of cisc magic that rearranges
things, makes assumptions, and does all kinds of opaque stuff we
have little insight into. It is perhaps true however, that the C
and C++ languages are suitable for the ECOSYSTEM that has grown
out of all of this, that is to say, the combination of your
typical cpu black-box logic and operating system. But of course,
it does not have to be that way.
Things get really fun and interesting if you combine your CPU,
OS, and programming language design as one inter-operating,
smartly designed ecosystem. There aren't a lot of examples of
that. The famous example would be the Intel iAPX 432, which
wasn't meant to be programmed in assembly language, but instead
in high level languages (Wirthian, incidentally ;) ). I'd also
recommend people take a look at capability-based systems as
described in the capa book
( https://homes.cs.washington.edu/~levy/capabook/ )
-- A lot of these systems have failed in the past for various
reasons. But it would be interesting to re-visit some of these
concepts with modern hardware, persistent memory (PMEM), etc,...
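
To make the address point concrete, here's a tiny sketch that
just prints a couple of pointers. On a typical modern OS with
ASLR enabled, run it twice and the values change, and none of
them are physical addresses to begin with:

  #include <cstdio>

  int global_value = 1;

  int main()
  {
      int local_value = 2;

      // These are virtual addresses handed out by the OS, not
      // physical locations in RAM, and with ASLR they shift
      // around from run to run of the very same binary.
      std::printf("global at %p\n",
                  static_cast<void*>(&global_value));
      std::printf("local  at %p\n",
                  static_cast<void*>(&local_value));
      return 0;
  }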

,================,
| Sustainability |
`================'

I don't think we should just give up and take the current
computing ecosystem as is. While I prefer C++ to interact with the
typical computing ecosystem in popular use, that doesn't mean I'm
blind to its myriad flaws. If we had a high-level ISA we wouldn't
have to think about or worry about memory addresses at all. As
such, while it may not be practical in the current world, we ought
to dare re-think computing, and re-think what computing might mean
and look like.

This is especially true in the context of sustainable,
perma-computing. With the planet boiling, we probably ought to
re-think how all the industrialized processes involved in modern
computer manufacturing work. Can we do photolithography in a
sustainable eco-friendly way without harmful chemicals? Can we
build a manufacturing supply chain that doesn't rely on exploiting
the third world, and earth's resources? Can we get by with just
some discrete transistor logic computing? All of these are things
we'll have to figure out sooner or later, and within this is also
perhaps an opportunity to re-imagine the computing landscape. --
Buut that might be the topic of an entirely different phlog post!

-jns