(2023-04-15) On sound in the world of minimalist computing
----------------------------------------------------------
Yes, I know, I know. I promised to continue the stream compression topic.
Well, I'm in the process of writing some C code to support my further
research (if I can even call it that) in this area, and the results will
follow whenever it's complete. No rush, but I'm not going to abandon this
midway; I've already written enough code not to just throw it away.

Today, however, I want to talk about computer audio in general and computer
music in particular. While I believe fully analogue media should play the
predominant role in storing sound and music in case our computers become
small and weak and truly LPC devices, there is still some room for
possibilities and exploration. After all, as we know, the greatest hits
among fully computer-created music were made on, or at least for, old
8-bit machines like the Famicom and C64. The beauty of chiptunes was in
pushing the capability
limits of the soundchips of the time. While the soundchip essentially
defined how you designed your sounds, giving you a very limited number of
PSG channels, each with its own restrictions, you could unleash your
creativity not only in the music itself but also in your own techniques
for bypassing those restrictions so as not to sound like everyone else.
This is why, for instance, when I hear music from some Famicom/NES game,
I can tell for sure whether it's Sunsoft, Natsume, HAL or Konami, even if
I'm hearing the music or seeing the game for the first time in my life.
You can't confuse their sound engines; that's how original the sound they
produce on the same five standard PSG channels is.

But then, not every machine of the time even had a soundchip or any
sound output capabilities at all beyond a simple piezo buzzer or a
single-frequency speaker. Aha! What if we turn this buzzer on and off
fast enough
to simulate any frequency we want? This is how the PFM (pulse frequency
modulation) technique was invented, now often referred to as "1-bit music".
The "1-bit" here merely refers to the fact that we operate on the output
device that can only be in one of the two states, on or off. In reality this
also meant that everything that soundchips did in hardware, here had to be
done in software. That's why all PFM music authors also had their own
speaker drivers bundled within the album, the game or whatever their music
was created for.
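
As a rough sketch of the idea (my own toy illustration, not anyone's
actual speaker driver), such a two-state device can be simulated in
software by only ever emitting two levels. At 8000 samples per second,
toggling every 4 samples yields a ~1 kHz square wave:

```shell
# A toy simulation of a two-state speaker: only levels 0 and 255 are
# ever emitted. Toggling every 4 samples at 8000 samples/s gives a
# ~1 kHz square wave; this writes one second of it as raw bytes.
LC_ALL=C awk 'BEGIN {
    for (t = 0; t < 8000; t++)
        printf("%c", (int(t / 4) % 2) ? 255 : 0)
}' > square.raw
```

Played back as raw unsigned 8-bit data at 8000 samples per second (more
on that below), square.raw sounds like a steady 1 kHz beep.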

Starting with Amiga computers and IBM PCs with external sound cards, and
to this very day, all audio is generally output using the technique first
introduced for Audio CDs, PCM (pulse-code modulation), where the signal
is quantized to some finite number of levels and sampled at some fixed
rate per second. The higher the rate and the number of levels, the better
the sound quality, but the more processing power is required too. Nowadays,
sound generation is fully abstracted from the hardware layer and composers
don't have to adapt to the chips of the machines they work with anymore.
They just output sound in some format that can be represented as PCM data at
the end of the day, and the system then takes care of the rest. Most of them
don't even create music and effects in the form of pure PCM data - it is
either decompressed from a more compact source (like MP3, OGG or FLAC) or
synthesized (in which case the composer only deals with the notes,
instruments and effects that a particular DAW or tracker software can offer,
not to mention even higher layers of abstraction like Web Audio API). So,
even the pure PCM, despite its ubiquity, isn't something that everyone
touches directly these days.
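
To make that concrete, here is a minimal hand-rolled PCM sketch (the
parameters are my own pick for the example): one second of a 440 Hz sine
tone, sampled 8000 times per second and quantized to 256 levels, with
silence sitting at the midpoint 128:

```shell
# Hand-rolled PCM: quantize a 440 Hz sine to 256 unsigned-byte levels
# (silence = 128) at 8000 samples per second. LC_ALL=C makes awk's
# printf "%c" emit raw single bytes instead of locale-encoded chars.
LC_ALL=C awk 'BEGIN {
    pi = 3.141592653589793
    for (t = 0; t < 8000; t++)
        printf("%c", int(128 + 127 * sin(2 * pi * 440 * t / 8000)))
}' > sine.raw
```

Raising the sample rate or the number of levels here is exactly the
quality-versus-resources trade-off described above.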

It is, however, still there. And every modern OS in existence offers a more
or less straightforward way to output raw PCM data directly to the audio
device (if you can't find a more straightforward way, you can always use
less straightforward ways like the aforementioned Web Audio API, at the cost
of higher resource usage, of course). A special case arises when your
signal is quantized to 256 levels and every level is represented with a
single integer from 0 to 255 (or from -128 to 127, depending on how you
look at it, but usually it's treated as unsigned), that is, a byte.
Combined with the default PSTN-compatible sampling rate of most sound
adapters, 8000 Hz, we get the default "raw" PCM mode, unsigned 8-bit
8 kHz PCM, which requires no additional preconfiguration of the adapter
to emit sound, at least in Linux-based OSes. Which is why, when OSS was
a major
thing and the audio device was represented with a single file in /dev/
(/dev/audio or /dev/dsp), people had fun by redirecting the contents of
various (non-sound) files directly into this file and listening to what came
out as the result. With ALSA, PulseAudio or whatever other abstraction
layer you have, you can still pipe the data into a properly parameterized
sox play command to achieve the same effect:

cat somefile | play -traw -r8000 -b8 -e unsigned-integer -

Some files gave interesting sound patterns, others just gave noise, so
experienced listeners could even guess the file type by what they heard. But
then, some people started thinking a step further: "What if, instead of just
catting existing files, we generate the raw sound data programmatically?"
Among those people was the Finnish guy I already mentioned on this phlog,
viznut. He compiled a bunch of short C programs where every sound output
byte was defined as a function of a single, eternally incrementing integer
variable t, and the standard output of these programs was unsigned 8-bit
8 kHz PCM data to be redirected to /dev/audio, aplay or sox play. He
called this concept "Bytebeat". And if you read my previous explanation
carefully, you can see where the name came from.

Originally, Bytebeat gained popularity in this very form, but then, as more
bytebeat players got ported to different environments including
browser-based ones (with Web Audio API), it wasn't limited anymore to
outputting single bytes or even integers (so, the so-called "floatbeat"
spawned as well), it wasn't limited to only using the bitwise and basic
arithmetic operators (because JS already has trigonometry and all the
other stuff in its standard library) and it wasn't limited to outputting
samples at 8 kHz. On the one hand, the removal of these limitations
allowed people to
transcend to other spaces of exploration, on the other hand, the same people
started abusing JS's capabilities and built entire trackers inside the
bytebeat expressions, returning to traditional rather than generative
music, which kinda defeated the whole initial purpose: finding music in
purely mathematical expressions, preferably as short as possible.

I understand why the general public went the WAAPI route. Browsers are easy,
they allow for quick exploration and on-the-fly change of sound, not even
having to compile the expression every time you change it. Less getting in
the way between you and the formula, more possibilities for customization
and general convenience. However, using a whole friggin' browser for this
task is as far from the KISS way as possible. I'd rather use pure shell
scripting but I know the bitwise operations in Bash are a bitch, it's a
command language, not a general-purpose programming language, after all. So,
we need something no less ubiquitous that would ideally be present even in
Busybox but would save us from all the hassle with compilation.

Enter AWK. Yes, it's a full-featured programming language and yes, it's
present in Busybox, although probably not as powerful as GAWK. I don't have
any good knowledge of AWK right now, I probably need to learn it at least as
well as Bash and start using it on a daily basis. But, you know what, this
example worked on my Arch:

seq 11111111 | busybox awk '{printf("%c",and($1,rshift($1,8)))}' \
  | play -traw -r8000 -b8 -e unsigned-integer -

Which means that yes, we do have bitwise operators in Busybox AWK and we can
port any classic C bytebeat formula to this language. I'll definitely
experiment with this more and probably will create a separate document
linked from the main hoi.st map that contains ports of some music formulas I
liked or created myself. A synthesizer and a tracker, all in a tiny formula,
now in AWK.

If AwkBeat isn't yet a thing, now is the time.

--- Luxferre ---