(2023-04-16) AwkBeat is now a thing. Now what?
----------------------------------------------
Continuing the ideas of my previous post, I decided to create a helper script
that I could just feed an AWK expression to and get the sound playing. Seems
like a simple task, right? Well, I was really overwhelmed by the number of
issues that I actually needed to figure out in order for all this to work as
expected. Most of these issues arose because I chose for my script to be
strictly compatible with Busybox, while most of the AWK materials on the
Internet are about the GNU version, GAWK. And GAWK differs from Busybox AWK
even more than Bash differs from Busybox sh, up to the point of being able
to easily set up bidirectional communication and internet sockets (yes,
unlike Bash, GAWK even allows listening on them). We really have to consider
the level of compatibility here, so whenever you're writing an AWK script,
I'd strongly advise you to test it on Busybox first. Then you can be fairly
confident that it runs on GAWK as well.

One thing that puzzled me for a long time is that no output in AWK, not even
what you print with the %c specifier of its printf, can be considered
binary-safe. It took me a while to find out that this can't be solved even
on GAWK unless you pass the -b flag. In other words, without the (GAWK-only)
-b flag, characters in AWK are not semantically equal to bytes, and with
non-printable values like \0, as well as codepoints above 127, anything
goes. On my system, I could get away with setting LC_ALL=C, but this is not
guaranteed to work as expected for everyone. This is why, just to keep my
formula output from getting mangled regardless of the system locale, I had
to limit the AWK part to printing a hexadecimal stream to stdout, and then
pipe it into the xxd -r -p command, which reconstructs the binary stream
before passing it to the player command through another pipe. Well, having
to use xxd didn't bother me much, as it is also a Busybox applet, just like
the awk command itself. Besides, xxd itself is an
extremely useful tool even in the Busybox variant, so I recommend that
everyone learn it too. By the way, in case you didn't know, the output of
xxd -g1 is almost identical to that of hexdump -C, except that there is a
colon after the offset and no extra space in the middle of hex lines. Since
I don't really use hexdump in any other mode, maybe it's time to finally
migrate to xxd completely. Note that some systems with limited
Busybox/Toybox builds don't even include xxd, although they *usually* do
include hd (which is a poor man's equivalent of hexdump -C), so that's
already something. But in order to reconstruct binary streams from hex or
anything else without xxd, you might have to resort to much slower options,
like another shell script with read and printf.
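
To make the hex detour more concrete, here is a minimal sketch of the idea.
It is not the actual awkbeat.sh code: the function names hex_stream and
unhex_slow, as well as the toy formula, are made up for illustration. The
first function lets AWK print only hex digits, and the second one is a slow
fallback decoder for systems without xxd, assuming fold and a printf that
accepts hex constants like 0x41 as numeric arguments.

```shell
#!/bin/sh
# Sketch only: names and the sample formula are illustrative assumptions,
# not taken from awkbeat.sh.

# Emit N samples of a toy formula as a hex stream (two digits per byte).
# AWK only ever prints hex digits here, so the locale cannot mangle bytes.
hex_stream() {
    awk -v n="$1" 'BEGIN {
        for (t = 0; t < n; t++)
            printf "%02x", (t * 5) % 256
        printf "\n"
    }'
}

# Slow fallback for systems without xxd: turn hex pairs back into bytes
# using only fold, read and printf.
unhex_slow() {
    fold -w 2 | while read -r pair || [ -n "$pair" ]; do
        [ -n "$pair" ] || continue
        # Convert the hex pair to octal, then let printf emit that byte.
        printf "\\$(printf '%03o' "0x$pair")"
    done
}
```

With xxd available, the stream would be decoded as
hex_stream 8000 | xxd -r -p and piped on to the player command; without it,
hex_stream 8000 | unhex_slow should give the same bytes, just much slower.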

Another thing is, of course, PCM sound output itself, i.e. the commands we
pipe our binary output to. For the SoX play command in the previous post,
there actually is a shorter equivalent (provided your desired sample rate is
in the SAMPLERATE environment variable):

    play -q -tu8 -r${SAMPLERATE} -c1 -

Now, if you don't have SoX and don't want to install it for some reason, I
found some easy ways to pipe the output to some popular sound subsystems in
a GNU/Linux environment. These alternatives are included in the script as
commented-out definitions of the PLAYCMD variable. If you need one of them,
just uncomment the corresponding line.

Oh, and the script itself, awkbeat.sh, can already be found in the
downloadables section on hoi.st. Feel free to try it out. So, this
initiative is finally complete. Now what?

Now, starting from here, I can pursue two goals: porting as many Bytebeat
expressions as possible to (Busybox-compatible) AWK (maybe a collection of
cool formulas will also appear on my page), and learning AWK to make the
most of the ecosystem that Busybox itself provides. I'm not sure if there is
any point in targeting even more limited environments at the moment, like
Android 6, which has no AWK but has a Toybox binary with nc and sed... Oh
well, that's another story entirely.