Switching to Python
Mon, 08 Jan 2024
Technology, Opinion
===================

For the longest time (really since I started getting remotely
serious about writing my own software), my language of choice has
been POSIX shell. The shell is the environment through which one
interacts with a Unix-like operating system, and shell scripts
are simply sequences of those interactions which combine to
achieve pretty monumental things.

I always knew that shell scripting had its limitations, though it
was mostly fine for my purposes. After some recent issues with
the shell, however, I have switched away from it. My scripting
language of preference is now Python, though the transition is
strange, as the two languages are designed for entirely different
purposes. This post will detail the reasons I switched away from
shell as well as the difficulties I have had switching to Python.

-----------------------------------------------------------------

The primary reason I moved away from shell scripting now, as
opposed to earlier, is that I cannot reliably run shell scripts
on my current system. As I have mentioned in several previous
posts, my main computer interface is currently an iPad app called
a-shell. a-shell uses dash for its shell processing, though the
port is extremely buggy and lacking in some fairly basic
features[1]. This is what finally propelled me away from shell
scripting, but there were many problems I was already
encountering beforehand, so let's explore these too.

1. Standardisation
==================

POSIX is a set of standards for various pieces of software which
includes standards of how a POSIX-compliant shell should operate.
One major problem with these standards is that they are poorly
adhered to. The GNU project is particularly guilty of this, in
that it extends POSIX software with many non-POSIX features.
While these features are sometimes great, I cannot properly make
use of them because I do not know if a system I am targeting is
using the GNU-extended utilities or the base-POSIX ones.

This leads to a wider point: I cannot write a shell script and
expect it to perform reliably on any given system. Minor
differences in the way a shell is implemented can cause an entire
script to fail or output wrong data. I actually had this problem
recently, where my shell wanted newline characters to be written
as '\\n' whereas the system I was targeting wanted '\n'. This
meant I could not test the software locally before pushing it to
the production server, which was of course highly frustrating and
time-consuming. A more common case is the use of `echo`, which
takes different switches in different shell implementations.
Sometimes echo will interpret escape sequences, sometimes it will
just print them literally. Sometimes the -n switch stops echo
from terminating its output with a newline, sometimes it just
prints the string '-n'. Echo is virtually unusable in shell
scripts targeting multiple systems for this exact reason, and
printf is often recommended instead.
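
For comparison, here is a minimal sketch of how the same job looks
in Python, where escape sequences are resolved by the language
itself rather than by whichever shell happens to run the script.
The helper name `emit` is made up for illustration and is not part
of any library.

```python
import sys


def emit(text, newline=True, stream=None):
    """Write text to a stream, optionally followed by a newline.

    Hypothetical helper, playing the role printf usually plays
    in portable shell scripts.
    """
    if stream is None:
        stream = sys.stdout
    # "\n" and "\t" below are interpreted by Python when the
    # source is parsed, so there is no -n or -e switch to guess.
    stream.write(text + ("\n" if newline else ""))


if __name__ == "__main__":
    emit("-n")                           # the literal string -n
    emit("no newline: ", newline=False)  # nothing appended
    emit("tab\there")                    # a real tab character
```

The built-in `print(text, end="")` covers the no-newline case as
well; the wrapper only makes the choice explicit.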

2. Intended use
===============

Why does echo work in such varying ways? Well, because the shell
was never intended for programming. It was intended for
interacting with your system. This is also why bare-POSIX shells
lack the functionality real programming languages have. At its
core, the shell is meant to interact with other programs on the
system, such as sed, grep, awk, cat, etcetera. GNU's bash shell
makes an effort to push the shell closer towards being a proper
programming language by adding things like reading files and
processing substrings directly in the shell without needing to
rely on cat and cut respectively.

3. External programs
====================

What about these programs the shell is meant to interact with
then? The way I see it, they are the greatest blessing and curse
of shell scripting. Basically, it is incredibly easy to call
upon a program installed on the system and equally easy to make
it interact with other programs. However, if you are targeting
multiple systems, you do not know exactly which programs are
installed, and which versions. Here we once again encounter the
problem of GNU-extended software and undefined behaviour.

Furthermore, because the shell itself can only do basic string
manipulation, your scripts will likely rely heavily on calling
external programs (especially the core utilities) to achieve
anything noteworthy. Your shell script will act more as glue to
hold the various 'real' programs together. With this being the
case, however, if there is no program to perform the task you
need, you are basically out of luck. This once again goes to
show that shell is not (and is not intended to be) a
programming language.
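
The uncertainty about which programs are installed can at least
be detected up front. The sketch below uses `shutil.which` from
the Python standard library; the function name
`missing_programs` and the list of tools are only examples. Note
that it says nothing about whether a tool is the GNU-extended or
the base-POSIX variant, only whether something by that name is on
the PATH.

```python
import shutil


def missing_programs(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Example tool list; adjust it to whatever the script calls.
    missing = missing_programs(["sed", "grep", "awk"])
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all required programs found")
```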

-----------------------------------------------------------------

If these problems were always present in the shell, however,
then why did I choose to use it regardless? Well, the answer is
simple: I needed a lot of glue. I often found myself in
situations where I had data in one format (usually given by a
program) and I needed that same data in a different format for
use in another program or for my own consumption. This is exactly
what shell scripts excel at. I would still use shell scripts for
this purpose if the dash port on a-shell were more stable.

Now I am using Python for translating data in this manner, but
this is often much more complicated. Getting input data from a
command is doable in Python, but nowhere near as easy as it is in
shell scripts (where you literally JUST write the command you
want the output from). Outputting data is also more difficult. In
shell, one can use pipes or redirects to pass data to a command
or file respectively. In Python, there are several steps involved
in either of these processes, in addition to the need for
libraries.
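
To make the extra steps concrete, here is a rough sketch of the
three shell idioms (command substitution, a pipe, and a redirect)
written with the standard `subprocess` module. The function names
`capture`, `pipe_to` and `redirect_to` are invented for this
example, and it assumes a platform where Python is allowed to
spawn processes.

```python
import subprocess
import sys


def capture(cmd):
    """Run cmd (a list of arguments) and return its stdout as text.

    Roughly the shell's $(cmd).
    """
    result = subprocess.run(
        cmd, capture_output=True, text=True, check=True
    )
    return result.stdout


def pipe_to(cmd, data):
    """Feed data to cmd on stdin and return its stdout.

    Roughly the shell's: printf '%s' "$data" | cmd
    """
    result = subprocess.run(
        cmd, input=data, capture_output=True, text=True, check=True
    )
    return result.stdout


def redirect_to(path, data):
    """Write data to a file; roughly the shell's: cmd > path"""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(data)


if __name__ == "__main__":
    # sys.executable stands in for an external command so the
    # demo does not depend on any particular utility existing.
    greeting = capture([sys.executable, "-c", "print('hello')"])
    upper = "import sys; sys.stdout.write(sys.stdin.read().upper())"
    print(pipe_to([sys.executable, "-c", upper], greeting), end="")
```

With `check=True`, a failing command raises
`subprocess.CalledProcessError` instead of passing silently, which
is closer to running a shell script under `set -e`.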

-----------------------------------------------------------------

The static site generator for this blog is now a little Python
script rather than a shell script. Honestly, the shell script
started out incredibly simple, relying really only on the
standard utility `head`. Things quickly grew complicated though,
with me using several dozen regex strings in sed to get HTML,
gopher, and RSS output all working. While version 0.1 of my
site builder was much simpler in shell, the monster it grew
into was much easier to understand in Python, where it was also
more flexible. As such, translating the script to Python allowed
me to make some quality of life changes, and it will also make
future development easier.
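
To give a feel for why this kind of task reads more easily in
Python, here is a small sketch. It is not the real generator.
It assumes a post layout like the one at the top of this post
(title, date and comma-separated tags on the first three lines,
followed by a ruler line), and the names `parse_post` and
`to_html` are hypothetical.

```python
import html


def parse_post(text):
    """Split a post into title, date, tags and body.

    Assumed layout (not taken from the real generator): title,
    date and tags on lines 1 to 3, a ruler on line 4, then the
    body.
    """
    lines = text.splitlines()
    title, date, tags = lines[0], lines[1], lines[2]
    body = "\n".join(lines[4:]).strip("\n")
    return {
        "title": title.strip(),
        "date": date.strip(),
        "tags": [tag.strip() for tag in tags.split(",")],
        "body": body,
    }


def to_html(post):
    """Render a parsed post as a very small HTML fragment."""
    title = html.escape(post["title"])
    body = html.escape(post["body"])
    return f"<h1>{title}</h1>\n<pre>{body}</pre>\n"
```

The header that needed `head` in shell becomes three list
indexes, and escaping for HTML is a single standard-library call
rather than a chain of sed expressions.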

Most importantly though, the Python implementation is stable and
works in exactly the same way on any system running the same
Python version. If you wish to see my current implementation, it is
available on my GitHub page at
user18130814200115-2/plain.wester.digital.

[1] I do not, by any means, mean to accuse the author of a-shell
of doing a bad job. A shell is an incredibly complex piece of
software, and the fact that the shell works even this well is
nothing short of impressive.