I have been spending an almost unacceptable amount of time
coding lately. After working on FALSE in C, I felt like I
picked up a bit of C and got the workflows down ok. But I
felt like I was fighting a losing battle against memory
leaks, double frees, etc. So I did the logical thing: I
started on a different programming language project. This
time: nimf. nimf was the first non-dsl language I ever
wrote. I wrote it in golang. I quickly abandoned it and
wrote slope instead (which is by far my most usable lang).
nimf is a stack language (concatenative) loosely in the
vein of forth, though perhaps with some concepts either
simplified or more naive depending on how you look at it.
This time around, doing it in C, I initialize a fairly
large block of memory at the beginning:
    int memory[6000] = { 0 };
or the like. I don't know from my own memory if 6k is the
actual number. It could always be changed as it is defined
as a macro somewhere. But that memory space gets used to
hold the various state flags (the whole thing runs as a
state machine with a stack), two stacks, and the program
memory. Strings in nimf are 1 indexed and stored with the
length followed by values (`{ 5 'h 'e 'l 'l 'o }`). The
system only knows integers and strings (as described). So
most of the allocations and frees are for various type
conversions, taking input, file handling, and so on... but
not for nimf strings themselves (under most conditions).
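A minimal sketch of that string layout, assuming the
length sits in the first cell and the characters follow it
(which would also explain the 1 indexing). The names
`MEMORY_SIZE`, `store_string`, and `string_char_at` are made
up for illustration and are not taken from the nimf source:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MEMORY_SIZE 6000  /* assumed; the real number may differ */

static int memory[MEMORY_SIZE] = { 0 };

/* Store a C string at `addr` as { length, c1, c2, ... }.
   Returns the address just past the string, or -1 if it
   would not fit in the block. */
int store_string(int addr, const char *s)
{
    size_t len = strlen(s);

    if (addr < 0 || (size_t)addr + len + 1 > (size_t)MEMORY_SIZE)
        return -1;
    memory[addr] = (int)len;
    for (size_t i = 0; i < len; i++)
        memory[addr + 1 + (int)i] = (unsigned char)s[i];
    return addr + 1 + (int)len;
}

/* Fetch character `index` (1-indexed) of the string stored
   at `addr`. Returns -1 when the index is out of range. */
int string_char_at(int addr, int index)
{
    assert(addr >= 0 && addr < MEMORY_SIZE);
    if (index < 1 || index > memory[addr])
        return -1;
    return memory[addr + index];
}
```

Because the string lives entirely inside the preallocated
block, nothing here calls malloc or free.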
I use a linked list for the dictionary, which stores vars
and words alike (a var is essentially a word that has a
memory address as its only content).
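A dictionary along those lines might look roughly like the
sketch below. The `Entry` layout and the `dict_*` names are
hypothetical, and putting the newest entry first (so a later
definition shadows an earlier one) is an assumption borrowed
from how forth dictionaries usually behave:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry: for a word, `addr` is where its body
   starts in the memory block; for a var, it is the address
   of the var's cell. */
typedef struct Entry {
    char *name;
    int addr;
    struct Entry *next;
} Entry;

/* Push a new entry on the front of the list. Returns the
   new head, or NULL on allocation failure (in which case
   the old list is left untouched). */
Entry *dict_define(Entry *head, const char *name, int addr)
{
    assert(name != NULL);
    size_t n = strlen(name) + 1;
    Entry *e = malloc(sizeof *e);
    char *copy = malloc(n);

    if (e == NULL || copy == NULL) {
        free(e);
        free(copy);
        return NULL;
    }
    memcpy(copy, name, n);
    e->name = copy;
    e->addr = addr;
    e->next = head;
    return e;
}

/* Walk from newest to oldest; the first match wins. */
Entry *dict_find(Entry *head, const char *name)
{
    for (Entry *e = head; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

/* Release every entry in the list. */
void dict_free(Entry *head)
{
    while (head != NULL) {
        Entry *next = head->next;
        free(head->name);
        free(head);
        head = next;
    }
}
```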
Some things I built into the system:
1. Local variables. Any memory that nimf allocates (from
its block of memory) inside a word is freed up when the
word finishes execution. As a result, all variables are
local by default. To use a global variable you must
declare it in a higher scope. This includes words (proc
definitions): they also get cleaned up from the linked
list when their scope exits.
2. Access, to a certain degree, to syscalls. As a result,
chdir, chmod, cwd, mkdir, rmdir, and friends are all
available by coding in nimf. This is both good and bad
but works surprisingly well (on x86_64, i386, arm, and
aarch64...I did not build in support for others, but it
would be simple to add).
3. Files can be created, opened, read, and written to. So
can tcp sockets (I have read some gopher this way).
4. Loops and conditionals are closer to a jsr (jump to
subroutine) instruction than regular conditionals/loops. As
such you write things like:
    x y = elif truthything falsything
Mostly meaning that they take a word, rather than an
arbitrarily sized block of code. You define the word
first and then give it to the conditional/loop. This is
way less elegant in many ways, but it made parsing so
much simpler it feels worth it. You just have to get
used to it a bit and it isn't too bad. Plus, as stated
in 1 (above) the words you create to complete these all
get cleaned up automatically, so you can mostly think of
them as a block scope that you write first and pass to
your `if`, `while`, `do`, `count`, or `elif`.
5. You can use the `inline` word to grab another file and
run it in the current file. I have not yet worked out
a way to load stdlib stuff without knowing the path of
the stdlib and manually typing the whole thing. I am
sure I'll create a search path for files eventually, or
maybe it isn't really needed. I dunno. If I were using
C23 I could apparently use the `#embed` preprocessor
feature to load the file as a string into the resulting
program and then just parse it as needed to import the
standard libs I have been building... but that feels
like it would create a bigger, slower thing needlessly, and
I can just put them somewhere on the drive and have a
shorthand to let the interpreter know to look there
instead of a file local to the current file.
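Items 1 and 4 fit together, and a rough sketch may make
that clearer. It assumes scope cleanup works as a "mark and
release": remember the dictionary head and the next free
cell on entry, then roll both back on exit. Every name here
(`Interp`, `ScopeMark`, `run_elif`, and so on) is invented
for illustration, words are modelled as plain C functions,
and the flag is passed directly rather than popped from a
stack:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MEMORY_SIZE 6000  /* assumed size */

typedef struct Entry {
    char name[32];
    int addr;             /* cell this entry points at */
    struct Entry *next;
} Entry;

typedef struct {
    int memory[MEMORY_SIZE];
    int here;             /* next free cell */
    Entry *dict;          /* newest entry first */
} Interp;

typedef struct { int here; Entry *dict; } ScopeMark;
typedef void (*WordFn)(Interp *);

/* Reserve one cell and name it. Returns the cell's
   address, or -1 on failure. */
int define_var(Interp *in, const char *name)
{
    assert(name != NULL);
    if (in->here >= MEMORY_SIZE)
        return -1;
    Entry *e = calloc(1, sizeof *e);
    if (e == NULL)
        return -1;
    strncpy(e->name, name, sizeof e->name - 1);
    e->addr = in->here++;
    e->next = in->dict;
    in->dict = e;
    return e->addr;
}

/* Remember where the scope started. */
ScopeMark scope_enter(const Interp *in)
{
    ScopeMark m = { in->here, in->dict };
    return m;
}

/* Drop every entry and cell created since the mark. */
void scope_exit(Interp *in, ScopeMark m)
{
    while (in->dict != m.dict) {
        Entry *dead = in->dict;
        in->dict = dead->next;
        free(dead);
    }
    in->here = m.here;
}

/* Run a word inside its own scope. */
void run_word(Interp *in, WordFn body)
{
    ScopeMark m = scope_enter(in);
    body(in);
    scope_exit(in, m);
}

/* elif-style dispatch: takes two whole words rather than
   two arbitrary blocks of code. */
void run_elif(Interp *in, int flag, WordFn truthy, WordFn falsy)
{
    run_word(in, flag ? truthy : falsy);
}

/* Two example words. Each makes a local var and leaves a
   result in cell 0. */
void truthything(Interp *in)
{
    int t = define_var(in, "tmp");
    if (t < 0)
        return;
    in->memory[t] = 1;
    in->memory[0] = in->memory[t];
}

void falsything(Interp *in)
{
    int t = define_var(in, "tmp");
    if (t < 0)
        return;
    in->memory[t] = 2;
    in->memory[0] = in->memory[t];
}
```

After `run_elif` returns, the `tmp` var made by whichever
word ran is gone from both the dictionary and the memory
block, while anything defined in an outer scope survives.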
Anyway, it has been fun. I have had nearly no issues given
to me by valgrind. Memory is solid. So that is much better
than for my FALSE implementation (a much simpler lang). I
will take that to mean that I am improving not just at
using C, but _thinking_ C and structuring things more
appropriately for the tool at hand. So far the code has
been fast, fun to work with, and a constant puzzle to solve.
I have only gotten really frustrated once: I was trying to
make a `mkdir` syscall to test using syscalls. It kept
coming out with bonkers file modes, and not what I passed.
I kept thinking it was an issue with the umask. After a
few hours I noticed that on the C side I was supposed to
pass the syscall two different vars (the filepath, which
was being passed correctly, and the mode as an int). I was
passing the same var twice. As a result, the mode was really
a pointer being treated as an int, which gave me garbage
results. Once I changed the var from `kind1` to `kind2`,
all was happy.
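For anyone curious what the fixed call amounts to, here is
a small sketch. It uses the libc `mkdir()` wrapper rather
than a raw syscall (an assumption made to keep it portable),
and `mkdir_and_report` is a hypothetical helper, not
something from nimf:

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create `path` with `mode`, then report the permission
   bits the directory actually got, or -1 on error. The
   directory is removed again before returning. */
int mkdir_and_report(const char *path, int mode)
{
    struct stat st;
    int result = -1;

    assert(path != NULL);
    /* Correct call: two different values. The bug was
       roughly the equivalent of passing the path in both
       slots, so the mode bits came from a pointer value
       and the umask had nothing to do with it. */
    if (mkdir(path, (mode_t)mode) != 0)
        return -1;
    if (stat(path, &st) == 0)
        result = (int)(st.st_mode & 0777);
    rmdir(path);
    return result;
}
```

With a umask of 022, asking for 0755 should report 0755
back. A mode that changes from run to run is a hint that
something other than a mode is being passed.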
I'm sure none of this is very interesting to anyone, but
like the lang itself: I have enjoyed writing it. I will
write about non-code stuff next time. Until next time, be
well, gopherspace.