The basics of an indirect threaded code ANS Forth implementation

* * * * *

The basics of an indirect threaded code ANS Forth implementation

I need to get into the habit of writing prose more often.

Before I go into the implementation of my ANS Forth system, I need to define
somethings. First, a bit about Forth terminology—a Forth “word” can be
thought of as a function or subroutine of other languages, but it's a bit
more than that—it can also refer to a variable, a constant, the equivalent of
a key word. It's a fluid concept, but thinking of a “word” as a function
won't be too far off. Also, a collection of Forth words is collected into a
wordset (or if you are reading older documentation about Forth, a
“dictionary”). You can think of a “dictionary” as a library, or maybe a
module, of code.

Second, Forth is a stack-based language. There are rarely any explicit
parameters, just data placed on a stack, known as the “data stack.” Because
of this, expressions are written in Reverse Polish Notation (RPN (Reverse
Polish Notation))—data is specified first, then the operator.

Third, there's a second stack, the “return stack” that is pretty much what it
sounds like—it records the return address when calling into a Forth word.

Forth, the langauge is ridiculously easy to parse—a token is just a
collection of characters separated by space. So not only is my_variable_name
a valid name for a Forth word, so is my-variable-name, #array, 23-skidoo and
even 3.14 valid names for Forth words. Yes, that means it's easy to do stupid
things like define the token “1” to be 2 (you know, for when 1 equals two for
large values of 1). But this also makes Forth trivial to parse. In fact, a
Forth interpreter is nothing more than:

-----[ Pseudocode ]-----
begin
name = parse_token()
word = lookup(name)
if (word)
execute(word)
else
{
if (valid_number(name,current_base))
push_number(convert(name,current_base))
else
error()
}
repeat
-----[ END OF LINE ]-----

That's it. That's a Forth interpreter more or less. Compiling Forth isn't
much harder:

-----[ Pseudocode ]-----
begin
name = parse_token()
word = lookup(name)
if (word)
{
if (immediate(word))
execute(word)
else
compile_into_definition(word)
}
else
{
if (valid_number(name,current_base))
add_code_to_push(convert(name,current_base))
else
error()
}
repeat
-----[ END OF LINE ]-----

A Forth word can be marked as “immediate,” which just means the word is
executed during compilation mode rather than being compiled, and this is how
the Forth compiler can be extended. A Forth word like IF or BEGIN is just
another Forth word, albeit marked as “immediate” so it can do its job when
compiling.

The details about switching between interpreting and compiling will be
covered in a later post, but it's not a difficult as it may seem.

And one bit about ANS Forth in particular—the standard defines a collection
of “wordsets [1],” most of which are optional in an implementation. The
minimum required is the Core Word set [2]. The rest, including the Core
Extension words, are optional. I did not implement all the wordsets, but
again, more on that later.

Anyway, my ANS Forth system [3] is a classic indirect threaded code (ITC
(Indirect Threaded Coded)) implementation. As I mentioned [4], it's easy to
implement but perhaps not the fastest of implementation styles. As this is on
the MC6809 [5], I used a typical-for-the-6809 register use for my Forth
system:

Register Type Forth usage
------------------------------
D 16-bit accumulator Free for use (top of stack is kept in memory)
X 16-bit index register execution token of word being run, free for use
Y 16-bit index register Forth IP (Instruction Pointer) register
U 16-bit index register/user stack data stack pointer
S 16-bit index register/system stack return stack pointer

I elected to keep the top of stack in memory and not in the D register. I'm
not sure if this effects the speed any, but it was easier, implementation
wise, to keep the top of stack always in memory.

The Forth words in my implementation are either primitives, that is, written
in assembly language, or secondaries, that is, written in Forth. Here's the
format of a primitive word, +:

-----[ Assembly ]-----
forth_core_plus ; n1|u2 n2|u2 -- n3|u3 )
fdb forth_core_star_slash_mod ; previous word in dictionary
fdb .xt - .name ; length of name
name fcc "+" ; name
xt fdb .body ; execution token of word
body ldd ,u++ ; body of word
addd ,u
std ,u
ldx ,y++
jmp [,x]
-----[ END OF LINE ]-----

The first thing to notice is the label. Given how difficult naming is [6] in
Computer Science, I decided to use the canonical name of each Forth Word as
defined by the standard [7]. All the labels start with “forth_,” and are
followed by the wordset (in this case, “core_”) followed by the actual Forth
word. Yes, this makes for some long labels, but I don't have to think about
naming, which is nice. The .xt label (which is a “local label”—this defines
the full label of forth_core_plus.xt) defines the “execution token” (xt)
which is defined as “a value that identifies the execution semantics of a
definition.” Here, this is a pointer to a function that provides the
execution semantics of the word and in this case, just points to the “body”
of the Forth word, which adds the top two elements of the data stack.

Of particular note are the last two instructions:

-----[ Assembly ]-----
ldx ,y++
jmp [,x]
-----[ END OF LINE ]-----

Get used to seeing this fragment, it's used a lot—this is used to execute the
next word in the program. As stated above, the Y register is the Forth IP,
and this loads the X register with the xt of the next word, then jumps to the
code that handles this word. The [,X] bit informs the CPU (Central Processing
Unit) that we are jumping through a function pointer and not directly to the
code (if we did that, JMP ,X, that would turn this from “indirect threaded
code” to “direct threaded code”—and yes, that is the only difference between
an ITC and DTC (Direct Threaded Code) implementation).

The overall format of a word is a link to the previous word in the dictionary
(here to */MOD), followed by a 16-bit length, the text that makes up the
word, followed by the xt and then the body of the definition. You might be
wondering why I would use a full 16-bits for the length on an 8-bit CPU—
wouldn't that waste space? Yes, it would, especially given that in this
implementation, a Forth word is restricted to just 31 characters, but I need
a way to mark some information about each word, like if it's an immediate
word or not. And while an 8-bit length where the largest value would be 31
giving me three bits for flag values, I ended up needed a few more than just
three. So 16 bits for the length gives me a potential of 11 bits to use as
flags.

A word written in Forth will have a different xt and body, for example, the
very next word in the dictionary:

-----[ Assembly ]-----
forth_core_plus_store ; ( n|u a-addr -- )
fdb forth_core_plus
fdb .xt - .name
name fcc "+!"
xt fdb forth_core_colon.runtime
;===============================
; : +! DUP @ ROT + SWAP ! ;
;===============================
fdb forth_core_dupe.xt
fdb forth_core_fetch.xt
fdb forth_core_rote.xt
fdb forth_core_plus.xt
fdb forth_core_swap.xt
fdb forth_core_store.xt
fdb forth_core_exit.xt
-----[ END OF LINE ]-----

The xt here points to the label forth_core_colon.runtime. This is usually
called DOCOLON but I'm being explicit here—Forth words defined in Forth are
created by the Forth word :, and this label, forth_core_colon.runtime,
implements the runtime portion of said word. The body of this word is then an
array of execution tokens of various Forth words comprising the definition.
The last word of all Forth words defined this way end with a call to
corth_core_exit.xt.

The : runtime function looks like:

-----[ Assembly ]-----
forth_core_colon ; ( C: "<spaces>name" -- colon-sys ) E ( i*x -- j*x )
fdb forth_core_two_swap
fdb .xt - .name
name fcc ":"
xt fdb .body
body ... ; I'll get to this bit of the code in a later post

runtime pshs y
leay 2,x
ldx ,y++
jmp [,x]
-----[ END OF LINE ]-----

Here, the runtime will push the Forth IP (which is the Y register) onto the
return stack, set the Y register to the body of the word being executed and
that two instruction sequence that goes to the next word to execute.

And the function EXIT looks like:

-----[ Assembly ]-----
forth_core_exit ; E ( -- ) ( R: nest-sys -- )
fdb forth_core_execute
fdb _NOINTERP :: .xt - .name
name fcc "EXIT"
xt fdb .body
body puls y ; restore the Forth IP
ldx ,y++ ; and execute next word
jmp [,x]
-----[ END OF LINE ]-----

The first thing to notice here is the _NOINTERP flag. EXIT [8] is defined as
having no interpretation semantics, so typing EXIT outside of a word being
defined is meaningless. Yes, this flag is used, and it does generate an
error, but that again, is a later post. I should also mention that the ::
here is a special operator in my assembler [9]. The left hand side defines
the most significant byte of a 16-bit quantity, and the right hand side
defines the least significant byte. It's short hand for _NOINTERP * 256 +
(.xt - .name).

The second thing to notice is that the Forth IP (again, the Y register) is
pulled from the return stack, and we yet again, run that two instruction
sequence to run the next word.

And this is pretty much the entire execution engine of Forth.

In fact, I wrote this bit first, even before writing code to compile Forth
(and yes, I hand-compiled all Forth code, so I had something to compare
against when I eventually got to compiling).

So on the one hand, Forth is an easy language to implement and can be quite
small. On the other hand, ANS Forth has some subtle semantics that make for
some … interesting implementation details and isn't that easy to implement,
as we shall see over the coming posts.

[1] https://forth-standard.org/standard/words
[2] https://forth-standard.org/standard/core
[3] https://github.com/spc476/ANS-Forth
[4] gopher://gopher.conman.org/0Phlog:2025/06/02.1
[5] https://en.wikipedia.org/wiki/Motorola_6809
[6] https://martinfowler.com/bliki/TwoHardThings.html
[7] https://forth-standard.org/standard/alpha
[8] https://forth-standard.org/standard/core/EXIT
[9] https://news.ycombinator.com/item?id=43815511
---

Discussions about this page

The basics of an indirect threaded code ANS Forth implementation | Hacker News
https://news.ycombinator.com/item?id=44188097

Lazy Reading for 2025/06/08 – DragonFly BSD Digest
https://www.dragonflydigest.com/2025/06/08/lazy-reading-for-2025-06-08/

Email author at [email protected]