Implementing DOES> in Forth, the entire reason I started this mess

The issue [1] I had with DOES> isn't that it's hard to use; it's that I
had no idea how one would go about implementing it, much like JavaScript
programmers use closures without having to think about how they're
implemented (even if they're aware of closures in the first place). So,
before going into how it works, a sample from _Starting Forth_ [2] is in
order.

-----[ Forth ]-----
: STAR 42 EMIT ;

: .ROW CR 8 0 DO
DUP 128 AND IF STAR ELSE SPACE THEN 2*
LOOP DROP ;

: SHAPE CREATE 8 0 DO C, LOOP
DOES> DUP 7 + DO I C@ .ROW -1 +LOOP CR ;

HEX 18 18 3C 5A 99 24 24 24 SHAPE MAN
-----[ END OF LINE ]-----

The first two words support the example. The first, STAR, just prints an
asterisk (42 is the ASCII code for that character). The second, .ROW, takes
an 8-bit value and for each bit, starting with the most significant, prints
an asterisk if it's a 1 and a space otherwise. DO LOOP is Forth's for loop,
by the way. The next word, SHAPE, is the interesting one. But first, we need
to discuss CREATE.

This word creates a new entry in the Forth dictionary by reading the next
word (defined as a collection of non-space characters) in the input as the
name. It then gives the newly created word a default action of pushing the
address of the body of the word onto the stack. Going ahead a bit, the word
MAN, just after CREATE has run, will look like this (in assembly):

-----[ Assembly ]-----
man         fdb     shape           ; link to next word
            fdb     .xt - .name
.name       fcc     'man'
.xt         fdb     forth_core_create.runtime
.body
-----[ END OF LINE ]-----

When MAN is run, the address of .body will be pushed onto the stack. CREATE
is typically used to create “smart data structures”—data structures that know
how to do some action.
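
As a small illustration of that default action, here is a sketch that should
work on any standard Forth. The name COUNTER is made up for this example and
is not part of the _Starting Forth_ sample:

-----[ Forth ]-----
DECIMAL
CREATE COUNTER 0 ,   \ new word COUNTER with one cell of body, set to 0
COUNTER @ .          \ COUNTER pushes its body address; this prints 0
5 COUNTER !          \ store 5 into the body
COUNTER @ .          \ this prints 5
-----[ END OF LINE ]-----

All COUNTER itself ever does is push an address; everything else is done by
the words that follow it.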

Now, getting back to the example, when SHAPE is run, the first thing it does
is call CREATE to create a new word, then it compiles 8 values off the top of
the stack into the body of the newly created word. Just prior to DOES>, MAN
will look like:

-----[ Assembly ]-----
man         fdb     shape           ; link to next word
            fdb     .xt - .name
.name       fcc     'man'
.xt         fdb     forth_core_create.runtime
.body       fcb     $24
            fcb     $24
            fcb     $24
            fcb     $99
            fcb     $5A
            fcb     $3C
            fcb     $18
            fcb     $18
-----[ END OF LINE ]-----

Now we get to DOES>. Due to the nature of what it does, DOES> is an immediate
word; that is, it executes during compilation to do the voodoo that it do.
Um, does. Somehow, it needs to modify the newly created word to not only push
the address of its body onto the stack, but also execute the code that
appears after DOES>. So the code to be executed needs to be compiled and
stored somewhere, and somehow MAN (in this example) needs to run this code.
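
To make the requirement concrete, here is a smaller defining word with the
same structure as SHAPE. It is only a sketch for a standard Forth; CONST and
ANSWER are names invented for this example:

-----[ Forth ]-----
DECIMAL
: CONST ( n "name" -- ) CREATE , DOES> ( addr -- n ) @ ;
42 CONST ANSWER      \ CREATE makes ANSWER, and , stores 42 in its body
ANSWER .             \ the body address is pushed, then @ runs; prints 42
-----[ END OF LINE ]-----

Using DOES> this way is easy. What is not obvious from the outside is how
ANSWER, which was built by CREATE, ends up running the @ that was compiled
into CONST.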

And this was the problem I had with the word—how does this all work? Even the
well known JonesForth [3], implemented as an ITC (Indirect Threaded Code),
didn't bother with implementing DOES> [4] (and now that I have implemented
DOES>, I suspect I know why JonesForth didn't implement it).

The runtime portion of CREATE just pushes the address of the body of the word
onto the stack. The data bytes following the xt have no meaning in and of
themselves (even as code it's nonsensical). I did a search and found only one
page [5] that describes how to implement DOES>, but:

1. it is part three of a series of articles describing how Forths are
implemented;
2. it uses terminology no longer used by the ANS Forth standard;
3. it attempts to describe how to implement Forth on several different CPU
(Central Processing Unit) architectures;
4. it covers a few different methods (like ITC, DTC (Direct Threaded Code) and
STC (Subroutine Threaded Code));
5. and on this page, it takes a weird side trip through another Forth word,
;CODE.

It wasn't exactly an easy source to read, but between part three and part one
[6], I was able to puzzle it out (and it makes much more sense now that I've
done it). Now I can describe the result using a single architecture (6809)
and a single implementation (ITC). The trick here is to realize that DOES>
has a temporal aspect unlike any other Forth word.

Most immediate words in Forth have two temporal aspects—at the time of
compilation, and later at runtime. For instance, IF's compile time aspect is
to compile a conditional jump into the word, and the runtime aspect is to do
said conditional jump (at least, it does so in my implementation). But DOES>
has three temporal aspects:

-----[ Forth ]-----
: SHAPE CREATE ...a DOES> ( time 1 ) ...b ;
...c SHAPE MAN ( time 2 )
MAN ( time 3 )
-----[ END OF LINE ]-----

At time 1, we are compiling a word that creates other words (so at this
point, CREATE is compiled, not run). The compiler looks up DOES>, notices
that it's an immediate word and executes it. DOES> at this point needs to
include code to cause SHAPE to stop executing, then somehow leave … something
… behind for time 2, and somehow compile the rest of the code ...b for later
execution.

At time 2, we're defining a new word. CREATE has been called and the
initialization code for this new word, ...a, has been executed. At this point,
DOES> needs to modify the new word … somehow … to execute the code that
followed it at time 1.

And at time 3, the word created is run and somehow, it needs to know where
the code to run is located. But going back to what CREATE and the
initialization code left us:

-----[ Assembly ]-----
man         fdb     shape           ; link to next word
            fdb     .xt - .name
.name       fcc     'man'
.xt         fdb     forth_core_create.runtime
.body       fcb     $24
            fcb     $24
            fcb     $24
            fcb     $99
            fcb     $5A
            fcb     $3C
            fcb     $18
            fcb     $18
-----[ END OF LINE ]-----

What can be done?

The easy answer: DOES> updates the xt of the newly created word at time 2.
Where is this xt created? At time 1. And when is it used? At time 3.

Here's what happens.

DOES> is an immediate word. When it runs at time 1, it compiles into the
current word (in this example, SHAPE) the xt of its runtime. So SHAPE will
look like this:

-----[ Assembly ]-----
shape       fdb     dot_row         ; link to next word
            fdb     .xt - .name
.name       fcc     'shape'
.xt         fdb     forth_core_colon.runtime
            fdb     forth_core_create.xt
            fdb     forth_core_literal.runtime_xt
            fdb     8
            fdb     forth_core_literal.runtime_xt
            fdb     0
            fdb     forth_core_do.runtime_xt
.L1         fdb     forth_core_literal.runtime_xt
            fdb     128
            fdb     forth_core_and.xt
            fdb     forth_core_if.runtime_xt
            fdb     .L2
            fdb     dot_row.xt
            fdb     forth_core_ext_again.runtime_xt
            fdb     .L3
.L2         fdb     forth_core_space.xt
.L3         fdb     forth_core_two_star.xt
            fdb     forth_core_loop.runtime_xt
            fdb     .L1
            fdb     forth_core_drop.xt
            fdb     forth_core_does.runtime_xt
-----[ END OF LINE ]-----

(Note: here you can see that literal numbers compile to the LITERAL runtime
action followed by the value, and that IF compiles to its runtime action
followed by a branch target. There are two Forth words that pretty much do
the same thing: AHEAD does an unconditional branch forward, and AGAIN does an
unconditional branch backwards. Since both are just an unconditional branch,
I picked one to handle both internally, and I picked AGAIN. More on this in a
later post.)

To create the new xt that words created by SHAPE (or by any word that
includes DOES>) will use, DOES> then lays out a single instruction, JSR
forth_core_create.does_hook (more on this in a bit). It then exits, keeping
the compiler “on” so the rest of the code that follows DOES> gets compiled
into the word (SHAPE in this case). This is all DOES> does (man, that sounds
weird) at time 1. At the end, SHAPE looks like:

-----[ Assembly ]-----
shape       fdb     dot_row         ; link to next word
            fdb     .xt - .name
.name       fcc     'shape'
.xt         fdb     forth_core_colon.runtime
            fdb     forth_core_create.xt
            fdb     forth_core_literal.runtime_xt
            fdb     8
            fdb     forth_core_literal.runtime_xt
            fdb     0
            fdb     forth_core_do.runtime_xt
.L1         fdb     forth_core_literal.runtime_xt
            fdb     128
            fdb     forth_core_and.xt
            fdb     forth_core_if.runtime_xt
            fdb     .L2
            fdb     dot_row.xt
            fdb     forth_core_ext_again.runtime_xt
            fdb     .L3
.L2         fdb     forth_core_space.xt
.L3         fdb     forth_core_two_star.xt
            fdb     forth_core_loop.runtime_xt
            fdb     .L1
            fdb     forth_core_drop.xt
            fdb     forth_core_does.runtime_xt

.does       jsr     forth_core_create.does_hook ; !!!

            fdb     forth_core_dupe.xt
            fdb     forth_core_literal.runtime_xt
            fdb     7
            fdb     forth_core_plus.xt
            fdb     forth_core_do.runtime_xt
.L4         fdb     forth_core_i.xt
            fdb     forth_core_c_fetch.xt
            fdb     dot_row.xt
            fdb     forth_core_literal.runtime_xt
            fdb     -1
            fdb     forth_core_ext_plus_loop.runtime_xt
            fdb     .L4
            fdb     forth_core_c_r.xt
            fdb     forth_core_exit.xt
-----[ END OF LINE ]-----

Now we execute SHAPE. Things go along until we get to
forth_core_does.runtime_xt. At this point, the Y register is pointing to the
JSR forth_core_create.does_hook (see the previous installment for why this is
[7]; to recap, the Y register is the Forth IP (Instruction Pointer)). We
fetch the location of the xt of the newly created word (and yes, I had to
modify CREATE to stash this for later use) and replace the default xt stored
there with the address in Y. At this point, MAN now looks like:

-----[ Assembly ]-----
man         fdb     shape           ; link to next word
            fdb     .xt - .name
.name       fcc     'man'
.xt         fdb     shape.does
.body       fcb     $24
            fcb     $24
            fcb     $24
            fcb     $99
            fcb     $5A
            fcb     $3C
            fcb     $18
            fcb     $18
-----[ END OF LINE ]-----

Then the DOES> runtime basically does a Forth return, ending the execution of
SHAPE. Thus ends the steps that happen at time 2.
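
One way to check that only the xt changed is to look at the body from the
outside. The following is a sketch that assumes the system provides the
standard words ' (tick) and >BODY, neither of which has been shown in this
series; it is meant to be run after MAN has been defined:

-----[ Forth ]-----
HEX
' MAN >BODY C@ .       \ first byte of the body; prints 24
' MAN >BODY 7 + C@ .   \ last byte of the body; prints 18
-----[ END OF LINE ]-----

Both bytes are what the C, loop stored at time 2, in the same order as the
listing above. DOES> changed what MAN does, not what MAN contains.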

When MAN executes, it executes JSR forth_core_create.does_hook. This is a
small extension to forth_core_create that does the double duty of pushing the
address of the body onto the stack, and setting things up to run the Forth
code compiled just after that instruction:

-----[ Assembly ]-----
forth_core_create
            fdb     forth_core_c_r
            fdb     .xt - .name
.name       fcc     "CREATE"
.xt         fdb     .body
.body       ...                     ; not important right now

.does_hook  puls    d               ; pull return address off the stack
            pshs    y               ; push Forth IP onto return stack
            tfr     d,y             ; point to DOES> code
.runtime    leax    2,x             ; get body from xt
            pshu    x               ; push onto the stack
            ldx     ,y++            ; NEXT
            jmp     [,x]
-----[ END OF LINE ]-----

The forth_core_create.does_hook pulls the return address (left by the JSR
instruction) off the stack; this points to the Forth code after DOES> that
needs to run. We then push the existing Y register onto the return stack,
then set Y to the Forth code to execute. This leads right into
forth_core_create.runtime, which pushes the address of the body of the word
(in this case, MAN) onto the stack, and then jumps into the code following
the DOES>.

And at the end of all this, you get:

-----[ Forth ]-----
MAN
   **
   **
  ****
 * ** *
*  **  *
  *  *
  *  *
  *  *
OK
-----[ END OF LINE ]-----
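
Nothing here is specific to MAN. As a sketch, a second shape can be defined
the same way; BOX and its eight bytes are made up for this example and are
not from _Starting Forth_:

-----[ Forth ]-----
HEX
FF 81 81 81 81 81 81 FF SHAPE BOX   \ time 2 again, for a new word
BOX                                 \ time 3: draws a hollow 8 by 8 square
-----[ END OF LINE ]-----

The xt of BOX points at the same JSR inside SHAPE that the xt of MAN does, so
the code after DOES> exists only once no matter how many shapes are defined.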

I suspect the reason why JonesForth didn't implement DOES> has to do with the
direct subroutine call in the middle of a Forth word. This only works if
memory is both writable and executable, and modern systems tend to disallow
that. There might be a way around this, but I haven't yet bothered to figure
it out. I'm just happy to have figured it out as it is.

[1] gopher://gopher.conman.org/0Phlog:2025/06/02.1
[2] https://www.forth.com/starting-forth/11-forth-compiler-defining-words/
[3] http://git.annexia.org/?p=jonesforth.git;a=summary
[4] http://git.annexia.org/?p=jonesforth.git;a=blob;f=jonesforth.f;h=5c1309574ae1165195a43250c19c822ab8681671;hb=HEAD#l1769
[5] https://www.bradrodriguez.com/papers/moving3.htm
[6] https://www.bradrodriguez.com/papers/moving1.htm
[7] gopher://gopher.conman.org/0Phlog:2025/06/04.1