Devlog 29 What Kind Of Threads
January 02, 2023
Happy New Year! In my last post, I was clearly confused about the type of jumps (direct/indirect) my Forth was making, and in fact I kinda of still have no idea what I’m talking about. In this session I’ll try to clear up a few things and hopefully set myself in the right direction for completing this Forth.
What kind of threads
The main question I’ve been asking myself since I first started writing this Forth is: “What kind of threading does it use?”. There’s a few types discussed in detail in Moving Forth by Brad Rodriguez:
- Indirect Threaded Code (ITC), with address codefields
- Direct Threaded Code (DTC), with
- Subrouting Threaded Code (STC), with
- Token Threaded Code (TTC), using a token lookup table
And probably a bunch of other hybrid approaches in between (ex: Segment Threaded Code).
After some careful reading over the last few days, I realized derzforth is designed using
ITC but performing double-indirect jumps, sectorforth is designed using
DTC which literally jumps directly to an address in a register but also behaves as an
ITC for colon definitions, and jonesforth is designed using
ITC and a indirect jumps as well.
In the end, everyone’s use-case is different, but Brad Rodriguez highlighted something important:
The only way to know for sure is to write sample code
So I’m going to take that approach and test different types. I’ll start with adjusting my code for
DTC with jump instructions, since I think I’m already halfway there.
Adjusting for DTC
The first step is to add one level of indirection when executing a word. We don’t want to jump directly to the address stored in the
s1) register. We want to jump to the address pointed by it, so in
execute we’ll make the following change:
- la s1, ok # load the address of the interpreter into the IP register + la s1, .loop # load the address of the interpreter into the IP register
and then outside of that function we’ll add a memory address for that:
.loop: .word ok # indirect jump to interpreter after executing a word
Now I’m not sure yet if this is exactly what should happen. At the moment it is jumping to the
ok function, which will print
ok and reset the
TIB before jumping back to the
interpreter_start function. Perhaps I don’t want to reset the
TIB just yet.
The next step is to modify the
NEXT macro so it actually performs that indirect jump:
- mv a0, s1 # load memory address from IP into W + lw a0, 0(s1) # load memory address from IP into W
Without this change, we would be jumping to the address of
.loop instead of the address pointed to by
ok). This is important because
COLON defined words will also need this indirection. Let’s modify
COLON while we’re at it:
- la a2, docol # load the codeword address into Y working register + la a2, .addr # load the codeword address into Y working register
It’s jumping to a different address, which I defined below:
.addr: .word docol # indirect jump to docol from a colon definition
This is similar to the
execute function and will allow
NEXT to behave the same way for both types of words (primitives and colon words). At least, that’s what I understood so far.
I can confirm this by typing
latest<Enter> in the terminal:
and then inspecting the
s3) register to see what it contains:
(gdb) i r s3 s3 0x80004f0 134218992 (gdb) x/xw 0x080004f0 0x80004f0 <word_SEMI>: 0x080004e4
Perfect! It contains the memory address which points to the
LATEST primitive word defined in the
I believe the above implemention is not complete, and it’s likely linked to incorrect assignment to the
IP instruction pointer, or the resetting of the
TIB (which makes no sense after only executing 1 word)… In the next session I’ll focus on fixing that issue before jumping to compile mode.