Janus

janus.ft.md


Janus

Janus is a Forth meta compiler. As a meta compiler, it is written in Forth to produce another Forth compiler. Janus has been tested with GForth as the host compiler and can so far compile CoreForth-0 and eForth.

Good background reading about meta compilers is Meta eForth DTC and Embed Forth, Janus takes inspiration from both projects.

Wordlists

Janus performs meta compilation by using Forth wordlists to generate a target image. The three wordlists are host, meta and target.

host wordlist

The host wordlist is the outer wordlist, active when Janus starts, containing words which are used to interact with the host system, such as file I/O. It also contains all words which handle the target image, e.g. adding words to the target image, or compiling data into the target image.

meta wordlist

The meta wordlist is active during processing of the target Forth. It contains immediate words which handle for example compilation of control flow. In a non-meta Forth, these words would be defined in terms of the target Forth, but in a meta compiled Forth setup, these words are based on the host Forth to build the target image.

target wordlist

The target wordlist contains the dictionary of the target words as they are created while processing the target Forth. These words will when called compile a call to themselves into the target code space.

Implementation

Set up the target and meta wordlists. The host wordlist is not explicitly created, the Forth wordlist is used as a simplification.

wordlist constant target-wordlist
wordlist constant meta-wordlist

Create helper words to switch between the word lists. ::host:: will reset the current wordlist to Forth for definitions and compilation. ::meta:: will do additionally add the meta wordlist as the first wordlist to be searched and use for definitions.

::target:: works a bit differently: When it is selected, words will be by default defined in the host (Forth) wordlist. This is however just the default, during meta compilation, target words will be added to the target wordlist (see tcreate below). Words will be searched first in the meta wordlist to handle control structures, and then in the target wordlist to lookup and compile target words into the target image.

: ::host::    only forth definitions ;
: ::meta::    only forth meta-wordlist >order definitions ;
: ::target::  only forth definitions target-wordlist >order meta-wordlist >order ;

Set up the host definitions

Switch to the host worklist:

::host::

Numbers in hex by default:

hex

Load support for writing Intel Hex files, usually used by the target Forth to write out the final image.

intel-hex.ft

Some basic string and path utility words:

utils.ft

Define the different threading types. Note that the support here only goes as far as allowing to set the type via the command line, but the actual implementation is outside of Janus within the target Forth.

0 constant STC
1 constant DTC
2 constant ITC

Define variables holding the command line arguments:

  create source-file $400 allot
variable tethered
  create tether-port $400 allot
  create tether-speed $10 allot
variable threading-type
  create #path $400 allot

       STC threading-type !
         0 tether-port c!
         0 source-file c!

Load the words used for command line argument parsing:

args.ft

Parse the command line arguments and update the variables above:

parse-args

Define the ROM and RAM base addresses. In a pure RAM based system, these would have the same value. These will be set by the target Forth.

        0 value trom
        0 value tram

Define the target Forth cell size, this is also set by the target Forth.

        0 value tcell

Define a value for the exception thrown if at the end of the compilation, the stack is not balanced:

        2 constant non-empty-stack

The target image size is 256k by default:

$00040000 constant target#

Meta compilation helpers

During meta compilation, the target image is compiled similar to a non-meta compiled Forth. Many of the definitions below are implemented in similar terms as in a regular Forth, but in order to avoid name conflicts, almost all are prefixed with t.

The target dictionary pointer, i.e. the address into the target image to which values are written during meta compilation:

variable tdp

The target variable pointer. In systems with a combined code and variable space, this variable is not used.

variable tvp

Declare and set the last pointer for the target to the beginning of the target image.

variable tlast 0 tlast !

Offset of the last allocated user variable

variable tuser 0 tuser !

State of the target compiler, this can be controlled while parsing target words. A state of one means numbers are compiled as literals into the target word, whereas zero leaves the number on the host stack to allow compiling then into the word using comma.

variable tstate

: t[        0 tstate ! ;
: ]t        1 tstate ! ;

Allocate the target image and fill it with 0xFF:

create #target target# allot
       #target target# -1 fill

Convert a target relative address to an index into the target binary:

: >target   #target + trom - ;

Target dictionary pointer

Operations on the target dictionary pointer, similar to their non-meta counterparts.

: there     tdp @ ;
: torg      tdp ! ;
: taligned  tcell 1- + tcell negate and ;
: talign    there taligned tdp ! ;
: tallot    tdp +! ;

Writing data to the target image

All operations use target relative addresses, which are converted to indices to the target image.

: t!        >target tcell 4 = if l! else w! then ;
: tc@       >target c@ ;
: t@        >target l@ ;
: tc!       >target c! ;
: th!       >target w! ;
: t,        there t! tcell tallot ;
: tc,       there tc! 1 tallot ;
: th,       there th! 2 tallot ;
: talloc    trom tram <> if tvp @ t, tvp +!
            else there tcell + t, tallot
            then ;

Finding words

Looking up words in the target wordlist, to e.g. get the body address:

: (tfind)   target-wordlist search-wordlist ;
: tfind     2dup (tfind) 0= if cr type ."  not found" cr 1 throw then
            nip nip >body @ ;
: t'        parse-name tfind ;

Compile the body address as a literal:

: [t']      t' postpone literal ; immediate

Compiling special actions

A target Forth can control with the words below how different key parts of a Forth system are compiled into the target image (the , in the names below indicates a store into the directory). The words are deferred, so that they can be imlpemented by the target Forth. The actual implementation depends also on the threading model.

Compile the action for pushing a constant:

defer t,docon

Compile the code needed when entering (sometimes called DOCOL) and exiting (EXIT) a Forth word:

defer t,enter
defer t,exit

Compile the action for starting a code word, this is only used for ITC threading:

defer t,docode

Compile the action for CREATE which pushes the data field address of the last created word:

defer t,dovar

Compile the actions for entering and exiting an interrupt handler (e.g. saving and restoring CPU state:

defer t,doirq
defer t,irqexit

Compile the NEXT action into the code:

defer t,next

Compile the invocation of another words:

defer t,call

Compile adding or updating a branch target into a word:

defer t,target
defer t!target

Compile the LIT action into the code (pushing the next word to the stack):

defer t,lit

Compile unconditional and conditional branches:

defer t,branch
defer t,?branch

Compile the prelude to a DOES> invocation:

defer t,dodoes

Add the action for a USER variable

defer t,douser

The target Forth can implement different header formats, such as normal headers, or headerless words. The words below interact with the target image: thead creates a word header in the image, tlink> (equivalent to LINK>) calculates the code field address, and timmediate sets the immediate flag of the current word.

defer thead
defer tlink>
defer timmediate

Set default actions for deferred words:

' t, is t,target
' t, is t,call 
' t! is t!target

Word creation

Create a word in the target wordlist using standard CREATE semantics, at creation time, the current pointer into the target image is stored, and at run time, that address will be compiled as a call into the target image.

: (tcreate) create there , does> @ t,call ;
: tcreate   get-current target-wordlist set-current (tcreate) set-current ;

Parse a word, but leave it in the input stream. Keeping the word in the input stream allows the dual word creation (target and target image):

: lookahead >in @ >r parse-name r> >in ! ;

Look up a word in the target wordlist, and compile a call to it into the target image:

: tfwdref   tfind t,call ;

Creates a word in the target dictionary, print its body address and name:

: t:        lookahead 2dup
            thead tcreate
            there hex. space type cr
            ;

End a word definition:

: t;        ;

Utility functions

Some utility functions follow. First a function to create the name part of a word header:

: pack$     dup >r 2dup c! 1+ 2dup + 0 swap ! swap cmove r> ;

Then, hex dump of the target image:

: tdump     #target there dump ;

Finally, a word to list the target wordlist definitions, and a word to show the RAM, ROM and dictionary pointers:

: twords    target-wordlist >order words previous ;
: tstat     trom u. tram u. tdp @ u. tvp @ u. there u. cr ;

Check at the end of the meta compilation if the stack is balanced, then list all defined words:

: final     depth if
              ." Stack not empty: " cr
              .s cr
              non-empty-stack throw
            then
            ;

Meta compiler definitions

Switch to the meta wordlist for the following definitions. The definitions make up the core of the meta compiler.

::meta::

Include the words needed to parse regular and code word definitions:

parse.ft

code.ft

To compile s", a couple of helper strings are useful:

create squote$ char s c, char " c,
create (squote)$ char ( c, char s c, char " c, char ) c,

Helper strings, used for looking up strings in the target wordlist when s" is already defined as a meta compiler word:

: type$     s" type" ;
: abort$    s" abort" ;
: cr$       s" cr" ;

The following words are immediate words, which will create control structures in the target image. They resemble their counterparts of a regular Forth system, except that they interact with the target image via the words defined in the host wordlist above. Some of the words are defined so that they either get compiled into the target word or invoke a host specific word. The behavior is controlled via the tstate variable.

: immediate get-current target-wordlist set-current immediate set-current
            timmediate
            ;

: postpone  parse-name target-wordlist search-wordlist dup 0= throw
            swap >body @ swap
            0< if t,lit t,target s" ,call" tfwdref
               else t,call
               then ;
: ,         tstate @ if s" ," tfwdref else t, then ;
: '         tstate @ if s" '" tfwdref else t' then ;
: create    tstate @ if s" create" tfwdref else t: t,dovar then ;

: if        t,?branch there tcell tallot ;
: else      t,branch there tcell tallot there rot t!target ;
: then      there swap t!target ;
: begin     there ;
: while     t,?branch there tcell tallot swap ;
: again     t,branch t,target ;
: until     t,?branch t,target ;
: repeat    t,branch t,target there swap t!target ;
: ahead     t,branch there tcell tallot ;
: does>     t,dodoes ;
: for       s" >r" tfwdref there ;
: next      s" (next)" tfwdref t,target ;
: aft       drop t,branch there tcell tallot there swap ;
: recurse   tlast @ tlink> t,call ;
: do        s" (do)" tfwdref there ;
: loop      s" (loop)" tfwdref t,?branch t,target ;
: +loop     s" (+loop)" tfwdref t,?branch t,target ;
: user      t: t,douser tuser dup @ dup t, tcell + swap ! s" up" tfwdref ;
: (         [char] ) word drop ;
:         refill drop ;
: [']       t,lit t' t,target ;
: [char]    t,lit char t, ;
: s"        (squote)$ 4 tfwdref
            $22 word dup dup c@ 1+ there #target + trom - swap cmove
            c@ 1+ tallot talign ;
: ."        s" type$ tfwdref ;
: abort"    if ." cr$ tfwdref abort$ tfwdref then ;
: exit      t,exit ;

: constant  t: t,docon t, ;
: variable  t: t,docon tcell talloc ;
: buffer:   t: t,docon talloc ;

Create words in the target image

Below are the words which are actually creating new definitions in the target wordlist and image. The principle is to first generate the target wordlist entry and target image header, compile the needed entry action, and then keep parsing and interpreting words from the input stream.

During parsing, each word will, when executed, compile a call to itself into the target image, or via the words defined above, compile control structures.

IRQ Handlers

IRQ handlers might need different entry and exit actions, thus two words are defined below which provide hooks for this. IRQ handlers are creared by first creating a definition in the target wordlist and image, compiling the code needed for entering an IRQ and then parsing and executing the rest of the input stream.

When ;i is encountered in the input stream, it will add the code for exiting an IRQ handler, finish the definition in the target wordlist, and set the flag for finishing parsing further words.

: i:        t: t,doirq read-word ;
: ;i        t,irqexit t; end-word ;

Regular Forth words

First, define meta: as an alias to the current colon definition start word, used to keep defining words in the meta wordlist.

: meta:     : ;

Colon is now redefined to define words in the target wordlist and image. After creating the definitions in the wordlist and image, compile the enter action (e.g. DOCOL for a DTC Forth), and keep parsing and executing words in the input stream as in the IRQ handler case.

: :         t: t,enter read-word ;

Use the previous colon definition to define the word used for finishing a definition in target wordlist and image, and signal the end of the parsing loop.

meta:       ; t,exit t; end-word ;

::meta::

Compile the target Forth

Load the Forth to be processed. If the source path is a relative path, prepend the current working directory to the source file to create an absolute path.

#path 0 source-file count copy-abs-path included

End compilation

End with the host wordlist active, output some final statistics and stop.

::host::

final

bye