
STM8 eForth Word List Extensions

Thomas edited this page Oct 30, 2020 · 12 revisions

Word List and VOC Extensions for STM8 eForth

At the 2018 Convention of the German Forth Society, Manfred Mahlow gave the talk "Forth WORDLISTs im Flash", where he presented a lightweight Word List implementation using wid-tags in Flash memory for Mecrisp-Stellaris and STM8 eForth. According to the author, "word lists" in ROM were inspired by noForth, an MSP430-oriented µC-Forth.

This wiki page provides an overview of the implementation for STM8 eForth. From STM8 eForth release 2.2.21 on, the Word List feature is delivered as a set of words in the lib folder that can be loaded with e4thcom or with codeload.py.

Word List Kernel Changes

The word CURRENT builds the infrastructure for "word list" extensions. With the help of the e4thcom (or codeload.py) dependency management feature #require, CURRENT patches and extends an existing STM8 eForth binary (note that patch-points are exported by the STM8 eForth kernel and test automation in STM8 eForth ensures the integrity of the patch).

Traditionally, contexts / word lists / namespaces are implemented in Forth as linked lists. This is memory efficient and easy to do for traditional Forth systems that keep the dictionary in RAM. For small embedded Forth systems that keep a unified dictionary in Flash this isn't the best option. A single linked list is easier to implement in a Flash-based embedded Forth system, since adding and forgetting words doesn't require freeing up memory inside a list.

Instead, the CURRENT patch adds a "word list identifier" (the wid-tag) that indicates the word list a dictionary entry belongs to.

In order to keep compatibility with the dictionary in the ROM, a flag in the word header (akin to the IMMEDIATE or the COMPONLY flags) indicates that a word has a wid field. Like traditional Forth systems, the new "word list" implementation has a default word list called FORTH. Words without a wid field belong to the FORTH word list. Likewise, the wid field value 0 (zero) stands for the FORTH word list. Any other unique non-zero number can be used as a wid-tag.

The Word List feature extends the dictionary with a wid (Word List Identifier) and a tag-bit in the length-encoding header byte in the following way:

tag-bit t=0 (untagged)
               |Link|i,c,t,Length|Name|Body|

tag-bit t=1 (tagged)
           |wid|Link|i,c,t,Length|Name|Body|
i: IMMED, c: COMPO, t:TAGGE

The lexicon mask constants, used e.g. in find, now have the following values:

ID      Mask     Comment
IMEDD   0x80     lexicon immediate bit
COMPO   0x40     lexicon compile only bit
TAGGE   0x20     lexicon tag bit (previously unused in STM8 eForth)
MASKK   0x1F7F   lexicon bit mask

Technically "tagging a word" means "compile the wid field in front of the word's link-field and set the lexicon tag bit (TAGGE = 0x20) in the count byte at the word's name address".
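The header layout can be made concrete with two small helper words. This is only a sketch: tag-bit, tagged? and wid@ are hypothetical names that are not part of the library, and it assumes that na is the name address (the count byte) and that the link and wid fields are 2-byte cells, so that the wid sits 4 bytes in front of the count byte. How a name address is obtained is left out here.

```forth
\ Sketch only: hypothetical helper words, not part of the CURRENT library.
\ Assumes na points at the count byte and link and wid are 2-byte cells.
$20 CONSTANT tag-bit             \ same value as TAGGE in the mask table
: tagged? ( na -- n )            \ non-zero if the entry has a wid field
  C@ tag-bit AND ;
: wid@ ( na -- wid )             \ wid of an entry, 0 (FORTH) if untagged
  DUP tagged? IF 4 - @ ELSE DROP 0 THEN ;
```

For an untagged entry wid@ never reads the bytes in front of the link field, which matches the rule that words without a wid field belong to FORTH.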

Loading the CURRENT library file with #require does the following:

  • define the variable CURRENT that holds the wid for new definitions ("0" for word list FORTH)
  • extend $,n to add a wid field and set the tag-bit of a new dictionary entry, unless CURRENT equals 0
  • redefine CONTEXT as an array
  • extend find and WORDS with tag-bit evaluation
  • make NAME? respect a search order of tagged and untagged words
  • make UNIQUE? search only in compilation context

This extension adds 310 bytes to the Flash size, and requires 6 bytes of RAM. Applying CURRENT patches the existing binary and then applies PERSIST to protect the changes (otherwise a RESET would render the patched Forth kernel unusable).

The optional library word FORTH sets the variable CONTEXT to 0 and thus provides a functionality well known from traditional Forth systems.
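In terms of the variables introduced above, the effect of FORTH can be sketched as follows. The two words are hypothetical stand-ins for illustration, not the library source; forth-only additionally resets CURRENT so that new definitions go to the FORTH word list again.

```forth
\ Sketch only: hypothetical stand-ins, requires the CURRENT extension.
: forth-context ( -- )           \ search order back to FORTH
  0 CONTEXT ! ;
: forth-only ( -- )              \ also compile new words into FORTH
  0 CURRENT !  0 CONTEXT ! ;
```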

Using the Word List Feature

Manfred Mahlow implemented 3 different flavors of using word lists in STM8 eForth: basic WORDLIST, traditional Forth-83 style VOCABULARY, and the novelty VOC.

Basics with WORDLIST

With WORDLIST, word list identifiers are applied by directly manipulating CURRENT and CONTEXT.

The following example session demonstrates this feature:

#require WORDLIST
WORDLIST CONSTANT wid0 ok
wid0 CURRENT ! ok
CURRENT ? -27611 ok
: .S ( -- ) ."  This is .S in the word list wid0 " .S ; ok
words
 wid0  WORDLIST ?RAM CURRENT CONTEXT find IRET SAVEC ... 
.S
 <sp  ok

The new word WORDLIST creates a unique ID (wid) for a word list (i.e. the address of a byte in the Flash dictionary). It is stored as the constant wid0, which is then assigned to CURRENT. The following re-definition of .S uses that wid, with the effect that the new definition is not shown by WORDS and can't be found (.S still runs the original code).

Assigning wid0 to CONTEXT changes that:

wid0 CONTEXT ! ok
.S This is .S in the word list wid0 
 <sp  ok
WORDS
 .S ok

The new word list is now on top of the dictionary search order: the new .S is visible and active, and it hides the word .S in the FORTH core. WORDS only ever presents the words in the CONTEXT word list, so the new .S is the only thing it shows here.

By resetting CONTEXT to 0 (that's the FORTH context), the original search order is restored.

0 CONTEXT !
WORDS         
 wid0  WORDLIST ?RAM CURRENT CONTEXT find IRET ...
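
The same mechanism can keep two definitions with the same name apart. The following is a minimal sketch, assuming that every call of WORDLIST returns a distinct wid; the names wid-a, wid-b and hello are made up for this illustration.

```forth
\ Sketch only: wid-a, wid-b and hello are hypothetical names.
#require WORDLIST
WORDLIST CONSTANT wid-a
WORDLIST CONSTANT wid-b
wid-a CURRENT !                  \ compile into word list wid-a
: hello ( -- ) ."  hello from a " ;
wid-b CURRENT !                  \ compile into word list wid-b
: hello ( -- ) ."  hello from b " ;
0 CURRENT !                      \ compile into FORTH again
wid-a CONTEXT !  hello           \ should run the definition in wid-a
wid-b CONTEXT !  hello           \ should run the definition in wid-b
0 CONTEXT !                      \ FORTH only: hello is no longer found
```

Since CONTEXT stays at FORTH while the two definitions are compiled, core words such as ." are still found during compilation.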

Old School VOCABULARY

The traditional method emulates the Forth-83 style words VOCABULARY, DEFINITIONS and ORDER.

Installation of VOCABULARY on a fresh STM8 eForth binary also brings some helper words:

#require VOCABULARY
\ ...
WORDS
  ORDER .VOC NVM VOCABULARY DEFINITIONS FORTH WIPE CURRENT CONTEXT find IRET ...

The following words are provided:

  • a new NVM prevents compiling code to Flash while a vocabulary in RAM is active (CURRENT > 0)
  • ORDER presents the search order for vocabularies defined by CONTEXT
  • .VOC shows the name of a vocabulary, e.g. CURRENT @ .VOC.

The following session demonstrates the VOCABULARY feature:

VOCABULARY myvoc ok
myvoc ok
words
 ok
ORDER myvoc FORTH ok
DEFINITIONS
: .S ( -- ) ."  This is .S in myvoc " .S ; ok
words
 .S ok
.S This is .S in myvoc 
 <sp  ok
FORTH ok
.S
 <sp  ok
ORDER FORTH FORTH ok

Note that VOCABULARY, unlike the basic WORDLIST feature, can also be used with temporary vocabularies in RAM.
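
A compact variant of the session above that also uses the helper words could look like this. It is a sketch under the assumption that FORTH DEFINITIONS resets both the search order and the compilation vocabulary; demo and hello are hypothetical names, and no output is shown because it depends on the system state.

```forth
\ Sketch only: demo and hello are hypothetical names.
#require VOCABULARY
VOCABULARY demo
demo DEFINITIONS                 \ search demo first, compile into demo
: hello ( -- ) ."  hello from demo " ;
CURRENT @ .VOC                   \ name of the compilation vocabulary
ORDER                            \ current search order
FORTH DEFINITIONS                \ back to FORTH
```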

Namespace Prefixes with VOC

VOC brings the idea of "namespace prefixes" to embedded Forth. The rationale is to keep the vocabulary uncluttered and the source code readable, while using short and targeted words (e.g. keep using C! for writing a byte to a serial EEPROM).

It's even possible to create a hierarchy of namespaces by defining a VOC prefix within another "namespace":

#require i2c
i2c DEFINITIONS
VOC eeprom  i2c eeprom DEFINITIONS
  $50 CONSTANT sid
  : C@ ( a -- c )  1 i2c eeprom sid i2c read ;
  : C! ( c a -- )  2 i2c eeprom sid i2c write ;
FORTH DEFINITIONS

The prefixes i2c and eeprom are IMMEDIATE (i.e. executed while compiling). In the definition of C! above, the literals 2 and sid, and the word write get compiled to the Flash ROM (NVM). By defining different prefixes an I2C library can have "branches" and "leaves" for different I2C devices (e.g. i2c rtc, i2c port, etc).

The new words C@ and C! can now be used like this:

$AA 200 i2c eeprom C!
200 i2c eeprom C@ .
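
Following the same pattern, a second "leaf" could be added for a real-time clock. This is only a sketch that mirrors the eeprom example: the prefix name rtc is one of the examples mentioned above, while the device address $68 and the use of i2c read with the same stack effect as in the eeprom example are assumptions.

```forth
\ Sketch only, mirroring the eeprom example; $68 is an assumed address.
#require i2c
i2c DEFINITIONS
VOC rtc  i2c rtc DEFINITIONS
  $68 CONSTANT sid               \ separate from sid in i2c eeprom
  : C@ ( a -- c )  1 i2c rtc sid i2c read ;
FORTH DEFINITIONS

0 i2c rtc C@ .                   \ read register 0 of the assumed RTC
```

Both leaves define their own sid and C@ without a name clash, since each definition is tagged with the wid of its own namespace.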

The VOC front-end and the generic CURRENT infrastructure require a total of 599 bytes of Flash and 8 bytes of RAM. For applications in STM8 "Low density" devices this overhead might be too high, but for STM8 "Medium" or "High density" devices (with 32K Flash ROM) please go ahead and run #require VOC in e4thcom after flashing the image. It's really worth it!