On Forth lexeme translator #75

Open
ruv opened this issue Sep 12, 2018 · 33 comments
Comments

@ruv
Contributor

ruv commented Sep 12, 2018

@rdrop-exit in #73, on September 3

Yes, Forth already has a handful of simple prefix words that should only be used judiciously and with care. That does not invalidate the point of "Let commands perform themselves", which extols the benefits of not being syntax driven. The Forth approach is more akin to a sophisticated interactive assembler than a traditional parsed language compiler/interpreter.

What does "syntax driven" mean, and why is it bad?

Whatever it was.

  1. The prefix words (or the parsing words) are a kind of syntax (or even grammar).
  2. Forth already has such words and it cannot live without them (for the moment).
  3. Every time a prefix word is used, it prevents the next word from performing itself.
  4. Programmers (the users of a Forth system) need the functionality that is usually achieved via prefix words, but prefix words bring a set of problems (see [ertl98]). Nevertheless, programmers create new prefix words because they don't have (or haven't found) another way to achieve the desired functionality.

If we remove the space between a prefix word and the next word (or replace it with something), we solve all of these problems. This new "word" (a single lexeme) now performs itself. It is not a real word (just as numbers are not), and so it does not have the problems mentioned.

In place of ' something and ['] something (which break copy-paste of code fragments from outside a definition to inside it, and vice versa) we can always use 'something (this is quoting: it prevents something from being executed and returns its xt).
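To make the copy-paste problem concrete, here is a small sketch in standard Forth. The names greet, run-greet and run-next are made up for this illustration.

```forth
: greet ( -- ) ." hello" ;

\ Interpreting: ' parses the next name when ' itself is executed.
' greet execute

\ Compiling: the same fragment has to be rewritten with ['] ...
: run-greet ( -- ) ['] greet execute ;
run-greet

\ ... because a ' inside a definition parses only when the new word runs,
\ taking the name from the input that follows the call:
: run-next ( "name" -- ) ' execute ;
run-next greet
```

A single-lexeme form such as 'greet would read the same in both places, which is the point being made above.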

In place of S" abc" we can use "abc" and forget about the S" word altogether.

In place of to a we could use to->a or ->a or to:a.

Is to:a a special syntax? Perhaps yes, but to the same degree as to a is.

Regarding the Forth text interpreter loop (which the "Let commands perform themselves" tip refers to): it becomes simpler. It doesn't need to know anything even about words and numbers; it just calls the lexeme translator. Handling of words, numbers, strings, quotings, etc. can be added to the lexeme translator as simply as new words are added to a vocabulary.
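As a rough sketch of that idea in standard Forth (Core plus the Search-Order word set): the translator below walks a list of resolvers and stops at the first one that accepts the lexeme. All names here (resolvers, add-resolver, resolve-lexeme, resolve-tick) are hypothetical and are not the API of the linked proposal or of any existing system.

```forth
\ A resolver has the stack effect ( c-addr u -- i*x true | c-addr u false ).

variable resolvers  0 resolvers !      \ head of a linked list of resolvers

: add-resolver ( xt -- )               \ new resolvers are tried first
  here  resolvers @ ,  swap ,  resolvers ! ;

: resolve-lexeme ( c-addr u -- i*x true | c-addr u false )
  resolvers @
  begin dup while                      ( c-addr u node )
    >r  r@ cell+ @ execute             ( i*x true | c-addr u false )
    if  r> drop  true exit  then
    r> @                               ( c-addr u next-node )
  repeat ;                             \ node = 0 doubles as the false flag

\ Resolver for the 'name form: yields the xt of name.
: resolve-tick ( c-addr u -- xt true | c-addr u false )
  dup 2 < if false exit then
  over c@ [char] ' <> if false exit then
  2dup 1 /string forth-wordlist search-wordlist
  if  nip nip true  else  false  then ;

' resolve-tick add-resolver

: hi ( -- ) ." hi there" ;
s" 'hi" resolve-lexeme [if] execute [else] 2drop [then]
```

Because add-resolver puts new entries at the head of the list, the order of registration decides which resolver wins, much like the search order does for wordlists.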

So the discussion is now not about whether it is necessary, but about how best to implement it and which API to choose. Many Forth systems have supported this feature for more than ten years already, and we now need a single unified API. Can anybody suggest improvements in this regard?

Here are my two cents: Lexeme resolver mechanism API.

References

[ertl98] M. Anton Ertl, "State-smartness: Why it is Evil and How to Exorcise it"

@RGD2

RGD2 commented Sep 13, 2018 via email

@rdrop-exit
Member

So now the discussion is going not about whether it is necessary or not, but about how to better implement it, and what API to choose.

Recognizers, and similar proposals to add more parsing machinery to Forth, are not necessary. Excising state-smartness from a Forth in no way requires their adoption, thankfully so as even if accepted into "standard" Forth it would be as an optional (experimental?) extension.

@gordonjcp
Member

In place of S" abc" we can use "abc" and can forget about S" word at all.

But the space between "prefix" words is to let the parser find the end of the word. Now what you're going to have to do is identify the prefix, scan all the way to the end of an arbitrarily long string, match what the word is, put the string somewhere to be worked on, and then do the word.

This makes no sense.

You've now made the parser way more complicated than it needs to be and removed the ability of programmers to start a word with a quote symbol, because that will start the parser off looking for a string to compile.

That idea makes no sense. Why is it a good idea to make the parser harder to write and understand, just to lose a space between an uncommon class of words and their argument? The way it works right now is perfectly okay.

@ruv
Contributor Author

ruv commented Sep 13, 2018

In reply to @gordonjcp

In place of S" abc" we can use "abc" and can forget about S" word at all.

But the space between "prefix" words is to let the parser find the end of the word. Now what you're going to have to do is identify the prefix, scan all the way to the end of an arbitrarily long string, match what the word is, put the string somewhere to be worked on, and then do the word.

That is not the Forth text interpreter's business. If you need it, you do it. If you don't need it, the capability itself does not bother you.

Actually to a is also possible. However, in that case to is not a regular word in the context and cannot be ticked as is, although it can be a regular word (even a non-immediate one) in a special wordlist (and ticked when this wordlist is in the context).

You've now made the parser way more complicated than it needs to be and removed the ability of programmers to start a word with a quote symbol, because that will start the parser off looking for a string to compile.

Nothing is removed. Shadowing depends on the order. Usually you choose (and this is the default) to give higher precedence to the regular words. It is the same as with wordlists: you choose which one comes first in the search order.

That idea makes no sense. Why is it a good idea to make the parser harder to write and understand, just to lose a space between an uncommon class of words and their argument? The way it works right now is perfectly okay.

The way it works right now will continue to work. Nothing is removed. Only a new capability is added; more precisely, this capability is already present in many Forth systems, but a common unified API should be designed and added.

@ruv
Contributor Author

ruv commented Sep 13, 2018

In reply to @rdrop-exit

So now the discussion is going not about whether it is necessary or not, but about how to better implement it, and what API to choose.

Recognizers, and similar proposals to add more parsing machinery to Forth, are not necessary.

As a Forth system user, how would you add support for floating-point numbers without recompiling the Forth system? Or for hex numbers of the form 0x12DF4?

Excising state-smartness from a Forth in no way requires their adoption,

State-smartness becomes a problem when you try to postpone a state-smart immediate word. So the point is to provide another way, one that makes it possible to avoid state-smart immediate words altogether.

thankfully so as even if accepted into "standard" Forth it would be as an optional (experimental?) extension.

Even if this capability is implemented in your Forth system, you (as the Forth system user) are not affected by it if you don't use it.
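To illustrate the 0x12DF4 case at the library level, here is a minimal sketch of the conversion step in standard Forth. The name hex-lexeme>number is hypothetical, and how such a word gets hooked into the text interpreter is exactly the API question under discussion, so that part is left out.

```forth
\ Try to convert a lexeme of the form 0xHHHH into a single-cell number.
\ Digits above 9 are assumed to be upper case, as >NUMBER only guarantees that.
: hex-lexeme>number ( c-addr u -- x true | false )
  dup 3 < if 2drop false exit then                  \ "0x" plus at least one digit
  over 2 s" 0x" compare if 2drop false exit then    \ must start with "0x"
  2 /string                                         \ strip the prefix
  base @ >r hex
  0 0 2swap >number                                 ( ud c-addr' u' )
  r> base !
  nip if 2drop false exit then                      \ leftover characters: reject
  drop true ;                                       \ keep the low cell of ud

s" 0x12DF4" hex-lexeme>number [if] . [then]
```

BASE is saved and restored around the conversion, so the word has no lasting effect on the rest of the system.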

@ruv
Contributor Author

ruv commented Sep 13, 2018

State smartness becomes a problem when you try to postpone a state-smartness immediate word. So the point is to provide another way that allows to not use the state-smartness immediate words at all.

Real code examples:

 : Z" POSTPONE S" POSTPONE DROP ; IMMEDIATE \ get ASCIIZ string

 : .SOP"   ( "NAME" -- ) POSTPONE S" POSTPONE (.SOP") ; IMMEDIATE

These words don't work correctly outside definitions.
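The same effect can be reproduced with a much smaller word. In this sketch five is a deliberately tiny state-smart word invented for illustration; compile-five, uses-five and misses-five are likewise hypothetical names.

```forth
: five ( -- 5 | ) 5 state @ if postpone literal then ; immediate

: uses-five ( -- 5 ) five ;      \ compiling: FIVE appends the literal 5
five .                           \ interpreting: FIVE pushes 5

\ The trouble starts when FIVE is postponed into a non-immediate helper:
: compile-five ( -- ) postpone five ;

\ COMPILE-FIVE runs between [ and ], where STATE is 0, so FIVE pushes 5
\ at compile time (printed here by .) instead of appending a literal.
: misses-five ( -- ) [ compile-five . ] ;
```

After loading this, uses-five leaves 5 on the stack while misses-five leaves nothing, although both were meant to contain the same literal.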

@rdrop-exit
Member

What is your (as a Forth system user) way to add support for the floating point numbers, without recompiling the Forth system? Or the hex numbers in form 0x12DF4?

What is wrong with recompiling a Forth system? That's the whole point of meta-compilation. Personally I'd be very wary of a Forth that doesn't come with full source and a meta-compiler.
I prefer hex numbers in the form $ 12df4.

State smartness becomes a problem when you try to postpone a state-smartness immediate word. So the point is to provide another way that allows to not use the state-smartness immediate words at all.

As I mentioned earlier solving state-smartness issues does not require recognizers.

These words don't work correctly outside definitions.

Nor should they since they define compile-time behavior, and therefore should be compile-only directives.

There are various approaches to having a name result in a non-default combination of compile-time and interpret-time behaviors, e.g. Stephen Pelc's NDCS proposal, various dual-xt approaches, Chuck Moore's approach in cmForth. Addressing state-smartness issues does not require bringing recognizers into the picture, they are superfluous, such needs can and should be addressed at a lower level whether or not the particular Forth implements any optional recognizers extension.

@ruv
Contributor Author

ruv commented Sep 14, 2018

What is your (as a Forth system user) way to add support for the floating point numbers, without recompiling the Forth system? Or the hex numbers in form 0x12DF4?

What is wrong with recompiling a Forth system?

Nothing is wrong with it. It is just convenient to have some features at the library level.

I prefer hex numbers in the form $ 12df4.

Yes, you just don't have another variant.
Conceptually this approach leads to a # prefix for numbers and a CALL prefix for calling a word.

So, let us imagine an interval. At one end, all lexemes are prefixed with prefix words. At the other end, there is no need for prefix words at all. How do we choose the right point in this interval: which lexemes should have a prefix, and which should not?

It seems that your point is: let the implementer decide (i.e. at the level of the Forth system core), and force the user to use prefix words in all other cases, apart from the variants hardcoded in the system.

My point is: let the user decide (i.e. at the level of libraries and applications).

State smartness becomes a problem when you try to postpone a state-smartness immediate word. So the point is to provide another way that allows to not use the state-smartness immediate words at all.

As I mentioned earlier solving state-smartness issues does not require recognizers.

Agreed. The recognizer mechanism just provides the technical capability to also solve state-smartness issues at the user (library) level, even if the Forth system itself does not solve them.

These words don't work correctly outside definitions.

Nor should they since they define compile-time behavior, and therefore should be compile-only directives.

Agreed.
But there is no standard directive for compile-only. All a word can do is throw an exception.
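A minimal sketch of that exception-based workaround in standard Forth follows. The helper name ?comp-only is an assumption; -14 is the standard THROW code for "interpreting a compile-only word". The z" definition mirrors the one quoted earlier, and whether the resulting string is really zero-terminated depends on the system's S".

```forth
: ?comp-only ( -- ) state @ 0= if -14 throw then ;

: z" ( "ccc<quote>" -- ) ( run-time: -- c-addr )
  ?comp-only  postpone s"  postpone drop ; immediate

: demo ( -- c-addr ) z" abc" ;     \ fine: used inside a definition
' z" catch .                       \ interpreting: the throw code is reported
```

The check reads STATE itself, so this guards the word rather than removing state-awareness from it.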

There are various approaches to having a name result in a non-default combination of compile-time and interpret-time behaviors, e.g. Stephen Pelc's NDCS proposal, various dual-xt approaches, Chuck Moore's approach in cmForth.

Moore's cmForth approach can be implemented via recognizers.

Addressing state-smartness issues does not require bringing recognizers into the picture, they are superfluous, such needs can and should be addressed at a lower level whether or not the particular Forth implements any optional recognizers extension.

Agreed. But in cmForth this issue is solved at a relatively high level.

OTOH there is no standard API that addresses state-smartness issues. Probably we should go through the process of designing such an API, similar to what is happening with recognizers.

@rdrop-exit
Member

I prefer hex numbers in the form $ 12df4.

Yes, you just don't have another variant.

I don't understand what you mean by that, one can make as many variants as one wants,
what's to prevent it?

Conceptually this approach leads to # prefix for numbers and CALL prefix to call a word.

In Forth the interpreter looks up and executes words for us, there is no need for a CALL prefix.

But there is no a standard directive for compile-only. Only exception can be thrown.

I assume the new standard will provide a way to designate compile-only definitions as part of whatever solution is adopted by the standard for dealing with non-default combination of compile-time and interpret-time behaviors.

At the moment I use the following approach for designating compile-only definitions:

: foobar .... ; compile-only

If I want a definition to be both immediate and compile-only I use :

: foobar ... ; directive

Where directive is defined as:

: directive ( -- ) immediate compile-only ;

OTOH there is no any standard API that addresses state-smartness issues. Probably we should pass a process of design such API, — similar to the way that is going with recognizers.

The proposals dealing with non-default combination of compile-time and interpret-time behaviors are a way of addressing the state-smartness issues.

@MitchBradley

MitchBradley commented Sep 15, 2018 via email

@ruv
Contributor Author

ruv commented Sep 15, 2018

In reply to @rdrop-exit

I prefer hex numbers in the form $ 12df4.

Yes, you just don't have another variant.

I don't understand what you mean by that, one can make as many variants as one wants,
what's to prevent it?

I mean a prefix parsing word ($ in your case). The only available standard variant is a prefix word. And this word either will be state-smart (bad), or will not work in some states (bad).
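For comparison, the non-state-smart route that standard Forth offers today is a pair of words, in the spirit of ' and [']. This is a minimal sketch: (hex), h# and [h#] are names assumed for illustration, no error checking is done, and PARSE-NAME is taken from the Forth-2012 Core extensions.

```forth
\ Parse the next token and convert it as an unsigned hex number.
: (hex) ( "token" -- u )
  parse-name  base @ >r hex
  0 0 2swap >number 2drop drop
  r> base ! ;

: h#   ( "token" -- u ) (hex) ;                            \ while interpreting
: [h#] ( "token" -- )   (hex) postpone literal ; immediate \ inside definitions

h# 1F .
: mask ( -- u ) [h#] FF ;
mask .
```

Neither word looks at STATE, but a fragment using one of them has to be edited when it is moved between interpreted and compiled code.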

With the recognizer mechanism one more variant is available: numbers of the form $12df4 can be supported alongside the usual numbers. In the NDCS (or a similar) approach, a prefix word (e.g. $) can be defined in a special way without the state-smartness issues.

To my taste, in the case of numbers the form $12df4 is just more convenient than $ 12df4.

What is wrong with having numbers in the form $12df4?

@rdrop-exit
Member

rdrop-exit commented Sep 15, 2018

In reply to @rdrop-exit

I prefer hex numbers in the form $ 12df4.

Yes, you just don't have another variant.

I don't understand what you mean by that, one can make as many variants as one wants,
what's to prevent it?

I mean a prefix parsing word ($ in your case). The only available standard variant is a prefix word. And this word either will be state-smart (bad), or will not work in some states (bad).

That presumably won't be the case once the standard finally addresses the non-default combination of compile-time and interpret-time behaviors.

In the meantime, I have no state-smart issues with words like $ in my Forth:

: $ ( <token> -- u ) ($) ; interpret-only

: $ { <token> -- } ( -- u ) ($) & literal ; directive

(note: & is just an abbreviation I use for postpone)

With recognizers mechanism one more variant is available: the numbers in form $12df4 can be supported along with usual numbers. In NDCS (or alike) approach a prefix word (e.g. $) can be defined in special way without the state-smartness issues.

For my taste, in the case of numbers the form $12df4 is just more convenient than $ 12df4.

What is wrong with having numbers in the form $12df4?

Cosmetically nothing really, de gustibus non est disputandum.
For me personally it's easier to have a sense of what is happening under the hood with my version, $ is my hex number recognizer word, and voilà my hex number recognizing needs are now met, can't get much simpler than that.

@ruv
Contributor Author

ruv commented Sep 16, 2018

The only available standard variant is a prefix word. And this word either will be state-smart (bad), or will not work in some states (bad).

That presumably won't be the case once the standard finally addresses the non-default combination of compile-time and interpret-time behaviors.

Has anybody made a proposal with an explicit specification for this? I'm aware of some articles and papers, but they are not a specification.

In the meantime, I have no state-smart issues with words like $ in my Forth:

: $ ( <token> -- u ) ($) ; interpret-only

: $ { <token> -- } ( -- u ) ($) & literal ; directive

(note: & is just an abbreviation I use for postpone)

And your Forth system chooses one or another depending on state, does it?

How to create an alias h# using these $ words?
I can guess something like:

: h# [ ' $ compile, ] ; interpret-only
: h# postpone $ ; directive

But this approach relies on a rather confusing concept: that FIND can return a different xt depending on STATE. I would prefer a stable FIND that returns the same xt regardless of STATE.

With recognizers mechanism one more variant is available: the numbers in form $12df4 can be supported along with usual numbers. In NDCS (or alike) approach a prefix word (e.g. $) can be defined in special way without the state-smartness issues.
For my taste, in the case of numbers the form $12df4 is just more convenient than $ 12df4.

What is wrong with having numbers in the form $12df4?

Cosmetically nothing really, de gustibus non est disputandum.

For me personally it's easier to have a sense of what is happening under the hood with my version, $ is my hex number recognizer word, and voilà my hex number recognizing needs are now met, can't get much simpler than that.

It seems that if we don't apply postpone to parsing words (or use it in their definitions), we don't face state-smartness issues.

: $ ( <token> -- u ) ($) tt-lit ; immediate

: z"   [compile] s"   ['] drop tt-xt  ; immediate

Can anybody check this idea?

@rdrop-exit
Member

rdrop-exit commented Sep 16, 2018

The only available standard variant is a prefix word. And this word either will be state-smart (bad), or will not work in some states (bad).

That presumably won't be the case once the standard finally addresses the non-default combination of compile-time and interpret-time behaviors.

Did somebody make a proposal with explicit specification on this? I'm aware of some articles and papers, but they are not a specification.

In the meantime, I have no state-smart issues with words like $ in my Forth:
: $ ( <token> -- u ) ($) ; interpret-only
: $ { <token> -- } ( -- u ) ($) & literal ; directive
(note: & is just an abbreviation I use for postpone)

And your Forth system chooses one or another depending on state, does it?

I have no state variable in my Forth, the interpreter would execute the interpret-only one, the compiler would execute the directive one.

How to create an alias h# using these $ words?
I can guess something like:

: h# [ ' $ compile, ] ; interpret-only
: h# postpone $ ; directive

That's one way yes.

But this approach relies on a quite confusing conception that FIND can return different xt depending on STATE. I would prefer a stable FIND that returns the same xt regardless the STATE.

I have no state variable, I use a stable alternative to find which I named lookup.
I use interpretation lookup to find the interpretation xt of a word, and I use compilation lookup to find the compilation xt of a word. For a normal word the xts will be the same, for a word such as $ that has a non-default combination of behaviors they will be different.
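The dual-xt idea can be pictured in standard Forth roughly as follows. This is only an illustrative sketch and not the implementation described above: dual:, int-xt and comp-xt are hypothetical names, and a real system would keep the two xts in the dictionary header rather than in a separate record.

```forth
\ A dual-xt record holds two cells: interpretation xt, then compilation xt.
: dual: ( xt-int xt-comp "name" -- )  create  swap , , ;
: int-xt  ( dual -- xt )  @ ;
: comp-xt ( dual -- xt )  cell+ @ ;

: hello-int  ( -- ) ." run now" ;
: hello-comp ( -- ) ." compile something" ;
' hello-int ' hello-comp dual: hello

\ An interpreter would pick the first xt, a compiler the second;
\ neither behavior has to inspect any global state.
hello int-xt  execute
hello comp-xt execute
```

For a normal word both cells would simply hold the same xt.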

Keep in mind though, that this is how I addressed the state-smartness issues in my personal Forth, the standard committee will address these issues in their own way.

With recognizers mechanism one more variant is available: the numbers in form $12df4 can be supported along with usual numbers. In NDCS (or alike) approach a prefix word (e.g. $) can be defined in special way without the state-smartness issues.
For my taste, in the case of numbers the form $12df4 is just more convenient than $ 12df4.

What is wrong with having numbers in the form $12df4?

Cosmetically nothing really, de gustibus non est disputandum.

For me personally it's easier to have a sense of what is happening under the hood with my version, $ is my hex number recognizer word, and voilà my hex number recognizing needs are now met, can't get much simpler than that.

It seems if we don't apply postpone to parsing words (and in they definition), we don't face state-smartness issues.

: $ ( <token> -- u ) ($) tt-lit ; immediate

: z"   [compile] s"   ['] drop tt-xt  ; immediate

Can anybody check this idea?

There's the question of which xt should ' return, the compilation one or the interpretation one. In my Forth I deal with that issue by having two different ticks so that there's never any confusion, ' always returns the interpretation xt, and ^ always returns the compilation xt.

@paraplegic

paraplegic commented Sep 17, 2018 via email

@ruv
Contributor Author

ruv commented Sep 18, 2018

In reply to @rdrop-exit

In the meantime, I have no state-smart issues with words like $ in my Forth:
: $ ( <token> -- u ) ($) ; interpret-only
: $ { <token> -- } ( -- u ) ($) & literal ; directive
(note: & is just an abbreviation I use for postpone)

I would prefer that the same code fragment can be used outside definitions, inside definitions, inside postponing fragments, and inside macro fragments without changes. This can be achieved via recognizers (or via resolvers in my proposal).

The following example will not work with your approach, but can work via resolvers:

m: $ ($) tt-lit ;  \ NB: even shorter definition
: compile-stuff ]] dup $ AD <> if . else drop then [[ ; 
: test 123 [ compile-stuff ] ;

And your Forth system chooses one or another depending on state, does it?

I have no state variable in my Forth, the interpreter would execute the interpret-only one, the compiler would execute the directive one.

Conceptually the variable name does not matter. It could even be a DEFER that is switched between an interpreter word and a compiler word; it is nevertheless a state in the sense of a finite-state machine.


[...]

can't get much simpler than that.

It seems if we don't apply postpone to parsing words (and in they definition), we don't face state-smartness issues.

: $ ( <token> -- u ) ($) tt-lit ; immediate

: z"   [compile] s"   ['] drop tt-xt  ; immediate

Can anybody check this idea?

There's the question of which xt should ' return, the compile-time one or the interpret-time one.

In the example above I meant only one xt per word and state-smartness.

\ a classic state-smart parsing word.
: $ ( "number" -- u | ) ($) state @ if lit, then ; immediate
\ a wrapper (alias)
: h# [compile] $ ; immediate
\ variation in the wrapper definition
: h# [ ' $ compile, ] ; immediate

My hypothesis is that without using the POSTPONE word you can't demonstrate any state-smartness issue with these words.


An NDCS variant (following Stephen Pelc's paper "Special Words in Forth"):

: $ ( "number" -- u | ) ($) ;  ndcs: ($) lit, ;

\ how to define a wrapper?
\ The following variants will not work
: h# [compile] $ ; immediate
: h# [ ' $ compile, ] ; immediate
: h# postpone $ ; immediate
: h# ['] $ execute ; immediate

How to make a wrapper in this case?

@rdrop-exit
Member

rdrop-exit commented Sep 18, 2018

In reply to @rdrop-exit

In the meantime, I have no state-smart issues with words like $ in my Forth:
: $ ( <token> -- u ) ($) ; interpret-only
: $ { <token> -- } ( -- u ) ($) & literal ; directive
(note: & is just an abbreviation I use for postpone)

I would prefer that a same code fragment can be used outside definitions, inside definitions, inside postponing fragments, inside macro fragments — without changes. This feature can be achieved via recognizers (or via resolvers in my proposal).

Abstracting away the differences between interpretation and compilation is not on my radar. It's not something I would ever pursue in my Forths, so I can't really offer any constructive comments in this regard.

The following example will not work in your approach, but can work via resolvers:

m: $ ($) tt-lit ;  \ NB: even shorter definition
: compile-stuff ]] dup $ AD <> if . else drop then [[ ; 
: test 123 [ compile-stuff ] ;

Could you describe what the point of this code is and why I would want to structure code this way? It's clear as mud to me.

And your Forth system chooses one or another depending on state, does it?

I have no state variable in my Forth, the interpreter would execute the interpret-only one, the compiler would execute the directive one.

Conceptually the variable name does not matter. It could be even a DEFER that is switched among interpreter and compiler word — it is nevertheless a state in the sense of Finite-state machine.

[...]

I don't think you're understanding what is meant by state-smart words in Forth. At any point in time your Forth's outer-interpreter is either in an interpreting state or a compiling state. Obviously the issue is not about eradicating this state in the outer interpreter. The state-smart word issue concerns the problems that can ensue from having a word that, as it executes, alters its behavior based on the then current state of the outer-interpreter. For example : fubar ( -- ) state @ if frobnicate else twiddle then ; immediate would be one such word. I believe the paper by Ertl that you mentioned explains how such words can be problematic.

can't get much simpler than that.

It seems if we don't apply postpone to parsing words (and in they definition), we don't face state-smartness issues.

: $ ( <token> -- u ) ($) tt-lit ; immediate

: z"   [compile] s"   ['] drop tt-xt  ; immediate

Can anybody check this idea?

There's the question of which xt should ' return, the compile-time one or the interpret-time one.

In the example above I meant only one xt per word and state-smartness.

\ a classic state-smart parsing word.
: $ ( "number" -- u | ) ($) state @ if lit, then ; immediate
\ a wrapper (alias)
: h# [compile] $ ; immediate
\ variation in the wrapper definition
: h# [ ' $ compile, ] ; immediate

My hypothesis is that without using POSTPONE word you can't show any state-smartness issue for these words.

An NDCS variant (by Stephen Pelc paper Special Words in Forth )

: $ ( "number" -- u | ) ($) ;  ndcs: ($) lit, ;

\ how to define a wrapper?
\ The following variants will not work
: h# [compile] $ ; immediate
: h# [ ' $ compile, ] ; immediate
: h# postpone $ ; immediate
: h# ['] $ execute ; immediate

How to make a wrapper in this case?

I haven't studied Stephen Pelc's proposal, perhaps he can chime in and answer your questions about it.

@ruv
Contributor Author

ruv commented Sep 19, 2018

I have no state variable in my Forth, the interpreter would execute the interpret-only one, the compiler would execute the directive one.

Conceptually the variable name does not matter. It could be even a DEFER that is switched among interpreter and compiler word — it is nevertheless a state in the sense of Finite-state machine.
[...]

I don't think you're understanding what is meant by state-smart words in Forth. At any point in time your Forth's outer-interpreter is either in an interpreting state or a compiling state. Obviously the issue is not about eradicating this state in the outer interpreter. The state-smart word issue concerns the problems that can ensue from having a word that, as it executes, alters its behavior based on the then current state of the outer-interpreter.

I see. I just meant that a state-smart word can be aware of the current state without a STATE variable.

For example : fubar ( -- ) state @ if frobnicate else twiddle then ; immediate would be one such word. I believe the paper by Ertl that you mentioned explains how such words can be problematic.

The following variant is state-smart as well, although it does not refer to the STATE variable:

: fubar ( -- ) ['] translator defer@ ['] compiler = if frobnicate else twiddle then ; immediate

@ruv
Contributor Author

ruv commented Sep 19, 2018

The following example will not work in your approach, but can work via resolvers:

m: $ ($) tt-lit ;  \ NB: even shorter definition
: compile-stuff ]] dup $ AD <> if . else drop then [[ ; 
: test 123 [ compile-stuff ] ;

Could you describe what the point is of this code is and why I would want to structure code this way? It's clear as mud to me.

It is a matter of metaprogramming.

For example, I have a ?ET word that is defined in the standard way as follows:

\ Return control to the calling definition if the top value is not zero,
\ otherwise drop the top value (that is zero).
: ?ET ( 0 -- | x -- x ) \ exit on true returning this true
  postpone dup postpone if postpone exit postpone then postpone drop
; immediate

Using the ]] and [[ helper words from Gforth (see [1], [2]), it can be defined as follows:

: ?ET ( 0 -- | x -- x )
  ]] dup if exit then drop [[
; immediate

I suggested a slightly more readable variant:

: ?ET ( 0 -- | x -- x )
  postpone{  dup if exit then drop  }postpone
; immediate

With this approach, the fragment dup if exit then drop does not need to be changed in order to be postponed (compare the first, standard variant).

If I need to use numbers or string literals in such fragments, I would prefer to use them "as is", without any special preparation.
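For readers who have not met this kind of macro before, here is a short usage sketch built on the first (standard) definition of ?ET. The word default-to-10 is a hypothetical example, not part of any library.

```forth
: ?ET ( 0 -- | x -- x ) \ exit on true returning this true
  postpone dup postpone if postpone exit postpone then postpone drop
; immediate

\ Compiles to: dup if exit then drop 10
: default-to-10 ( x -- x | 10 )  ?ET 10 ;

5 default-to-10 .
0 default-to-10 .
```

Because ?ET expands in place, the EXIT it compiles leaves default-to-10 itself, which a called helper could only do by manipulating the return stack.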

@rdrop-exit
Member

I have no state variable in my Forth, the interpreter would execute the interpret-only one, the compiler would execute the directive one.

Conceptually the variable name does not matter. It could be even a DEFER that is switched among interpreter and compiler word — it is nevertheless a state in the sense of Finite-state machine.
[...]

I don't think you're understanding what is meant by state-smart words in Forth. At any point in time your Forth's outer-interpreter is either in an interpreting state or a compiling state. Obviously the issue is not about eradicating this state in the outer interpreter. The state-smart word issue concerns the problems that can ensue from having a word that, as it executes, alters its behavior based on the then current state of the outer-interpreter.

I see. I just meant that a state-smart word can be aware of the current state without STATE variable.

For example : fubar ( -- ) state @ if frobnicate else twiddle then ; immediate would be one such word. I believe the paper by Ertl that you mentioned explains how such words can be problematic.

The following variant is state-smart as well, although it does not refer STATE variable:

: fubar ( -- ) ['] translator defer@ ['] compiler = if frobnicate else twiddle then ; immediate

Absolutely, and as you saw $ doesn't do this either.

@rdrop-exit
Member

rdrop-exit commented Sep 19, 2018

The following example will not work in your approach, but can work via resolvers:

m: $ ($) tt-lit ;  \ NB: even shorter definition
: compile-stuff ]] dup $ AD <> if . else drop then [[ ; 
: test 123 [ compile-stuff ] ;

Could you describe what the point is of this code is and why I would want to structure code this way? It's clear as mud to me.

It is subject of metaprogramming.

For example I have ?ET word that is defined in standard way as following:

\ Return control to the calling definition if the top value is not zero,
\ otherwise drop the top value (that is zero).
: ?ET ( 0 -- | x -- x ) \ exit on true returning this true
  postpone dup postpone if postpone exit postpone then postpone drop
; immediate

Unfortunately I'm not really familiar with current standard compliant ways of doing such things, I can only show you how I would do it in my own Forth in the meantime.

My first instinct for a reusable low level word such as this would be to implement it as a primitive, in fact in my current Forth I have such a primitive, I named it 0ditch [edit: oops, strike that, my 0ditch is different from your ?et, it exits on 0].

My second instinct would be to just make a normal word and call it, this definition would work in my current Forth:

: ?et ( 0|x -- |x ) ( r: a -- a| ) ?dup if rdrop then ; compile-only

If for whatever reason I preferred the code to be inlined into another definition rather than called by another definition, I could do it this way:

: ?et ( 0|x -- |x ) ( r: a -- a| ) ?dup if exit then ;inline compile-only

I also have a primitive called ?; which is equivalent to if exit then so another way would be:

: ?et ( 0|x -- |x ) ( r: a -- a| ) ?dup ?; ;inline compile-only

[snip...]

@ruv
Contributor Author

ruv commented Sep 21, 2018

Unfortunately I'm not really familiar with current standard-compliant ways of doing such things, I can only show you how I would do it in my own Forth in the meantime.

My first instinct for a reusable low level word such as this would be to implement it as a primitive, in fact in my current Forth I have such a primitive, I named it 0ditch [edit: oops strike that, my 0ditch is different from your ?et, it exits on 0].

Perhaps your 0ditch is like my ?E0 word (exit on zero returning this zero)?

My second instinct would be to just make a normal word and call it, this definition would work in my current Forth:

: ?et ( 0|x -- |x ) ( r: a -- a| ) ?dup if rdrop then ; compile-only

If for whatever reason I preferred the code to be inlined into another definition rather than called by another definition, I could do it this way:

: ?et ( 0|x -- |x ) ( r: a -- a| ) ?dup if exit then ;inline compile-only

I also have a primitive called ?; which is equivalent to if exit then so another way would be:

: ?et ( 0|x -- |x ) ( r: a -- a| ) ?dup ?; ;inline compile-only

Quite.

Just a side note: RDROP (and co.) may not work for control flow in some Forth systems. There was a proposal, "The Open Interpreter Word Set" by M. L. Gassanenko (1999), that describes a portable approach for Forth systems that can support manipulation of return addresses.

I would guess that even in your Forth system, the first variant of ?et (which uses rdrop for control flow) cannot be used in definitions that may be inlined. Perhaps the same is true of the second and third (if you don't transform exit into a jump to the end).

In-line expansion is a useful capability, but it does not solve the problem of code generation. What if we want to define unless ... then as an equivalent of 0= if ... then? Via postpone it can be defined as:

: unless postpone{ 0= if }postpone ; immediate
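Since postpone{ ... }postpone is not a standard word pair, here is a sketch of the same definition using one standard POSTPONE per word, followed by a small usage check. The word pick-branch is made up for the illustration.

```forth
\ Compiles "0= if" into the current definition.
: unless ( compile time: -- orig ) ( run time: x -- )
  postpone 0= postpone if
; immediate

\ Hypothetical test word: 1 when n is zero, 2 otherwise.
: pick-branch ( n -- m ) unless 1 else 2 then ;

0 pick-branch .   \ prints 1
5 pick-branch .   \ prints 2
```

The source fragment 0= if had to be rewritten word by word here, which is exactly the transformation the markup form avoids.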

Certainly, it is interesting to see how a problem can be solved in a particular Forth system.
But it is far more interesting to see how a solution can be expressed in a way that any Forth system can support, independent of the threaded code model, the implementation of special-word handling, etc.

@rdrop-exit
Member

Unfortunately I'm not really familiar with current standard-compliant ways of doing such things, I can only show you how I would do it in my own Forth in the meantime.

My first instinct for a reusable low level word such as this would be to implement it as a primitive, in fact in my current Forth I have such a primitive, I named it 0ditch [edit: oops strike that, my 0ditch is different from your ?et, it exits on 0].

Perhaps your 0ditch is like my ?E0 word (exit on zero returning this zero)?

My 0ditch is equivalent to ?dup 0;, or ?dup 0= if exit then.
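For readers without such a primitive, the stated equivalence can be spelled as an immediate macro in standard Forth. This is only a sketch: the real 0ditch is described as a primitive in a private system, and safe-negate is a made-up caller.

```forth
\ Exit the calling definition on zero, discarding the zero;
\ otherwise continue with the value still on the stack.
: 0ditch ( 0 -- | x -- x )
  postpone ?dup postpone 0= postpone if postpone exit postpone then
; immediate

\ Hypothetical caller: negate a nonzero value, silently drop a zero.
: safe-negate ( 0 -- | n -- -n ) 0ditch negate ;

5 safe-negate .   \ prints -5
0 safe-negate     \ leaves nothing on the stack
```

Note the difference from ?E0 as described above: this version consumes the zero before exiting instead of returning it.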

My second instinct would be to just make a normal word and call it, this definition would work in my current Forth:
: ?et ( 0|x -- |x ) ( r: a -- a| ) ?dup if rdrop then ; compile-only
If for whatever reason I preferred the code to be inlined into another definition rather than called by another definition, I could do it this way:
: ?et ( 0|x -- |x ) ( r: a -- a| ) ?dup if exit then ;inline compile-only
I also have a primitive called ?; which is equivalent to if exit then so another way would be:
: ?et ( 0|x -- |x ) ( r: a -- a| ) ?dup ?; ;inline compile-only

Quite.

Just a side note: RDROP (and co.) may not work for control flow in some Forth systems. There was a proposal, "The Open Interpreter Word Set" by M. L. Gassanenko (1999), that describes a portable approach for Forth systems that can support manipulation of return addresses.

Thanks, but I don't intend to run my code on anything other than my own Forths.

I would guess that even in your Forth system, the first variant of ?et (which uses rdrop for control flow) cannot be used in definitions that may be inlined. Perhaps the same is true of the second and third (if you don't transform exit into a jump to the end).

The stack effect diagram is explicit about the return stack effect; it's up to the programmer to be mindful of whether that is in fact the stack effect he needs in a particular situation.

In-line expansion is a useful capability, but it does not solve the problem of code generation.

I'm not sure what the problem of code generation is in any practical sense. What is the pain caused by this problem?

What if we want to define unless ... then as an equivalent of 0= if ... then? Via postpone it can be defined as:

: unless postpone{ 0= if }postpone ; immediate

Certainly, it is interesting to see how a problem can be solved in a particular Forth system.
But it is far more interesting to see how a solution can be expressed in a way that any Forth system can support, independent of the threaded code model, the implementation of special-word handling, etc.

I already have solutions for my needs, I'm most interested in gauging the "portability penalty", i.e. overhead and implementation complexity, to comply with the standard proposals.

@GarthWilson

GarthWilson commented Sep 21, 2018 via email

@rdrop-exit
Member

rdrop-exit commented Sep 21, 2018

On 09/21/2018 11:36 AM, Mark W. Humphries wrote:
My 0ditch is equivalent to ?dup 0;, or ?dup 0= if exit then.

IF EXIT THEN is just ?EXIT
Similar: IF LEAVE THEN is just ?LEAVE
etc.

Yes, but usually implemented as a primitive rather than the equivalent high level Forth.
There are exceptions, e.g. ?do.

@ruv
Contributor Author

ruv commented Sep 21, 2018

In-line expansion is a useful capability, but it does not solve the problem of code generation.

I'm not sure what the problem of code generation is in any practical sense. What is the pain caused by this problem?

I mean a problem in the mathematical sense, not the everyday sense. That meaning is connected with conditions and questions, answers and solutions, not with pain.

The code generation problem stated above is the question of how to avoid having to transform a source code fragment that should be postponed. The postpone{ ... }postpone markup can solve this problem. The inline directive does not solve it, although it can be a solution in some cases.

Regarding "standard proposals": there should be a common basis for discussion.

@rdrop-exit
Member

In-line expansion is a useful capability, but it does not solve the problem of code generation.

I'm not sure what the problem of code generation is in any practical sense. What is the pain caused by this problem?

I mean a problem in the mathematical sense, not the everyday sense. That meaning is connected with conditions and questions, answers and solutions, not with pain.

The code generation problem stated above is the question of how to avoid having to transform a source code fragment that should be postponed. The postpone{ ... }postpone markup can solve this problem. The inline directive does not solve it, although it can be a solution in some cases.

If your goal is simply to have a version of postpone which takes a string as input instead of a single token, then postpone" ... " and/or " ... " postponed would be the usual Forth idioms.
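Neither postpone" nor postponed is a standard word, so their exact behaviour depends on the system. One related technique that does work in standard Forth is to EVALUATE a string from inside an immediate word. The sketch below shows that different technique; it is not offered as the definition of postponed. The string is interpreted when the macro is used, so it picks up whatever search order and BASE are in effect at that moment. The names unless-ev and pick-branch-ev are invented for the example.

```forth
\ When used inside a definition, interprets "0= if" in compilation state,
\ which compiles 0= and performs the compilation semantics of IF.
: unless-ev ( compile time: -- orig ) ( run time: x -- )
  s" 0= if" evaluate
; immediate

\ Hypothetical test word: 10 when n is zero, 20 otherwise.
: pick-branch-ev ( n -- m ) unless-ev 10 else 20 then ;

0 pick-branch-ev .   \ prints 10
5 pick-branch-ev .   \ prints 20
```

The source fragment stays readable as written, at the cost of being resolved late rather than at the point where the macro is defined.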

Regarding "standard proposals": there should be a common basis for discussion.

@cwpjr
Member

cwpjr commented Sep 29, 2018 via email

@phreda4
Member

phreda4 commented Sep 30, 2018

I don't have state at all.
I use prefixes like colorForth: hex is $ff, binary is %101; decimal and fixed point have no prefix.

@massung
Member

massung commented Sep 30, 2018

I use prefix like colorforth, hex are $ff, bin are %101...

Just noting that in every Forth I've ever used, I've modified binary mode to allow '.' characters in place of zeroes. I find it much easier on the eyes and quicker to see what the actual values are. This was especially nice when being used for mono-chromatic and sprites in memory, too.

%.111....
%11111...
%11111...
%.1.1....
%11.11...

So much easier on the eyes... ;-)
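A rough sketch of the conversion in standard Forth, for anyone who wants to try the notation without patching the number parser. The name dots>n is invented here, and the word takes an explicit string instead of hooking into the % prefix, which would be system-specific.

```forth
\ Convert a string of bit characters to a number, most significant bit first.
\ '1' counts as one; '.', '0' and any other character count as zero.
: dots>n ( c-addr u -- n )
  0 rot rot                      \ put the accumulator below the string
  over + swap ?do                \ loop over each character address
    2*                           \ shift the accumulator left by one bit
    i c@ [char] 1 = 1 and +      \ add 1 only when the character is '1'
  loop ;

s" .111...." dots>n .   \ prints 112
```

No validation is done in this sketch; a stricter version would reject characters other than '.', '0' and '1'.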

@phreda4
Member

phreda4 commented Sep 30, 2018 via email

@cwpjr
Member

cwpjr commented Oct 3, 2018 via email

@ruv
Contributor Author

ruv commented Oct 5, 2018

Is there anybody here who supports the idea of a common API for the resolvers mechanism?

Please use the following reactions on this post:

  • 👍 +1 — yes.
  • 😕 confused — don't know yet.
  • 👎 -1 — never but it bothers me nevertheless.
