FEAT: String interpolation #5085

hiiamboris · 2022-02-21T19:13:10Z

This PR is technically a part of our Format effort, but:

- it's not tied to anything exported by the Format module
- it's widely useful outside of Format's scope and should not require one to include the latter

Intro

String interpolation is used in most languages as a readable way of inserting values into a string.

Though the first question every one of us may have is "what, another rejoin?", if rejoin had been good enough for me or Gregg, this design would never have been born.

Here are just a few common examples written using the rejoin function:

avcmd: rejoin [{"} player {" "} vfile {" --audio-file "} afile {"}]
print rejoin ["Download OK: '" remote "' [" info/Content-Type ", " info/Content-Length " bytes]"]
do make error! rejoin [token1 " cannot follow " token2 " in the " name " part of " mold .mask]

Try to look at these expressions and visualize what the resulting string will look like, and whether I've got all the spaces and quotes right.
Not a human task, eh? Even telling the strings from the code requires syntax highlighting, quote counting or guessing.

These are equivalent expressions written via string interpolation:

avcmd: #rejoin {"(player)" "(vfile)" --audio-file "(afile)"}
#print "Download OK: '(remote)' [(info/Content-Type), (info/Content-Length) bytes]"
#error "(token1) cannot follow (token2) in the (name) part of (mold .mask)"

Pretty clear, right?

Another, even more solid reason for this design is translation. Whatever way we choose to translate our software, we have to give whole messages to the translator.

For more background info you can read the related Format discussion, the preliminary design document and the original sad-emoji dialect design, but I'll try to explain the main points here.

The list of use cases registered so far can be found here, but those are only mine. Maybe Gregg can find his own somewhere. Below that list is a description of the design implemented in this PR, which is a result of my experience with interpolation.

Why a macro?

Since the input is a string, we don't get word!s from it. Without words, nothing is bound during context and function creation. Then, the moment we extract words from the string, we can only bind them to the global scope or to an explicitly provided list of contexts. Gabriele expressed the concern very clearly.

A macro helps us avoid this trap. It gets expanded and produces words before those words are bound, resulting in code that just works.

We also get the benefit of speed: if the program is compiled, all macros are expanded at compile time.

The function approach is still provided for advanced users as it has its merit: a macro cannot work on a message that is generated at run time (it can, but it has no knowledge of the contexts to bind words to).
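
The binding trap described above can be reproduced in stock Red, without anything from this PR. This is a minimal sketch that assumes only the built-in `load`, `bind` and `get`; the names `name` and `greet` are made up for illustration:

```red
Red []

name: "global"

greet: function [] [
    name: "local"                      ; `function` makes this set-word a local
    reduce [
        get load "name"                ; word created at run time: bound to the global context
        get bind load "name" 'name     ; same word, explicitly rebound to the function's context
        name                           ; word present in the source: bound when the function was made
    ]
]

probe greet     ; ["global" "local" "local"]
```

The first result is what a purely string-based function would see by default; the second is what an explicitly supplied context buys; the third is what a macro gets for free, because its output is ordinary source code by the time the function is created.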

Syntax

Every paren inside the string becomes code:
#rejoin "(x) + (y) = (x + y)" -> rejoin ["" (x) " + " (y) " = " (x + y)]
Why parentheses? Because, as everywhere else in Red, they visually hint at evaluation.

To treat a paren literally, escape it with a backslash placed right after the opening paren:
#rejoin "total (n) hits (\ratio: (100% * n / total))" -> rejoin ["total " (n) " hits (ratio: " (100% * n / total) ")"]
This relies on backslash being invalid in Red, so if there's a plan to leverage \ then another escape sigil must be found.

Original string type is preserved (leading "" also serves this purpose):
#rejoin %"file-(n).(ext)" -> rejoin [%"file-" (n) "." (ext)]
#rejoin <x (y)=(z)> -> rejoin [<x > (y) "=" (z)]
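
Since `#rejoin` itself is not available without this PR, the right-hand sides of the expansions listed above can be tried directly in stock Red. A small sketch with made-up values for `x`, `y`, `n` and `ext`:

```red
Red []

x: 2
y: 3
n: 7
ext: "txt"

; expansion of  #rejoin "(x) + (y) = (x + y)"
print rejoin ["" (x) " + " (y) " = " (x + y)]     ; 2 + 3 = 5

; expansion of  #rejoin %"file-(n).(ext)"
probe rejoin [%"file-" (n) "." (ext)]             ; %file-7.txt
```

The second result is a file!, not a string!, because rejoin copies the type of the first series in the block; that is the mechanism the type-preservation rule relies on.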

Extensions

The Format module will also include a #format macro, backward-compatible with #rejoin but with two more features:
- it will convert (expression as mask) into (format (expression) mask)
- it will support per-message locale specification:
  #format/in "you've spent (spent as {$0.00}) of (spent + left as {$0.00})" locale
  will be preprocessed into
  #rejoin "you've spent (format/in (spent) {$0.00} locale) of (format/in (spent + left) {$0.00} locale)"
  This will require "Issues in paths are not lexed" (#5009) to be solved.

It is expected that users will write their own macros for common cases based on #rejoin, e.g.:

#macro [#log any-string!] func [[manual] s e] [insert remove s [log #rejoin] s]

#log "(now) test message: (1) + (2) = (1 + 2)"		;) `#log` now is synonymous with `log #rejoin`
#log "(now) test message: (2) * (3) = (2 * 3)"

Another extension idea concerns the #error macro. Right now the only fully custom error we have is the User Error. I propose adding another fully custom error into every error category, so we could produce Script, Math and other errors with custom messages. #error should then default to a Script error, with the #error/math, #error/syntax, #error/user etc. forms changing the error type.

For cases when the template is only known at run time (e.g. report generation with a user-defined template), the rejoin function is extended to accept an any-string! argument of exactly the same syntax as the macro:
- rejoin <img src=(url) size=(as-pair sizex sizey)> -> <img src=http://../image.png size=100x100>
- to overcome the binding issue, rejoin/with accepts one or more contexts to bind produced expressions before reducing them
- as a feature, rejoin/trap allows one to replace evaluation errors with some text (per Gregg's original design)
- /with and /trap only apply to the string case and have no effect on a block argument

"WISH: URLs to support parens () for string interpolation" (REP#112) should be considered to bring URL support to this design.
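
The idea behind binding to supplied contexts can be approximated in stock Red with plain `bind`. The sketch below does not use `rejoin/with` from this PR; it hand-writes the expanded block, loads it at run time so that its words start out global, and then rebinds it. The names `report` and `template` and the example URL are assumptions made for illustration:

```red
Red []

report: context [
    url:   http://example.com/image.png
    sizex: 100
    sizey: 100
]

; a hand-expanded template obtained at run time: its words are bound globally
template: load {["src=" (url) " size=" (as-pair sizex sizey)]}

; rebind to the supplied context, then reduce and join
print rejoin bind template report     ; src=http://example.com/image.png size=100x100
```

Words that the context does not define, such as `as-pair` here, keep their global binding, so ordinary functions remain callable from inside the template.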

P.S. I need some help reducing the docstrings :)

hiiamboris · 2024-01-25T17:42:14Z

How it could in theory work without a macro:

- Special lexer string-like syntax that gets transcoded as a block, e.g. `(x) + (y) = (x + y)` -> ["" (x) " + " (y) " = " (x + y)]

  Seems to me a much more complex solution than a macro.

- Introduce lexical scoping to the language, and let strings infer it so they can be expanded properly.

  An extremely complex solution with uncertain outcomes. But it may resolve the loops leaking words issue?

- Let the string expansion routine access the stack and automatically bind the result to the contexts it finds there. It will only bind to entered functions and make object contexts, ignoring arguments pushed to the stack.

  Seems simple enough. But it contradicts our definitional scoping model, where words are bound at entity creation time. The resulting block will be bound not where the string appears, but where it's expanded. Arguably, for strings this may be a desired outcome: e.g. we define a set of template strings somewhere, then fetch them by words and expand them in different places, automatically binding to different contexts. But it is still an inconsistency: e.g. if a function defined in an object uses such expansion, it will have access to function words but not to object words, because we have most likely left the make object scope by that time.

  This option will also need a function flag, akin to [no-trace], e.g. [no-bind], to tell the expand function to skip that function's context. E.g. a log wrapper that calls expand would want to hide its own context from being bound to. The same goes for a possible wrapper around the log wrapper, and so on.

hiiamboris added 3 commits February 21, 2022 21:49

FEAT: string interpolation facilities

ebd2328

TESTS: for string interpolation

5a1c3d9

FIX: minor docstring improv

36eeaa4

hiiamboris marked this pull request as draft February 21, 2022 19:13

hiiamboris requested review from dockimbel, greggirwin and qtxie February 21, 2022 19:14

FIX: R2 error handling

26f96f1

hiiamboris mentioned this pull request Oct 27, 2022

mold/form/rejoin/what-have-you: top-level overview red/REP#134

Open

hiiamboris mentioned this pull request Mar 6, 2023

WISH: VID to evaluate parens red/REP#141

Open

hiiamboris mentioned this pull request Dec 5, 2023

Preprocessor use cases compendium 📔 red/REP#156

Open
