/
escaped-strings.html
640 lines (572 loc) · 24.1 KB
/
escaped-strings.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
<html>
<head>
<title>Escaped Strings S\"</title>
<style type="text/css">
ol li { margin-left: -1em; }
.item { width: 5em; }
ul.results { margin-left: 0em; padding-left: 20px; }
ul.results li { margin-bottom: 1ex; }
ul.results li:first-line { font-weight: bold; }
</style>
</head>
<body>
<h3>Escaped Strings <code>S\"</code></h3>
<p>
[ <a href="rfds.html">RfDs/CfVs</a> | <a href="proposals.html">Other proposals</a> ]
</p>
<h3>Poll Standings</h3>
See <a href="#voting">below</a> for voting instructions.
<h4>Systems</h4>
<dl>
<dt> [ ] conforms to ANS Forth.</dt>
<dd> VFX Forth (Windows/DOS/Linux) (Stephen Pelc)
<br> SwiftForth and SwiftX (Leon Wagner)
<br> bigForth (Bernd Paysan)</dd>
<dt> [ ] already implements the proposal in full since release [ ].</dt>
<dd> VFX Forth (Windows/DOS/Linux) [since 3.80] (Stephen Pelc)
<br> SwiftForth and SwiftX [since 3.0] (Leon Wagner)</dd>
<dt> [ ] implements the proposal in full in a development version.</dt>
<dt> [ ] will implement the proposal in full in release [ ].</dt>
<dt> [ ] will implement the proposal in full in some future release.</dt>
<dd> Bernd Paysan</dd>
<dt> [ ] There are no plans to implement the proposal in full in [ ].</dt>
<dt> [ ] will never implement the proposal in full.</dt>
</dl>
<h4>Programmers</h4>
<dl>
<dt> [ ] I have used (parts of) this proposal in my programs.</dt>
<dd> Dick van Oudheusden
<br> Stephen Pelc
<br> Leon Wagner
<br> Graham Smith
</dd>
<dt> [ ] I would use (parts of) this proposal in my programs if the systems I am interested in implemented it.</dt>
<dd> Stephen Pelc
<br> Mark Wills
<br> Bernd Paysan
<br> Gerry Jackson
<br> David N. Williams
<br> Graham Smith
</dd>
<dt> [ ] I would use (parts of) this proposal in my programs if this proposal was in the Forth standard.</dt>
<dd> Stephen Pelc
<br> Mark Wills
<br> Gerry Jackson
<br> David N. Williams
<br> Graham Smith</dd>
<dt> [ ] I would not use (parts of) this proposal in my programs.</dt>
<dt> [ ] I couldn't care less or spoil ballot in computer readable form.</dt>
<dd> Jacko</dd>
</dl>
<h4>Informal Results</h4>
<ul class="results">
<li>Dick van Oudheusden<br/>
The <a href="http://code.google.com/p/ffl/">Forth Foundation Library</a> uses this proposal.<br/>
Missing functionality in this proposal would be:<br/>
<code>parse\"</code> ( "ccc<quote>" -- c-addr u = Parse the input
stream for a escaped string )
(see also 6.2.2008 PARSE)</li>
<li>Peter Fälth<br/>
I am still missing <code>\u</code> and <code>\U</code> to handle
unicode codepoints (4 or 8 digits) in a portablle way.</li>
<li>Graham Smith<br/>
I have more use for 'escaping' text strings at run time. So, the
words:
<blockquote>
ESCAPE (c-addr len -- c-addr' len') and<br/>
UNESCAPE (c-addr len -- c-addr' len')
</blockquote>
should be defined.
<p/>
These words could then form the basis of S\" and parse\".
<p/>
I have had to 're-invent' the escape mechanism leading to potential
consistencies in the way S\" and my ESCAPE works!</li>
</ul>
<table width="500"><tr><td>
<h3>Problem</h3>
The word <code>S"</code> 6.1.2165 is the primary word for generating strings.
In more complex applications, it suffers from several deficiencies:
<ol>
<li> the <code>S"</code> string can only contain printable characters,</li>
<li> the <code>S"</code> string cannot contain the '<code>"</code>' character,</li>
<li> the <code>S"</code> string cannot be used with wide characters as discussed
in the Forth 200x internationalisation and XCHAR proposals.</li>
</ol>
<h3>Current practice</h3>
At least SwiftForth, gForth and VFX Forth support <code>S\"</code> with very
similar operations. <code>S\"</code> behaves like <code>S"</code>, but uses the '<code>\</code>' character
as an escape character for the entry of characters that cannot be
used with <code>S"</code>.
<p>
This technique is widespread in languages other than Forth.
</p>
It has benefit in areas such as
<ol>
<li>construction of multiline strings for display by operating system services,</li>
<li>construction of HTTP headers,</li>
<li>generation of GSM modem and Telnet control strings.</li>
</ol>
<p>
The majority of current Forth systems contain code, either in the
kernel or in application code, that assumes char=byte=au. To avoid
breaking existing code, we have to live with this practice.
</p>
The following list describes what is currently available in the
surveyed Forth systems that support escaped strings.
<table>
<tr><td><code>\a</code></td><td>BEL</td> <td>(alert,</td> <td>ASCII 7)</td></tr>
<tr><td><code>\b</code></td><td>BS</td> <td>(backspace,</td> <td>ASCII 8)</td></tr>
<tr><td><code>\e</code></td><td>ESC</td> <td>(escape,</td> <td>ASCII 27)</td></tr>
<tr><td><code>\f</code></td><td>FF</td> <td>(form feed,</td> <td>ASCII 12)</td></tr>
<tr><td><code>\l</code></td><td>LF</td> <td>(line feed,</td> <td>ASCII 10)</td></tr>
<tr><td><code>\m</code></td><td>CR/LF</td> <td>pair</td><td>(ASCII 13, 10) - for HTML etc.</td></tr>
<tr><td><code>\n</code></td><td>newline</td><td colspan="2">CR/LF for Windows/Dos, LF for Unices</td></tr>
<tr><td><code>\q</code></td><td colspan="2">double-quote</td><td>(ASCII 34)</td></tr>
<tr><td><code>\r</code></td><td>CR</td> <td>(carriage return,</td> <td>ASCII 13)</td></tr>
<tr><td><code>\t</code></td><td>HT</td> <td>(horizontal tab,</td> <td>ASCII 9)</td></tr>
<tr><td><code>\v</code></td><td>VT</td> <td>(vertical tab,</td> <td>ASCII 11)</td></tr>
<tr><td><code>\z</code></td><td>NUL</td> <td>(no character,</td> <td>ASCII 0)</td></tr>
<tr><td><code>\"</code></td><td colspan="2">double-quote</td><td>(ASCII 34)</td></tr>
<tr valign="top"><td><code>\</code>[0-7]+</td><td colspan="3">Octal numerical character value, finishes at the first non-octal character</td></tr>
<tr valign="top"><td class=item><code>\x</code>[0-9a-f]+</td><td colspan="3">Hex numerical character value, finishes at the first non-hex character</td></tr>
<tr><td><code>\\</code></td><td colspan="2">backslash itself</td><td>(ASCII 92)</td></tr>
<tr><td><code>\</code></td><td colspan="3">before any other character represents that character</td></tr>
</table>
<h3>Considerations</h3>
We are trying to integrate several issues:
<ol>
<li>no/least code breakage</li>
<li>minimal standards changes</li>
<li>variable width character sets</li>
<li>small system functionality</li>
</ol>
<p>
Item 1) is about the common char=byte=au assumption.
</p>
Item 2) includes the use of <code>COUNT</code> to step through memory and the
impact of char in the file word sets.
<p>
Item 3) has to rationalise a fixed width serial/comms channel
with 1..4 byte characters, e.g. UTF-8
</p>
Item 4) should enable 16 bit systems to handle UTF-8 and UTF-32.
<p>
The basis of the current approach is to use the terminology of
primitive characters and extended characters. A primitive character
(called a <em>pchar</em> here) is a fixed-width unit handled by <code>EMIT</code> and
friends as well as <code>C@</code>, <code>C!</code> and friends. A <em>pchar</em> corresponds to the
current ANS definition of a character. Characters that may be
wider than a <em>pchar</em> are called "extended characters" or <em>xchars</em>.
The <em>xchars</em> are an integer multiple of <em>pchars</em>. An <em>xchar</em> consists
of one or more primitive characters and represents the encoding
for a "display unit". A string is represented by caddr/len
in terms of primitive characters.
</p>
The consequences of this are:
<ol>
<li>No existing code is broken.</li>
<li>Most systems have only one keyboard and only one screen/display
unit, but may have several additional comms channels. The
impact of a keyboard driver having to convert Chinese or Russian
characters into a (say) UTF-8 sequence is minimal compared to
handling the key stroke sequences. Similarly on display.</li>
<li>Comms channels and files work as expected.</li>
<li>16-bit embedded systems can handle all character widths as they
are described as strings.</li>
<li>No conflict arises with the XCHARs proposal.</li>
</ol>
Multiple encodings can be handled if they share a common primitive
character size - nearly all encodings are described in terms of
octets, e.g. TCP/IP, UTF-8, UTF-16, UTF-32, ...
<h3>Approach</h3>
This proposal does not require systems to handle <em>xchars</em>, and does
not disenfranchise those that do.
<p>
<code>S\"</code> is used like <code>S"</code> but treats the '<code>\</code>' character specially. One
or more characters after the '<code>\</code>' indicate what is substituted.
The following three of these cause parsing and readability
problems. As far as I know, requiring characters to come in
8 bit units will not upset any systems. Systems with characters
less than 7 bits are non-compliant, and I know of no 7 bit CPUs.
All current systems use character units of 8 bits or more.
</p>
Of observed current practice, the following two are problematic.
<table>
<tr valign="top"><td class="item"><code>\</code>[0-7]+</td>
<td>Octal numerical character value, finishes at the first non-octal character</td></tr>
<tr valign="top"><td class="item"><code>\x</code>[0-9a-f]+</td>
<td>Hex numerical character value, finishes at the first non-hex character</td></tr>
</table>
Why do we need two representations, both of variable length?
This proposal selects the hexadecimal representation, requiring
two hex digits. A consequence of this is that <em>xchars</em> must be
represented as a sequence of <em>pchars</em>. Although initially seen as a
problem by some people, it avoids at least the following problems:
<ol>
<li>Endian issues when transmitting an <em>xchar</em>, e.g. big-endian host
to little-endian comms channel</li>
<li>Issues when an <em>xchar</em> is larger than a cell, e.g. UTF-32 on
a 16 bit system.</li>
<li>Does not have problems in distinguishing the end of the
number from a following character such as '0' or 'A'.</li>
</ol>
At least one system (Gforth) already supports UTF-8 as its native
character set, and one system (JaxForth) used UTF-16. These systems
are not affected.
<table>
<tr valign="top"><td class="item"><code>\</code></td>
<td>before any other character represents that character</td></tr>
</table>
This is an unnecessary general case, and so is not mandated. By
making it an ambiguous condition, we do not disenfranchise
existing implementations, and leave the way open for future
extensions.
<p>
Note that now the number-prefix extension has been accepted, 3.4.1
Parsing contains a definition of <<em>hexdigit</em>> to be a case insensitive
hexadecimal digit [0-9a-fA-F].
</p>
<h3>Proposal</h3>
<pre>
6.2.xxxx S\" s-slash-quote CORE EXT
</pre>
<dl>
<dt>Interpretation:</dt>
<dd>Interpretation semantics for this word are undefined.<p/></dd>
<dt>Compilation: ( "ccc<<em>quote</em>>" -- )</dt>
<dd>
Parse <em>ccc</em> delimited by <code>"</code> (double-quote), using the translation
rules below. Append the run-time semantics given below to the
current definition.<p/>
</dd>
<dt>Translation rules:</dt>
<dd>Characters are processed one at a time and appended to the
compiled string. If the character is a '\' character it is
processed by parsing and substituting one or more characters
as follows, where the character after the backslash is case
sensitive:
<table>
<tr><td><code>\a</code></td> <td>BEL</td>
<td>(alert,</td> <td>ASCII 7)</td></tr>
<tr><td><code>\b</code></td> <td>BS</td>
<td>(backspace,</td> <td>ASCII 8)</td></tr>
<tr><td><code>\e</code></td> <td>ESC</td>
<td>(escape,</td> <td>ASCII 27)</td></tr>
<tr><td><code>\f</code></td> <td>FF</td>
<td>(form feed,</td> <td>ASCII 12)</td></tr>
<tr><td><code>\l</code></td> <td>LF</td>
<td>(line feed,</td> <td>ASCII 10)</td></tr>
<tr><td><code>\m</code></td> <td>CR/LF</td>
<td>pair</td> <td>(ASCII 13, 10)</td></tr>
<tr><td><code>\n</code></td> <td>newline</td>
<td colspan="2">(implementation dependent newline,</td></tr>
<tr><td colspan="2"></td><td colspan="2">eg, CR/LF, LF, or LF/CR)</td></tr>
<tr><td><code>\q</code></td>
<td colspan="2">double-quote</td> <td>(ASCII 34)</td></tr>
<tr><td><code>\r</code></td> <td>CR</td>
<td>(carriage return,</td> <td>ASCII 13)</td></tr>
<tr><td><code>\t</code></td> <td>HT</td>
<td>(horizontal tab,</td> <td>ASCII 9)</td></tr>
<tr><td><code>\v</code></td> <td>VT</td>
<td>(vertical tab,</td> <td>ASCII 11)</td></tr>
<tr><td><code>\z</code></td> <td>NUL</td>
<td>(no character,</td> <td>ASCII 0)</td></tr>
<tr><td><code>\"</code></td>
<td colspan="2">double-quote</td><td>(ASCII 34)</td></tr>
<tr><td colspan="4"><code>\x</code><<em>hexdigit</em>><<em>hexdigit</em>></td></tr>
<tr><td colspan="2"></td><td colspan="2">
The resulting character is the conversion of these two
hexadecimal digits. An ambiguous conditions exists if <code>\x</code>
is not followed by two hexadecimal characters.
</td></tr>
<tr><td><code>\\</code></td>
<td colspan="2">backslash itself</td><td>(ASCII 92)</td></tr>
<tr valign="top"><td colspan="2"><code>\</code></td><td colspan="2">
An ambiguous condition exists if a <code>\</code> is placed before any
character, other than those defined in 6.2.xxxx <code>S\"</code>.
</td></tr>
</table><p/>
</dd>
<dt>Run-time: ( -- c-addr u )</dt>
<dd>
Return c-addr and u describing a string consisting of the
translation of the characters <em>ccc</em>. A program shall not alter
the returned string.<p/>
</dd>
<dt>See:</dt>
<dd>3.4.1 Parsing, 6.2.0855 <code>C"</code>, 11.6.1.2165 <code>S"</code>, A.6.1.2165 <code>S"</code></dd>
</dl>
<h3>Labelling</h3>
Ambiguous conditions occur:
<ul>
<li>If <code>\x</code> is not followed by two hexadecimal characters.</li>
<li>If a <code>\</code> is placed before any character, other than those defined
in 6.2.xxxx <code>S\"</code>.</li>
</ul>
<h3>Reference Implementation</h3>
Taken from the VFX Forth source tree and modified to remove
implementation dependencies. This code assumes the system
is case insensitive.
<p/>
Another implementation (with some deviations) can be found at
<a href="http://b2.complang.tuwien.ac.at/cgi-bin/viewcvs.cgi/*checkout*/gforth/quotes.fs?root=gforth">the gforth course tree</a>.
</td></tr></table>
<pre><code>
decimal
: c+! \ c c-addr --
\ *G Add character C to the contents of address C-ADDR.
tuck c@ + swap c!
;
: addchar \ char string --
\ *G Add the character to the end of the counted string.
tuck count + c!
1 swap c+!
;
: append \ c-addr u $dest --
\ *G Add the string described by C-ADDR U to the counted string at
\ ** $DEST. The strings must not overlap.
>r
tuck r@ count + swap cmove \ add source to end
r> c+! \ add length to count
;
: extract2H \ c-addr len -- c-addr' len' u
\ *G Extract a two-digit hex number in the given base from the
\ ** start of the string, returning the remaining string
\ ** and the converted number.
base @ >r hex
0 0 2over drop 2 >number 2drop drop
>r 2 /string r>
r> base !
;
create EscapeTable \ -- addr
\ *G Table of translations for \a..\z.
7 c, \ \a BEL (Alert)
8 c, \ \b BS (Backspace)
char c c, \ \c
char d c, \ \d
27 c, \ \e ESC (Escape)
12 c, \ \f FF (Form feed)
char g c, \ \g
char h c, \ \h
char i c, \ \i
char j c, \ \j
char k c, \ \k
10 c, \ \l LF (Line feed)
char m c, \ \m
10 c, \ \n (Unices only)
char o c, \ \o
char p c, \ \p
char " c, \ \q " (Double quote)
13 c, \ \r CR (Carriage Return)
char s c, \ \s
9 c, \ \t HT (horizontal tab}
char u c, \ \u
11 c, \ \v VT (vertical tab)
char w c, \ \w
char x c, \ \x
char y c, \ \y
0 c, \ \z NUL (no character)
create CRLF$ \ -- addr ; CR/LF as counted string
2 c, 13 c, 10 c,
: addEscape \ c-addr len dest -- c-addr' len'
\ *G Add an escape sequence to the counted string at dest,
\ ** returning the remaining string.
over 0= \ zero length check
if drop exit then
>r \ -- caddr len ; R: -- dest
over c@ [char] x = if \ hex number?
1 /string extract2H r> addchar exit
then
over c@ [char] m = if \ CR/LF pair
1 /string 13 r@ addchar 10 r> addchar exit
then
over c@ [char] n = if \ CR/LF pair? (Windows/DOS only)
1 /string crlf$ count r> append exit
then
over c@ [char] a [char] z 1+ within if
over c@ [char] a - EscapeTable + c@ r> addchar
else
over c@ r> addchar
then
1 /string
;
: parse\" \ c-addr len dest -- c-addr' len'
\ *G Parses a string up to an unescaped '"', translating '\'
\ ** escapes to characters. The translated string is a
\ ** counted string at *\i{dest}.
\ ** The supported escapes (case sensitive) are:
\ *D \a BEL (alert)
\ *D \b BS (backspace)
\ *D \e ESC (not in C99)
\ *D \f FF (form feed)
\ *D \l LF (ASCII 10)
\ *D \m CR/LF pair - for HTML etc.
\ *D \n newline - CRLF for Windows/DOS, LF for Unices
\ *D \q double-quote
\ *D \r CR (ASCII 13)
\ *D \t HT (tab)
\ *D \v VT
\ *D \z NUL (ASCII 0)
\ *D \" double-quote
\ *D \xAB Two char Hex numerical character value
\ *D \\ backslash itself
\ *D \ before any other character represents that character
dup >r 0 swap c! \ zero destination
begin \ -- caddr len ; R: -- dest
dup
while
over c@ [char] " <> \ check for terminator
while
over c@ [char] \ = if \ deal with escapes
1 /string r@ addEscape
else \ normal character
over c@ r@ addchar 1 /string
then
repeat then
dup \ step over terminating "
if 1 /string then
r> drop
;
create pocket \ -- addr
\ *G A tempory buffer to hold processed string.
\ This would normally be an internal system buffer.
s" /COUNTED-STRING" environment? 0= [if] 256 [then]
1 chars + allot
: readEscaped \ "ccc<quote>" -- c-addr
\ *G Parses an escaped string from the input stream according to
\ ** the rules of *\fo{parse\"} above, returning the address
\ ** of the translated counted string in *\fo{POCKET}.
source >in @ /string tuck \ -- len caddr len
pocket parse\" nip
- >in +!
pocket
;
: S\" \ "string" -- caddr u
\ *G As *\fo{S"}, but translates escaped characters using
\ ** *\fo{parse\"} above.
readEscaped count state @
if postpone sliteral then
; IMMEDIATE
</code></pre>
<h3>Test Cases</h3>
<pre><code>
HEX
T{ : GC5 S\" \a\b\e\f\l\m\q\r\t\v\x0F0\x1Fa\xaBx\z\"\\" ; -> }
T{ GC5 SWAP DROP -> 14 }T \ String length
T{ GC5 DROP C@ -> 07 }T \ \a BEL Bell
T{ GC5 DROP 1 CHARS + C@ -> 08 }T \ \b BS Backspace
T{ GC5 DROP 2 CHARS + C@ -> 1B }T \ \e ESC Escape
T{ GC5 DROP 3 CHARS + C@ -> 0C }T \ \f FF Form feed
T{ GC5 DROP 4 CHARS + C@ -> 0A }T \ \l LF Line feed
T{ GC5 DROP 5 CHARS + C@ -> 0D }T \ \m CR of CR/LF pair
T{ GC5 DROP 6 CHARS + C@ -> 0A }T \ LF of CR/LF pair
T{ GC5 DROP 7 CHARS + C@ -> 22 }T \ \q " Double Quote
T{ GC5 DROP 8 CHARS + C@ -> 0D }T \ \r CR Carriage Return
T{ GC5 DROP 9 CHARS + C@ -> 09 }T \ \t TAB Horizontal Tab
T{ GC5 DROP A CHARS + C@ -> 0B }T \ \v VT Vertical Tab
T{ GC5 DROP B CHARS + C@ -> 0F }T \ \x0F Given Char
T{ GC5 DROP C CHARS + C@ -> 30 }T \ 0 0 Digit follow on
T{ GC5 DROP D CHARS + C@ -> 1F }T \ \x1F Given Char
T{ GC5 DROP E CHARS + C@ -> 61 }T \ a a Hex follow on
T{ GC5 DROP F CHARS + C@ -> AB }T \ \xaB Insensitive Given Char
T{ GC5 DROP 10 CHARS + C@ -> 78 }T \ x x Non hex follow on
T{ GC5 DROP 11 CHARS + C@ -> 00 }T \ \z NUL No Character
T{ GC5 DROP 12 CHARS + C@ -> 22 }T \ \" " Double Quote
T{ GC5 DROP 13 CHARS + C@ -> 5C }T \ \\ \ Back Slash
</code></pre>
Note this does not test <code>\n</code> as this is a system dependent value.
<h3>Change History</h3>
<table width="500"><tr valign="top">
<td>2010-03-27</td><td>6.3</dt>
<td>Revised Reference Implementation removing endif and assumption on the length of a counted string</td>
</tr><tr valign="top">
<td>2009-03-31</td><td>6.2</dt>
<td>Revised Reference Implementation taking into account the fact that no standard word may use the PAD</td>
</tr><tr valign="top">
<td>2008-11-23</td><td>6.1</dt>
<td>Replaced description of <code>\"</code> (now the same as for <code>\q</code>).<br/>
Replaced the test cases with tests that do not assume the word can be used in interpretation mode.
In keeping with the definition</td>
</tr><tr valign="top">
<td>2007-10-30</td><td>6</dt>
<td>Clarification of case sensitivity:
Escape character is case sensitive,
Hex digits are not.</td>
</tr><tr valign="top">
<td>2007-09-13</td><td>5</dt>
<td>Added clarifications</td>
</tr><tr valign="top">
<td>2007-07-19</td><td>4</dt>
<td>Modified ambiguous condition.<br/>
Added ambiguous conditions to definition of <code>S\"</code>.<br/>
Added test cases.<br/>
Corrected Reference Implementation.</td>
</tr><tr valign="top">
<td>2007-07-12</td><td>3</dt>
<td>Redrafted non-normative portions</td>
</tr><tr valign="top">
<td>2006-08-22</td><td>2</dt>
<td>Updated solution section</td>
</tr><tr valign="top">
<td>2006-08-21</td><td>1</dt>
<td>First draft</td>
</tr></table>
<h3>Credits</h3>
Stephen Pelc, <stephen<em>@</em>mpeforth<em>.</em>com><br/>
MicroProcessor Engineering Ltd - More Real, Less Time<br/>
133 Hill Lane, Southampton SO15 5AF, England<br/>
tel: +44 (0)23 8063 1441,<br/>
fax: +44 (0)23 8033 9691<br/>
web: <a href="http://www.mpeforth.com">www.mpeforth.com</a> - free VFX Forth downloads
<p>
Peter Knaggs <pjk<em>@</em>bcs<em>.</em>org<em>.</em>uk><br/>
Engineering, Mathematics and Physical Sciences,<br/>
University of Exeter, Exeter, Devon EX4 7QF, England<br/>
web: <a href="http://www.rigwit.co.uk">www.rigwit.co.uk</a>
</p>
<a name="voting"><h3>Voting Instructions</h3></a>
Fill out the appropriate ballot(s) below and mail it/them to
<vote@forth200x.org>. Your vote will be published (including your
name (without email address) and/or the name of your system) here.
You can vote (or change your vote) at any time, and the results will
be published here.
<p>
Note that you can be both a system implementor and a programmer, so
you can submit both kinds of ballots.
</p>
<h4>Ballot for systems</h4>
If you maintain several systems, please mention the systems separately
in the ballot. Insert the system name or version between the brackets.
Multiple hits for the same system are possible (if they do not
conflict).
<dl>
<dt>[ ] conforms to ANS Forth. </dt>
<dt>[ ] already implements the proposal in full since release [ ]. </dt>
<dt>[ ] implements the proposal in full in a development version. </dt>
<dt>[ ] will implement the proposal in full in release [ ]. </dt>
<dt>[ ] will implement the proposal in full in some future release. </dt>
<dt>[ ] There are no plans to implement the proposal in full in [ ]. </dt>
<dt>[ ] will never implement the proposal in full. </dt>
</dl>
If you want to provide information on partial implementation, please do
so informally, and I will aggregate this information in some way.
<h4>Ballot for programmers</h4>
Just mark the statements that are correct for you (e.g., by putting an
"x" between the brackets). If some statements are true for some of your
programs, but not others, please mark the statements for the dominating
class of programs you write.
<dl>
<dt>[ ] I have used (parts of) this proposal in my programs.</dt>
<dt>[ ] I would use (parts of) this proposal in my programs if the systems
I am interested in implemented it.</dt>
<dt>[ ] I would use (parts of) this proposal in my programs if this
proposal was in the Forth standard.</dt>
<dt>[ ] I would not use (parts of) this proposal in my programs.</dt>
</dl>
If you feel that there is closely related functionality missing from the
proposal (especially if you have used that in your programs), make an
informal comment, and I will collect these, too. Note that the best time
to voice such issues is the RfD stage.
</body>
</html>