Alternative string quote syntax #84

Open
mohkale opened this issue Sep 15, 2020 · 7 comments

Comments

@mohkale

mohkale commented Sep 15, 2020

Evidently there's been a discussion about this in Clojure (almost 6 years ago), but there wasn't an issue for it in the edn spec, so here's one.

So far the only syntax for strings uses double quotes (") as delimiters. This makes writing strings that contain " a chore because you have to escape every occurrence. It also makes it harder to tell where one value ends and the next begins, because no whitespace is needed to separate values.

(
  ;; needs escaping
  {:cmd "echo \"foo ${bar}\""}

  ;; There's three values in the map below, but can you tell where the second
  ;; one finishes?
  {:cmd "foo bar \"baz\"""\"bag\" bam boom"}

  ;; What if the string ends up spanning multiple lines?
  {:cmd "conf_file=\"c:/tools/msys64/msys2_shell.cmd\"
         if ! [ -f \"$conf_file\" ]; then
           echo 'failed to set PATH inheritance, conf file does not exist' >&2
           exit 1
         fi
  
         sed -i -e 's/rem \\(set MSYS2_PATH_TYPE=inherit\\)/\\1/' \"$conf_file\""}
)

Suffice it to say, escaping quotes hurts readability when quotes are used in abundance. This also affects JSON. I recently started using edn to configure my dotfiles, so I've been writing quite a bit of shell script in edn, and this is the sort of stuff I keep having to deal with.

The discussion I've linked to above described 3 alternatives. Hopefully this issue opens a dialogue and gets the ball rolling on how best to tackle this problem.

Personally I quite like the triple quote python approach.
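
For reference, a minimal Python sketch of what the triple-quote form looks like in practice, using the first command from the examples above. The variable names are only illustrative, and nothing here is proposed edn syntax.

```python
# The same shell command written with escaped quotes and with triple quotes.
escaped = "echo \"foo ${bar}\""

# A triple-quoted string cannot end in its own quote character without an
# escape, so the '''...''' form is used here because the text ends in ".
triple = '''echo "foo ${bar}"'''

# A multi-line script keeps its inner double quotes unescaped.
script = '''conf_file="c:/tools/msys64/msys2_shell.cmd"
if ! [ -f "$conf_file" ]; then
  exit 1
fi'''

assert escaped == triple
```

The wart in the second assignment (text ending in the delimiter character) is worth keeping in mind if edn adopts a similar form with only one triple-quote delimiter.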

@zilti

zilti commented Sep 18, 2020

Further inspiration: Chicken Scheme multiline strings https://wiki.call-cc.org/man/4/Non-standard%20read%20syntax#multiline-string-constant

@mohkale
Author

mohkale commented Sep 19, 2020

An example of the Chicken Scheme syntax:

(define msg #<<END
 "Hello, world!", she said.
END
)

Personally I've never liked how the closer for a heredoc has to span the whole line. You end up having to push any closing brackets or other constructs to the line after, and it always looks unnatural to me 😞.

@zilti

zilti commented Sep 22, 2020

I mean, you could theoretically specify that the closer does not have to span the whole line, of course. "The closer is everything up to the first closing parenthesis" or something.
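
A rough Python sketch of one such relaxed rule, assuming the closer only has to appear at the start of a line and that whatever follows it on that line is handed back to the reader. The function name `read_heredoc` and the `#<<TAG` header handling are hypothetical, loosely modelled on the Chicken Scheme form, and are not part of any spec.

```python
def read_heredoc(source: str) -> tuple[str, str]:
    """Split '#<<TAG\\n...' into (body, remainder after the closing TAG)."""
    header, _, rest = source.partition("\n")
    if not header.startswith("#<<") or len(header) == 3:
        raise ValueError("expected '#<<TAG' on the first line")
    tag = header[3:]

    body = []
    lines = rest.split("\n")
    for i, line in enumerate(lines):
        # Relaxed rule: the closer only needs to start the line.
        # Caveat: a body line that happens to start with TAG ends the string.
        if line.startswith(tag):
            remainder = "\n".join([line[len(tag):]] + lines[i + 1:])
            return "\n".join(body), remainder
        body.append(line)
    raise ValueError("unterminated heredoc")


body, remainder = read_heredoc('#<<END\n "Hello, world!", she said.\nEND)\n')
```

With this rule the closing parenthesis can sit on the same line as `END` instead of being pushed to the next line.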

@mohkale
Author

mohkale commented Sep 22, 2020

@zilti You're correct. Strangely enough, I've rarely encountered a language that allows that. Neither Bash nor Ruby seems to allow it; PHP does, however. I see no issue with heredocs if we're going down that route.

@zilti

zilti commented Sep 22, 2020

True. And I guess, in a way, XML's CDATA section takes this one step further, at the cost of not letting you customize the closer: it opens with <![CDATA[ and closes with ]]>, no matter where on a line that is or how much other stuff there is on that line.

@xpe

xpe commented Jul 11, 2021

After spending considerable time comparing string literal syntaxes across languages, here are my assessments:

  1. Rust's string literals strike a nice balance between (a) ease-of-mechanical-parsing and (b) human-readability.

  2. YAML's many kinds of string literals result in a format that is (a) difficult for machine parsing and (b) painfully complex for humans to use much less remember. As a result, the YAML spec is unnecessarily complicated, resulting in implementations of varying quality. Please, learn from YAML's choices here -- they are a cautionary tale.

Note: I could have phrased my comment more neutrally, but that would be hiding my bias. (My bias may or may not be useful to you, so interpret it accordingly.) My assessment is informed by wrestling with tradeoffs in this space while designing a new human-readable interchange format.

@djhaskin987

I would like to chime in with an idea I have recently had.

A multiline string might begin with a backslash character, followed by optional blank space and then a line break. Each subsequent line would have some blank space at the beginning, then a backslash character. The rest of the line, including the newline, would be part of the multiline string. The first line that doesn't start with a backslash character is not included, and parsing continues as normal. The first and the last newline characters in the multiline string are always removed.

Examples:

{
  :foo \
          \Bar
          \Baz
          \
  :Quxx "hi"
}

Becomes

{
  :foo "Bar\nBaz\n"
  :Quxx "hi"
}
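
To make the rule concrete, here is a small Python sketch of how a reader might collect such a block. It assumes the reader has already consumed the opening line (so the "first newline" is the one ending that line and never enters the result); the function name `read_block` is hypothetical.

```python
def read_block(lines: list[str]) -> tuple[str, int]:
    """Collect marker-prefixed lines; return (text, number of lines consumed)."""
    parts = []
    consumed = 0
    for line in lines:
        stripped = line.lstrip(" \t")
        # The first line without the leading backslash ends the block
        # and is left for the normal reader.
        if not stripped.startswith("\\"):
            break
        parts.append(stripped[1:] + "\n")
        consumed += 1
    text = "".join(parts)
    # The last newline is always removed.
    if text.endswith("\n"):
        text = text[:-1]
    return text, consumed


# Lines that follow the opening ":foo \" line of the example.
following = ["          \\Bar", "          \\Baz", "          \\", '  :Quxx "hi"']
text, consumed = read_block(following)
```

Here `text` comes out as the string with content `Bar`, newline, `Baz`, newline, and the `:Quxx` line is left untouched for the normal reader.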

If alternative interpretations of the multiline string are needed, they can simply be tag dispatches. For example, #prose before a multiline string might result in a multiline string with all of the newlines folded into spaces, like YAML's >.
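
A tiny Python sketch of that dispatch idea, assuming the block has already been read into a plain string. The handler table and the names `fold_newlines` and `apply_tag` are hypothetical, and the folding is only a rough stand-in for YAML's >, whose actual rules are more involved.

```python
def fold_newlines(text: str) -> str:
    """Replace every newline with a single space."""
    return text.replace("\n", " ")


# Hypothetical mapping from tag name to post-processing function.
HANDLERS = {"prose": fold_newlines}


def apply_tag(tag, text: str) -> str:
    """Apply the handler registered for `tag`; unknown or missing tags pass through."""
    handler = HANDLERS.get(tag)
    return text if handler is None else handler(text)


folded = apply_tag("prose", "Bar\nBaz")
untouched = apply_tag(None, "Bar\nBaz")
```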

This makes several different types of multiline strings possible with a simple syntax that is easy to understand and remember.

It also allows you to embed one document inside another without escaping anything inside that document. All you have to do is prefix every line in the document with some space and a backslash. The ability to embed documents inside other documents is a killer feature for configuration languages. Being able to add it to edn this simply would be amazing.


4 participants