Proposal: explicit syntax for custom tags #240
Comments
Djot was beyond Markdown, keeping its legacy:
This proposal opens Pandora's box:
And I wonder what a LaTeX renderer (say, or SILE) would then do. Have to support I am afraid the problems supposedly solved might be worse. Or did I miss something? |
There isn't any special handling of tags. For example, the SILE renderer would do exactly what SILE XML Flavor would do, namely, interpret the document as
This might or might not produce a valid SILE document, depending on which custom SILE commands the user has defined. Stated positively, the user gains access to all their pre-existing custom SILE commands without having to define custom Djot renderers or filters. So, if the user has
defined, they can use
in their djot |
Well I am afraid I have to disagree on everything then...
(I don't think this is the place to discuss the SILE examples, but the SIL language should be completely avoidable, and the user shouldn't need custom commands to do this kind of thing. Styles are a better paradigm with a nicer separation of concerns) |
Still, an additional comment though:
Would the user really want to do this? With markdown.sile they indeed don't need to define custom Djot renderers or filters. The following works:
And all other things equal, it works identically whether the input is a Markdown file, a Djot file, or a Pandoc JSON AST.
|
This is an interesting and well thought-out proposal. It does go in a somewhat different direction than I'd originally had in mind, but I see its good points. The original conception was that if you wanted to do something like
and then make use of a filter that replaces this with AST nodes including the raw HTML. This proposal would allow you to do
which is a bit more verbose and relies more on English keywords, but it would work out of the box without filters. The proposed change would be breaking for existing djot documents that used
but maybe that is okay as the language is still in an experimental phase.
The proposed change would make the djot AST less compatible with the pandoc AST (which doesn't have a notion of "tag name"), and this would make pandoc interoperability less smooth.
In general I don't like to rely on English language keywords. Perhaps one could work around that, though, by introducing the concept of a "tag dictionary" that allows you to define your own aliases for tag names?
If we did implement the prefix
You are right that allowing a special name for spans restores symmetry with what we now have for divs. However, there's also a question of symmetry with verbatim containers (code spans and code blocks). For example, in LaTeX you might want
to produce a As for syntax, I fear that the tag name in |
Yeah, that's the big thing here! One can view Djot as either:
This proposal pushes us more towards the second interpretation (but note that they are not mutually exclusive: some people may use djot as 1, and some might use it as 2). As you've rightfully noticed, everything expressible with this proposal is already possible with custom attributes and classes; the "custom tags" thing just basically formalizes this pattern. And that nicely segues into @jgm's first point! Even under this proposal I would expect people to write
and handle this as a filter by default. The "raw html" mode I think is needed solely as an escape hatch. However, under the new proposal it's syntactically apparent that That's probably what I like aesthetically most here: that we clearly separate the "semantics" attribute from the style ones (including adding the invariant that there's at most one custom tag, but many classes).
I was under the impression that we already don't restrict class names and such to be English, but apparently that's not the case. It feels a bit strange that the following is parsed differently
I would say if we are fine with class names being English, we should be fine with tag-names being English also (but it might be a good idea to include some quoted syntax then just in case, eg
FWIW, this is something that worries me quite a bit. The page https://djot.net doesn't say that Djot is in an experimental phase, and makes it look like it's quite finished. Ideally, we'd be clearer in communicating our stability promise.
Yeah, I think syntactically the salient bits are that:
As for particular syntax, |
I don't think there was any intention to exclude non-English class names! If we do, it seems like a bug. The attribute grammar in attributes.ts does say that keywords need to be ASCII, but not classes or identifiers. |
See also #197 and #192 where I proposed another use for I'm thus all for storing these "tags" specially in the AST. What worries me is that this proposal seems very HTML-centric for such a "central" syntax feature. I think it is important that djot is output-format agnostic, not favoring any one output format. While I do not yet use djot for real (the lack of a syntax for metadata, and for other data in the spirit of #192, that is interoperable with Pandoc is the main show stopper for me), I really like most of the syntax features where djot differs from or adds to Markdown, but my typical target format is PDF via LaTeX. If this means "tags" are stored separately in the AST and can be used for anything by parsers, filters and renderers, I'm all for it. If this means that "tags" become unusable unless you target HTML/XML, or even that djot gets tied to those formats, I'm actually worried! |
As a data point, someone laments the inability to create HTML/djot sandwiches without writing custom filters: |
This proposal is a synthesis of #239 and #146 and organized in TL;DR, What? and Why? sections, where the Why? is the most important.
TL;DR
Change djot such that the following input:
produces the following HTML:
What?
Specifically:
- Change the parsing rule for `::: spam` to use `"spam"` for `tag_name`, rather than a class.
- Change the parsing rules for bare `:::` and `[]` to set `tag_name` to `"div"` and `"span"`, respectively.
- Add new concrete syntax `:tag-name[]`, that is, `:(\S+)\[` where `$1`, an arbitrary sequence of non-whitespace symbols, is a `tag_name`, and the rest is the usual span syntax. This concrete syntax produces a `Span` AST with the corresponding `tag_name` set.
- Change the default HTML renderer to use `tag_name` when rendering `span` and `div` elements.
The most invasive change here is the new `:tag-name[]` syntax, as it adds a bit of new syntax to djot and directly enlarges the surface area.
Why?
This single solution fixes several "problems" in the current version of djot, some big and some small. I list them roughly in order of priority:
Problem: users need a lightweight approach for producing custom HTML interspersed with normal djot.
Today, djot provides a `` ``` =HTML `` syntax to embed raw HTML (or any other format). The problem here is that it's all-or-nothing: everything inside `=HTML` needs to be HTML. You can't use that to wrap a part of a djot document into a custom tag:
This is solvable by using a custom filter/renderer, but that's a significant step up in complexity, and might not be available to the user (e.g., forum software using Djot for comments could allow raw HTML (with sanitization), but won't allow custom filters). In a more ad-hoc way, it's possible to split the raw block in two
but that's not quite as pretty as some might want!
With the proposed solution, the above can be written simply as
Naturally, `=HTML` doesn't go away: that's still the right tool for raw HTML, but we now gain a way to add HTML-Djot sandwiches.
Note that while I say `HTML`, this feature applies to any roughly XML-shaped output format. For example, a docbook renderer could use that to emit arbitrary docbook elements, and a LaTeX renderer could emit a pair.
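To make the renderer side of this concrete, here is a minimal TypeScript sketch of an HTML renderer that honors a per-node tag name. The `AstNode` shape, the `tagName` field, and `renderHtml` are hypothetical names chosen for illustration and are not djot.js's actual AST or API; the tag-name check is likewise an assumption about what a cautious renderer might enforce.

```typescript
// Hypothetical, simplified AST: real djot nodes carry more fields.
type AstNode =
  | { kind: "text"; text: string }
  | { kind: "element"; tagName: string; classes: string[]; children: AstNode[] };

// Escape the characters that are significant in HTML text and attributes.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render a node, using its tag name instead of a hard-coded div/span.
function renderHtml(node: AstNode): string {
  if (node.kind === "text") {
    return escapeHtml(node.text);
  }
  // Assumption: only plain element names are emitted; anything else is rejected.
  if (!/^[A-Za-z][A-Za-z0-9-]*$/.test(node.tagName)) {
    throw new Error("unsupported tag name: " + node.tagName);
  }
  const cls =
    node.classes.length > 0
      ? ` class="${escapeHtml(node.classes.join(" "))}"`
      : "";
  const inner = node.children.map(renderHtml).join("");
  return `<${node.tagName}${cls}>${inner}</${node.tagName}>`;
}

const example: AstNode = {
  kind: "element",
  tagName: "details",
  classes: ["note"],
  children: [{ kind: "text", text: "Hi & bye" }],
};
console.log(renderHtml(example)); // <details class="note">Hi &amp; bye</details>
```

The children stay ordinary djot content, so only the wrapper element is custom. A renderer for another output format could map the same `tagName` to whatever construct suits that format.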
Problem: extensibility properties of Djot are not obvious and need better explanation.
The core feature of Djot is that its syntax is fixed, but it is still extensible because the syntax is flexible enough to encode arbitrary attributed trees which could be interpreted specially by the renderers. This is a somewhat subtle and non-obvious point, and may not be immediately clear to new users.
With this proposal, Djot gains an explicit first-class syntax for custom elements. We can clearly document that `::: plugin` and `:plugin[]` is how one extends Djot. In terms of expressive power, this is exactly equivalent to `[]{.plugin}` of course, but is easier to explain and search for.
Overloading `.class` syntax to mean custom tags/elements is harder to teach.
Problem: it's impossible to express arbitrary HTML in a Djot filter.
Djot has two programmatic extensibility mechanisms:
Filters are generally nicer, they are target-format-independent and composable (you can chain several filters together, because input and output have the same type). However, you can't use a filter to emit an HTML node not already used by a renderer, unless you resort to raw half-nodes, which is ugly, and output-format specific.
With this proposal, filters gain the full power of HTML, while keeping a nice, well-typed tree structure. Fewer things need to be custom renderers, more things can be filters.
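As an illustration of a filter that stays tree-shaped, the following TypeScript sketch rewrites a custom element into nested elements without emitting any raw HTML. The `AstNode` shape, the `tagName` field, the `spoiler` element, and `expandSpoilers` are all hypothetical; this does not use djot.js's actual filter API.

```typescript
// Hypothetical, simplified AST with a tag name on every element.
type AstNode =
  | { kind: "text"; text: string }
  | { kind: "element"; tagName: string; children: AstNode[] };

// Tree-to-tree filter: every `spoiler` element becomes a `details` element
// whose first child is a `summary`. Input and output have the same type,
// so filters of this shape can be chained.
function expandSpoilers(node: AstNode): AstNode {
  if (node.kind === "text") {
    return node;
  }
  const children = node.children.map(expandSpoilers);
  if (node.tagName !== "spoiler") {
    return { ...node, children };
  }
  const summary: AstNode = {
    kind: "element",
    tagName: "summary",
    children: [{ kind: "text", text: "Spoiler" }],
  };
  return { kind: "element", tagName: "details", children: [summary, ...children] };
}
```

Because the result is still a tree of named elements, a renderer for any target format can decide how to display `details` and `summary`, which is not possible once a filter has fallen back to raw half-nodes.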
Problem: the `::: spam` syntax is not orthogonal
In today's Djot, the following two are equivalent:
In the following example, both classes are on equal footing semantically, although syntactically one feels like it should be the primary:
The proposal makes the syntaxes orthogonal by adding a new dimension. `::: spam` is no longer a class, it is a tag name.
Problem: when reading custom elements, the existing "introducer last" syntax requires the reader to backtrack.
Consider a custom element in today's djot: `[Ctrl+C]{.kbd}`. Here, the `+` would be interpreted specially by the renderer as a notation for shortcuts. However, if you read this left-to-right, you need to look ahead to `{.kbd}` to get the context for interpreting the `+`.
In the proposal, this looks like `:kbd[Ctrl+C]`: the introducer keyword, `kbd`, is leading, so a one-pass left-to-right visual scan tells you everything.
Problem: smarter editors and IDEs need to know context to provide helpful suggestions.
Let's say you added a custom citation element to Djot, which looks like `[foo, p. 15]{.cite}`. A smart editor should be able to auto-complete `foo` from your references library, but, if you are typing this left-to-right, by the time you get to `[foo]` the IDE doesn't yet know that it's going to be a cite.
With the proposal, as soon as you've typed `:c`, the IDE can suggest auto-completing that to `:cite[]` and then show a completion list for actual citations.
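The completion flow described above could look roughly like this TypeScript sketch. `suggestTags`, its parameters, and the rule that an introducer must start the text or follow whitespace are assumptions made for illustration; they are not part of the proposal or of any existing editor plugin.

```typescript
// Returns completions when the cursor sits right after a `:prefix` introducer,
// and an empty list otherwise. `knownTags` is a hypothetical list of tag names
// the editor has learned from plugins or configuration.
function suggestTags(textBeforeCursor: string, knownTags: string[]): string[] {
  // Assumption: the introducer starts the text or follows whitespace, and the
  // partial tag name runs up to the cursor without whitespace or `[`.
  const m = /(?:^|\s):([^\s\[]*)$/.exec(textBeforeCursor);
  if (m === null) {
    return [];
  }
  const prefix = m[1] ?? "";
  return knownTags
    .filter((tag) => tag.slice(0, prefix.length) === prefix)
    .map((tag) => `:${tag}[]`);
}

console.log(suggestTags("See :c", ["cite", "kbd", "code"])); // [ ':cite[]', ':code[]' ]
```

With the trailing `{.cite}` form, the same function has nothing to match while the user is still typing `[foo`, which is the asymmetry this section points out.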