Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grammar rule won't match block-attribute-list that contains an inline macro #767

Open
andrewcarver opened this issue Jul 25, 2023 · 4 comments · May be fixed by #768
Open

Grammar rule won't match block-attribute-list that contains an inline macro #767

andrewcarver opened this issue Jul 25, 2023 · 4 comments · May be fixed by #768
Labels

Comments

@andrewcarver
Copy link

There is an unaddressed problem in the TextMate grammar that was brought over from Atom in Feb. 2020 (see file history for the grammar-file, Asciidoctor.json)).

As it stands, the regex in the #quote-paragraph rule will not match any block-attribute-list that contains, itself, an inline macro -- such as a URL macro or a bibliographic-citation macro. Downstream from that rule, there are weaknesses also in the #block-attribute-inner rule. The regex-fixes I'm going to show in a PR here (in just a minute) are the upshot of an earlier discussion of the whole matter, still viewable in Atom PR 197 -- where also there are screen-shots and syntax-examples; please see this earlier discussion for details.

(Sadly, that discussion was started almost 2 years after the grammar was ported here from Atom... Too bad I was barking up the wrong tree... Atom instead of VS Code! :-|)

This problem remains as of v. 3.1.3. Implementing them in that version, I've tested these fixes that I'm going to submit in my PR. The system information is:

Version: 1.79.2 (system setup)
Commit: 695af097c7bd098fbf017ce3ac85e09bbc5dda06
Date: 2023-06-14T08:57:04.379Z
Electron: 22.5.7
Chromium: 108.0.5359.215
Node.js: 16.17.1
V8: 10.8.168.25-electron.0
OS: Windows_NT x64 10.0.19045

@andrewcarver
Copy link
Author

andrewcarver commented Jul 27, 2023

I can put both the problem that this fixes, and how it fixes it, in a nutshell:

The original regexes did not support block-attribute-list's that contain, themselves, an inline macro:
this is because the "meat" of each of these regex's matches is from a character-group that dismisses the possibility of a literal ']' (i.e., \]) lying WITHIN the attribute-list:
either this one:
[^,\\]]+
or this one:
[^\\],.#%]+

The fix for this is to make this character group be only the first of a pair of alternatives -- the latter of which ALLOWS a literal ']' but only if "looking ahead" from it reveals that it is NOT the end of that line of text -- and thus, that it is not the CLOSING ']' of the whole block-attribute-list:
either
(?:[^,\\]]|\\](?=[ \\t]*\\S))+
or
(?:[^\\],.#%]|\\](?=[ \\t]*\\S))+

And the reason that THREE regexes had to be changed, rather than one, to fix this problem is:

  1. the regex on line 1713 doesn't match any text: it's just one large LOOK-AHEAD! (It leaves it to the regex on line 1716 to match the text.)
  2. the regex on line 340 is still looking to match the same text that line 1716's regex matched already; because it is invoked by the "captures" dictionary that starts on line 1717.

(Although https://macromates.com/manual/en/language_grammars says that "captures" dictionaries can "currently" only assign a name to a captured group's matched text, that document covers only TextMate 1.5.1; and from the discussion in https://www.apeth.com/nonblog/stories/textmatebundle.html (see about halfway down), we may surmise that Texmate 2 allows what we see in our own grammar starting at line 1717 -- viz., a "captures" dictionary that contains captures that assign not (or not only) a name, but (also) a "patterns" array -- to look for yet more matches within the same, already-matched text!)

@ggrossetie
Copy link
Member

Hey!

Thank you for taking the time to explain this issue in detail. In a nutshell, I'm not convinced that the TextMate grammar is suitable for highlighting AsciiDoc text. The work on the AsciiDoc specification has revealed all the ambiguities and limitations of a grammar like TextMate.

I am not opposed to merging this improvement, but I believe it would be better to work on a semantic token provider that will allow for a much more precise and accurate syntax highlighting.

https://code.visualstudio.com/api/language-extensions/semantic-highlight-guide#semantic-token-provider

I've created a meta issue (pinned) because there are a lot of cases where the TextMate grammar produces wrong highlighting.

@andrewcarver
Copy link
Author

Ah! Quite interesting. I will try to be more persevering in my perusal of that page (it didn't initially connect with my brain too wll, so I had dropped it :-/

Since you relate it to "the work on the AsciiDoc specification", I wonder whether this (already marching forward?) work you propose depends on that spec-work? or is it independent?

@ggrossetie
Copy link
Member

I won't say that we have to wait for a complete specification to start working on a semantic token provider.

If you are interested, you can join: https://chat.asciidoc.org and/or take a look at https://github.com/opendevise/asciidoc-parsing-lab/tree/main

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants