support tables #132

Open
justinmk opened this issue Apr 17, 2024 · 8 comments
Labels
enhancement New feature or request

Comments

@justinmk
Member

justinmk commented Apr 17, 2024

vimdoc has a "column" marker ~, which is supposedly intended to mark up a table. Example:

tag		char		action in Insert mode	~
------------------------------------------------------------------------------ ~
|i_CTRL-@|	CTRL-@		insert previously inserted text and stop
				insert
|i_CTRL-A|	CTRL-A		insert previously inserted text
|i_CTRL-C|	CTRL-C		quit insert mode, without checking for
				abbreviation
|i_CTRL-D|	CTRL-D		delete one shiftwidth of indent in the current
				line
|i_CTRL-E|	CTRL-E		insert the character which is below the cursor
		CTRL-F		not used (but by default it's in 'cinkeys' to
				re-indent the current line)
|i_CTRL-G_j|	CTRL-G CTRL-J	line down, to column where inserting started
|i_CTRL-G_j|	CTRL-G j	line down, to column where inserting started

The parser could support this by treating anything separated by <tab> as a column. But a table must start with a foo <tab> bar <tab> ... ~ header line; tab-aligned text without a column header would not be considered a table.
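
To make the proposal concrete, here is a minimal Python sketch of the rule, not the actual tree-sitter grammar. The function name `parse_tab_table` and the continuation-line heuristic (a line with an empty first cell and fewer cells than the header is folded into the previous row) are assumptions for illustration; the thread does not specify how wrapped lines should be handled.

```python
import re

def parse_tab_table(lines):
    """Return (header, rows) if lines[0] is a tab-separated `~` header, else None."""
    if not lines:
        return None
    head = lines[0].rstrip()
    if not head.endswith("~"):
        return None
    # Drop the trailing `~` marker, then split on runs of tabs.
    header = [c.strip() for c in re.split(r"\t+", head[:-1].rstrip())]
    if len(header) < 2:
        return None  # a lone heading such as "Some title ~" is not a table
    rows = []
    for line in lines[1:]:
        text = line.rstrip()
        if not text:
            break  # a blank line ends the table
        if re.fullmatch(r"-{3,}\s*~?", text):
            continue  # dashed rule under the header
        cells = [c.strip() for c in re.split(r"\t+", text)]
        if cells[0] == "" and len(cells) < len(header) and rows:
            # Assumed heuristic: an indented short line continues the last cell.
            rows[-1][-1] += " " + " ".join(cells[1:])
        else:
            rows.append(cells)
    return header, rows
```

With the example above, the wrapped "insert" line is merged into the |i_CTRL-@| row, while the CTRL-F line (two leading tabs, three cells) becomes a row with an empty tag cell. The heuristic is ambiguous for two-column tables, which is one reason a stricter syntax may be needed.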

@lewis6991
Member

Would it not be more robust to just go full markdown and require | to delimit cells?

@clason
Member

clason commented Apr 17, 2024

Given the prevalence of tab-aligned lists in Vim documentation (and the fact that dashes are used as a heading marker already), I don't think this is going to work (robustly).

My current idea is to add ~~ for explicit table markup, and then use | as column separators (to be concealed). (The single trailing ~ for "columns" needs to die.)

Unfortunately, the use of "columnar" material in Vim documentation is a mess. I think we cannot do this without getting chrisbra involved.

@justinmk
Member Author

justinmk commented Apr 17, 2024

> Would it not be more robust to just go full markdown and require | to delimit cells?

That's also an option, but it requires updating a lot of docs. If the <tab> approach doesn't get good results or is hard to define a grammar for, we will need to consider a different syntax.

> Given the prevalence of tab-aligned lists in Vim documentation

We would require a foo <tab> bar <tab> ... ~ column header for any table. Tab-aligned things that don't have a column header would not be considered tables.

@clason
Member

clason commented Apr 17, 2024

I think there's no way around updating docs, given how lackadaisical vanilla Vim docs are about proper markup. (For example, the trailing ~ is used in a bunch of lists of filenames for ... reasons. See, e.g., :h starstar-wildcard.) If we do this, we should do it right, just as we've been more strict about using closing codeblock markers. Otherwise this will become impossible to parse.

> We would require a foo <tab> bar <tab> ... ~ column header for any table.

(And I just noticed an issue with this in the current parser when optionlinks are used for table headers; see, e.g., :h backup-table.)

@lewis6991
Member

> but it requires updating a lot of docs.

Personally, I think it's worth doing. We don't even have to update everything immediately: since tables aren't currently parsed anyway, we just need the grammar to not produce ERROR nodes for old-style tables.

@clason
Member

clason commented Apr 17, 2024

And a general cleanup (and doc ownership) is necessary IMO to ensure proper flow layout compatibility. We (I) already took the first step with removing parsing errors; adding language annotations would be a natural next step, too.

TL;DR: If we want to be able to fully parse our docs (which we do), we must take ownership and tighten up the markup to ensure our tools work. Upstream doesn't care, after all.

@risc26z

risc26z commented Apr 20, 2024

Hi folks, I'm doing some experimental parsing work (external to tree-sitter), and have some observations that might hopefully be helpful.

Generally, I think moving towards '|' (as suggested above) would be easier to parse and less bug-prone than tabs. However, given the vast amount of existing documentation, I think it might be worth thinking about an (old) vimdoc to (new) vimdoc converter. With that in mind, I'm looking at the existing 'syntax' and asking two questions: a) can the existing material be parsed at all, and b) what would be the best way to support such a tool in a way that minimises the work that a human would need to do? If that requires a small amount of extra syntax to help the tool, it might be a worthwhile addition.
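
As a rough illustration of what the output side of such a converter might look like, here is a small Python sketch. It assumes the target is a Markdown-style pipe table, which is only one of the options floated in this thread, and the name `to_pipe_table` is hypothetical. Because vimdoc already uses | for tag links, the sketch escapes literal pipes with a backslash, as GitHub-flavored Markdown tables do; whether a future vimdoc syntax would do the same is an open question.

```python
def to_pipe_table(header, rows):
    """Render already-split cells as a Markdown-style pipe table (assumed target)."""
    def esc(cell):
        # Literal pipes (e.g. vimdoc |tag| links) would clash with the delimiter.
        return cell.replace("|", "\\|")

    width = len(header)
    out = [
        "| " + " | ".join(esc(c) for c in header) + " |",
        "|" + "|".join(" --- " for _ in header) + "|",
    ]
    for row in rows:
        if len(row) > width:
            # Do not guess: leave over-long rows for a human to review.
            raise ValueError(f"row has more cells than the header: {row!r}")
        cells = list(row) + [""] * (width - len(row))
        out.append("| " + " | ".join(esc(c) for c in cells) + " |")
    return "\n".join(out)
```

Raising on rows that do not fit the header, rather than silently truncating them, keeps the tool from producing wrong output and limits the human's work to the flagged rows.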

Using :h buffer-list as an example, the current syntax doesn't seem to use a single tab as a separator. The first table in this example uses one or more tabs as the separator; the second uses a combination of tabs and spaces. In both cases, the alignment is visual (with the exception of the wonky row) rather than syntactic.

It's possible to automate this 'visual' recognition of columns under the condition that the whitespace before an aligned column (if any) must have at least one tab and/or 2+ spaces in it. Although this reduces false positives, they are still a potential problem: in each case, the question is not whether this can be done, but whether it should.
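
A minimal Python sketch of that gap rule follows. The names `GAP` and `visual_cells` are made up for illustration, and the tab stop of 8 is an assumption about how the help files are meant to be displayed. Each cell is returned with the screen column at which it starts, so a tool could compare those offsets across lines to decide whether the cells are aligned.

```python
import re

# A column gap: whitespace containing at least one tab, or two or more spaces.
GAP = re.compile(r" *\t[ \t]*| {2,}")

def visual_cells(line, tabstop=8):
    """Split a line at column gaps; return (screen_column, text) pairs."""
    text = line.rstrip()
    cells = []
    pos = 0
    for gap in GAP.finditer(text):
        if gap.start() > pos:
            # Expanding the prefix gives the screen column where the cell starts.
            cells.append((len(text[:pos].expandtabs(tabstop)), text[pos:gap.start()]))
        pos = gap.end()
    if pos < len(text):
        cells.append((len(text[:pos].expandtabs(tabstop)), text[pos:]))
    return cells
```

A single space never splits a cell, so "CTRL-G CTRL-J" stays together, while a continuation line made of four tabs and a word is reported as one cell starting at column 32. Ordinary prose that happens to contain two spaces after a full stop would also be split, which is exactly the false-positive risk described above.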

It is necessary to have some sort of clue that tabular layout is intended or, alternatively, that one should not be inferred. A column header is one such clue, but '~' at the end of a line is sometimes used for other purposes. See, e.g., line 616 of windows.txt ("WinScrolled and WinResized autocommands ~").

I'd also point out that sometimes you might not want a header displayed (as in the third table in 'buffer-list'). This suggests either an invisible header syntax or (my preference) hidden markers to guide the tool.

Code blocks are a puzzle in terms of how they should fit in a tabular layout.

Sorry for writing too much!

@clason
Member

clason commented Apr 20, 2024

> can the existing material be parsed at all

No, and neither we nor Vim is even trying to.

> See, e.g., line 616 of windows.txt ("WinScrolled and WinResized autocommands ~").

That's the same purpose: columnar material. The fact that it's mixed with visual (space/tab) alignment is one of the worst parts of vimdoc as a markup language. (Spoiler: it's really not one; it's a hodgepodge of things that got added for rendering :help files using a combination of syntax highlighting and conceal.)

> Code blocks are a puzzle in terms of how they should fit in a tabular layout.

They shouldn't; apples and oranges.

If you want to help, the first step would be to get buy-in from vim/vim. If they agree on a stricter standardized format for tabular/columnar material that is easier to actually parse instead of just present visually, then we can proceed here.

This is a much more intrusive step than adding, say, language markers for codeblocks, which we could do unilaterally. We do not want to invent yet another NIH custom format; if we can't do this with vimdoc, we should just switch to Markdown and port doc changes manually (which we increasingly often need to anyway). This is a long-term effort that won't happen in the next version or two.


4 participants