Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

migrate Go keys and values to treesitter #1839

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

josharian
Copy link
Collaborator

@josharian josharian commented Sep 1, 2023

This doesn't handle all uses of key and value.

Missing still are assignment statements (name, value),
function signatures (name, value), and possibly other
places (perhaps type declarations?).

It does handle all existing uses of key, value, and name,
though, so we can start here.

The handling of multiple return values is complicated,
particularly around the handling of comments and removal ranges.
It would be nice to find a simpler approach.

Checklist

This doesn't handle all uses of key, value, and name.
Missing still are migrate Go key and value scope to treesitter

This doesn't handle all uses of key and value.

Missing still are assignment statements (name, value),
function signatures (name, value), and possibly other
places (perhaps type declarations?).

It does handle all existing uses of key, value, and name,
though, so we can start here.

The handling of multiple return values is complicated,
particularly around the handling of comments and removal ranges.
It would be nice to find a simpler approach.
@josharian
Copy link
Collaborator Author

josharian commented Sep 1, 2023

I haven't added any tests yet. Before I did that, I wanted to get a round of feedback about the behavior, so that I don't waste time adding tests with the wrong behavior.

There are a lot of tests to add.

To make it easier to get initial feedback I have included two videos of the scope visualizer. They go by fast but you can pause as needed. :)

A lot of the same considerations about comments, only-one vs many (and for many, first-vs-rest), removal ranges, etc. apply to the other uses of name and value, so it'd be good to get happy with the behavior and implementation before I continue with those.

(A surprising amount of development time is packed into not very many lines with .scm files. :P)

@josharian
Copy link
Collaborator Author

return values:

visualize_value.mov

@josharian
Copy link
Collaborator Author

keys and values in maps and lists:

visualize_key_value.mov

@pokey
Copy link
Member

pokey commented Sep 4, 2023

return values:

visualize_value.mov

Looking pretty good! One thing we're doing in other places is to have a scope that is the parent of the other scopes, whose domain is the entire statement and whose range is from the first return value to the last. See eg the name scope in .scm, and the comment proposing that behaviour #1631 (comment). Note that we only have that parent scope when there is more than one return value, otherwise you get useless nesting

Also, it's tough to tell what your iteration scopes are actually doing, but I think we'd want one iteration scope for iterating the values of one return statement when it has more than one return value, and a bigger iteration scope consisting of the function body

@pokey
Copy link
Member

pokey commented Sep 4, 2023

keys and values in maps and lists:

visualize_key_value.mov

this looks great!

Copy link
Member

@pokey pokey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid direction; left a bunch of minor inline comments, and see also my responses to your videos above 👍

@@ -60,7 +60,9 @@ export function checkCaptureStartEnd(
showError(
messages,
"TreeSitterQuery.checkCaptures.mixRegularStartEnd",
`Please do not mix regular captures and start/end captures: ${captures}`,
`Please do not mix regular captures and start/end captures: ${captures.map(
({ name, range }) => name + "@" + range.toString(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the conversion to string not happen automatically?

@@ -71,7 +73,7 @@ export function checkCaptureStartEnd(
messages,
"TreeSitterQuery.checkCaptures.duplicate",
`A capture with the same name may only appear once in a single pattern: ${captures.map(
({ name }) => name,
({ name, range }) => name + "@" + range.toString(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as above

(_) @value @value.trailing.start.endOf
(#insertion-delimiter! @value ", ")
.
(comment)* @value.trailing.omit
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this .omit do? Is this some hypothetical capture modifier you'd like to support in the future?

* @foo bar)` will accept the match if the `@foo` capture has 2 or more children
* not of type `bar`.
*/
class HasMultipleChildrenNotOfType extends QueryPredicateOperator<HasMultipleChildrenNotOfType> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow this really feels like a slippery slope 😅. Getting to the point where I wonder if we want to create our own mini-language

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dsl ftw! :D

Comment on lines +252 to +256
(expression_list
.
(_)
.
) @value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens with comments here?

Comment on lines +275 to +277
;; BUG: in this code:
;; return "lorem" /* comment */ , "ipsum"
;; the comment is included in the removal range of "lorem".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a bug? I might argue the comment is part of "lorem", tho as mentioned above, I'd lean towards being conservative about deleting people's comments

(expression_list
.
(_) @value @value.trailing.start.endOf
(#insertion-delimiter! @value ", ")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting to put predicate right after the capture itself

","
.
(_) @value.trailing.end.startOf
) @_exprlist @value.leading.start.startOf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the _ prefix is your convention for things that are just dummy captures used for predicates? We've been using dummy, but I kinda like this _ prefix convention. wdyt @AndreasArvidsson

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me

(_) @value.trailing.end.startOf
) @_exprlist @value.leading.start.startOf
(#not-type? @value comment)
(#has-multiple-children-not-of-type? @_exprlist comment)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a bit annoying we need to use negative specifier here rather than specifying the type itself, but I'm assuming the individual expressions have no wrapper nodes of uniform type, so I'm not sure what else we can do 🤷‍♂️

(#has-multiple-children-not-of-type? @_exprlist comment)
) @value.iteration

;; ...and the rest of the values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to have a comment explaining why you need to handle first / rest separately

@pokey pokey mentioned this pull request Sep 4, 2023
11 tasks
@auscompgeek auscompgeek added lang-go Issues related to Go programming language support scope-migration Migrating scopes to next-gen scope implementation labels Sep 9, 2023
(_) @collectionKey.trailing.end.startOf
) @collectionKey.domain
"}" @collectionKey.iteration.end.startOf
)
Copy link
Member

@pokey pokey Sep 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mixed feelings about defining a scope’s pattern and its iteration scope’s pattern in one go. I think it's often clearer to define them separately. You could arguably define collectionKey.iteration and @value.iteration in one pattern. I think that might be clearer

That being said, it's more of a rule of thumb than a hard rule; if you feel defining collectionKey and collectionKey.iteration in the same pattern is clearer in this case, I don't have a strong objection

There is a slight performance implication as well fwiw, because this pattern matches once per collectionKey scope instance, and thus will yield the iteration scope once per scope instance, and then we silently merge them later. Defining the pattern separately will only yield the iteration scope once

@josharian
Copy link
Collaborator Author

I need to take another pass over this, marking as draft in the interim.

@josharian josharian marked this pull request as draft October 31, 2023 23:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lang-go Issues related to Go programming language support scope-migration Migrating scopes to next-gen scope implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants