Capital letters breaks autocomplete in VS Code Extension #1347

Open
drhagen opened this issue Jan 18, 2024 · 3 comments
Labels
bug Something isn't working completion Completion related issue

Comments

drhagen commented Jan 18, 2024

A grammar of a keyword followed by /[A-Z]+/ will not correctly autocomplete the keyword, but the same keyword followed by /[a-z]+/ will autocomplete just fine. This might be a bug on the VS Code side because the same grammar in the Langium Playground autocompletes fine.

Langium version: 2.1.3
Package name: hello-world

Steps To Reproduce

  1. npm install -g yo generator-langium
  2. yo langium
     • Keep defaults except do not create CLI or webworker
     • Accept open in VS Code
  3. Replace hello-world.langium with:

         grammar HelloWorld

         entry Model:
             'header' value=ID;

         hidden terminal WS: /\s+/;
         terminal ID: /[A-Z]+/;

  4. Purge validation in hello-world-validator.ts because we don't need it:

         import type { HelloWorldServices } from './hello-world-module.js';
         export function registerValidationChecks(services: HelloWorldServices) { }
         export class HelloWorldValidator { }

  5. npm run langium:generate
  6. npm run build
  7. Run extension in Code to open a new window with the extension installed
  8. Create a file test.hello
  9. In the file, try to auto-complete the first keyword he<tab>

The current behavior

When starting to type the keyword, the correct completion appears. But when pressing Tab or Enter to accept the autocomplete, it types in the whole keyword again instead of the remainder of the word.

Now switch ID from /[A-Z]+/ to /[a-z]+/. Rebuild and restart the extension. With this grammar autocomplete works as expected.

The expected behavior

Autocomplete completes the keyword instead of typing the whole keyword in again regardless of the token that follows.
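
To make the difference between the current and the expected behavior concrete, here is a minimal sketch that treats a completion as a text replacement over a range ending at the cursor. `applyCompletion` is a hypothetical helper written for this illustration, not a Langium or VS Code API, and the `heheader` result is an assumption based on the description above ("it types in the whole keyword again").

```typescript
// Hypothetical helper: replace text[replaceStart, cursor) with the completion label.
function applyCompletion(text: string, cursor: number, label: string, replaceStart: number): string {
    return text.slice(0, replaceStart) + label + text.slice(cursor);
}

const typed = 'he';
const cursor = typed.length;

// Current behavior (assumed): the typed prefix is not part of the replace range,
// so the whole keyword is inserted at the cursor.
const observed = applyCompletion(typed, cursor, 'header', cursor);

// Expected behavior: the replace range covers the typed prefix.
const expected = applyCompletion(typed, cursor, 'header', 0);

console.log(observed); // heheader
console.log(expected); // header
```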

@drhagen drhagen added the bug Something isn't working label Jan 18, 2024
msujew (Contributor) commented Jan 18, 2024

OK, fascinating. This is a really hard-to-catch edge case that affects completions within the first token of a file in very specific grammars. I'm honestly surprised someone was able to create reproduction steps for this. Kudos, I guess. We basically run into this branch, which then later assumes that no tokens have been parsed. As a consequence, it doesn't even attempt to fuzzy match the preceding text in order to replace it. This logic was added to Langium fairly recently, and the playground lags behind by a minor version, which is why it doesn't exhibit the behavior.

I'm not sure whether we can actually change this part of the logic though. The fuzzy matcher isn't allowed to look too far back in the token stream to find the text to replace. It should only look for the current token, which is exactly what's happening right now. In some cases, the current token just cannot be lexed, which leads to the behavior you're experiencing.
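
The "only look at the current token" rule can be sketched roughly as follows. This is an illustration under simplifying assumptions, not Langium's actual implementation: the `Token` shape (with an exclusive `endOffset`) and the `replaceStart` function are hypothetical names introduced here.

```typescript
// Hypothetical token shape; endOffset is exclusive in this sketch.
interface Token {
    image: string;
    startOffset: number;
    endOffset: number;
}

// Returns where the replace range would start: the start of the token that
// contains or ends at the cursor, or the cursor itself if no such token exists.
function replaceStart(tokens: Token[], cursor: number): number {
    const current = tokens.find(t => t.startOffset < cursor && cursor <= t.endOffset);
    return current ? current.startOffset : cursor;
}

// "header AB" lexes into two tokens; with the cursor after "AB",
// the replace range starts at the beginning of "AB".
const lexed: Token[] = [
    { image: 'header', startOffset: 0, endOffset: 6 },
    { image: 'AB', startOffset: 7, endOffset: 9 }
];
console.log(replaceStart(lexed, 9)); // 7

// "he" cannot be lexed at all, so there is no current token and nothing is replaced.
console.log(replaceStart([], 2)); // 2
```

When the start equals the cursor, the replace range is empty, which matches the reported symptom of the keyword being inserted in full.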

@msujew msujew added the completion Completion related issue label Jan 18, 2024
drhagen (Author) commented Jan 19, 2024

> within the first token of a file

I minimized the reproduction, but the failed autocompletion can also occur beyond the first token, unless we have different definitions of "token".

For example, using this grammar:

grammar ReactionModel

entry ReactionModel:
    EOL? '%%' 'ReactionModel@2' EOL
    'initialization' '=' initialization=Initialization EOL
    '%' 'components' EOL
;

Initialization:
    InitialValue | SteadyState;

InitialValue:
    {infer InitialValue} 'initial_value' '(' ')';

SteadyState:
    'steady_state' '(' 'time_scale' '=' time_scale=FLOAT (',' 'max_scale' '=' FLOAT )? ')';

hidden terminal WS: /[ \t]+/;
terminal EOL: /((#.*)?\n[ \t]*)*(#.*)?((\n[ \t]*)|\Z)/;
terminal FLOAT returns number: /[+-]?\d+(\.\d+)?([Ee][+-]?\d+)?/;

with this valid file

%% ReactionModel@2
initialization = steady_state(time_scale = 1.0, max_scale=1.0)
% components

not a single keyword autocompletes correctly while typing it in or when going back to edit it. It knows what can be autocompleted there (e.g. after "initialization =" then "steady_state" or "initial_value" are valid autocompletes), but it types in the whole word instead of completing the word.

msujew (Contributor) commented Jan 25, 2024

@drhagen Let me rephrase: in your language, initial, for example, isn't actually a token (even though initial_value is), since there's neither a keyword nor something like an ID terminal that could lex it. Instead, the lexer simply ignores the characters. Since we can only know where a token starts and ends if the lexer recognizes it as a token, the completion provider assumes that the characters before the cursor position are invalid and ignores them as well. This is actually independent of the case where we don't lex any tokens at all; the real issue is that we have no idea how much of a token already exists at a given point.

In order to successfully perform completion, even "broken" keywords need to be recognized as tokens by the lexer. Most languages (i.e. all that I've encountered so far) have an ID terminal that can be expressed as /\w+/, which automatically fixes this issue.
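
The effect of the ID terminal can be illustrated with a toy lexer. This is only a sketch: `lex` is a hypothetical longest-match lexer written for this example and does not reproduce how Langium's lexer actually orders or matches tokens. It only shows that a partially typed keyword such as `he` produces no token under `/[A-Z]+/` but does under `/\w+/`.

```typescript
interface Token {
    type: string;
    image: string;
    offset: number;
}

// Toy lexer: at each position the longest matching rule wins (first rule on ties).
// Characters that no rule matches are skipped, mirroring "the lexer simply
// ignores the characters".
function lex(text: string, rules: Array<[string, RegExp]>): Token[] {
    const tokens: Token[] = [];
    let pos = 0;
    while (pos < text.length) {
        let best: Token | undefined;
        for (const [type, pattern] of rules) {
            const sticky = new RegExp(pattern.source, 'y');
            sticky.lastIndex = pos;
            const match = sticky.exec(text);
            if (match && match[0].length > 0 && (!best || match[0].length > best.image.length)) {
                best = { type, image: match[0], offset: pos };
            }
        }
        if (best) {
            if (best.type !== 'WS') tokens.push(best); // WS is hidden
            pos += best.image.length;
        } else {
            pos++; // unrecognized character: skipped
        }
    }
    return tokens;
}

const keywordRule: [string, RegExp] = ['header', /header/];
const wsRule: [string, RegExp] = ['WS', /\s+/];

// With ID: /[A-Z]+/ the partial keyword "he" yields no tokens at all.
const upperOnly = lex('he', [keywordRule, wsRule, ['ID', /[A-Z]+/]]);
// With ID: /\w+/ the same input yields one ID token starting at offset 0.
const wordId = lex('he', [keywordRule, wsRule, ['ID', /\w+/]]);

console.log(upperOnly.length); // 0
console.log(wordId.length, wordId[0].type, wordId[0].offset); // 1 ID 0
```

With a token present, a completion provider has a start offset to replace from; without one, it has nothing to anchor the replacement to.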

I don't think we can fix this as part of our framework. You are free to override how the completion provider attempts its fuzzy matching, so you should be able to fix this behavior for your language yourself.
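
One possible heuristic for such a language-specific override is to stop relying on token boundaries and instead walk backwards from the cursor over word characters. The sketch below shows only that text-level logic. `wordStartBefore` and `completeKeyword` are hypothetical helpers, a plain prefix check stands in for fuzzy matching, and how this would be wired into Langium's completion provider is deliberately left out.

```typescript
// Hypothetical helper: find where the word immediately before the cursor starts.
function wordStartBefore(text: string, cursor: number): number {
    let start = cursor;
    while (start > 0 && /\w/.test(text.charAt(start - 1))) {
        start--;
    }
    return start;
}

// Hypothetical helper: replace the partially typed word with the keyword,
// or return undefined if the typed text is not a prefix of the keyword.
function completeKeyword(text: string, cursor: number, keyword: string): string | undefined {
    const start = wordStartBefore(text, cursor);
    const prefix = text.slice(start, cursor);
    if (!keyword.startsWith(prefix)) {
        return undefined; // simple prefix match stands in for fuzzy matching
    }
    return text.slice(0, start) + keyword + text.slice(cursor);
}

const line = 'initialization = initial';
console.log(completeKeyword(line, line.length, 'initial_value'));
// initialization = initial_value
```

This trades precision for robustness: it works even when the lexer produced no token, but it assumes keywords consist of word characters, which would not hold for keywords such as `%%` in the grammar above.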
