Save & Resume and Correct Position #182

Open · elbakerino opened this issue Nov 13, 2022 · 0 comments
elbakerino commented Nov 13, 2022

I've chosen moo to make my first steps with my own DSLs; as that isn't my focus area, it could be that this "wish" is simply bad practice.

The parsed languages are standard SQL and custom modeling languages.

In relation to: #142, #89, #12

In my SQL, for example, there are non-standard placeholders which are resolved while parsing the token; a placeholder may reference other SQL code, which is then parsed and injected into the AST at the place of the placeholder.

I'm using a custom parser to produce the AST, implementing a shallow visitor pattern over the tokens (sketched below).
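
Simplified, and only as a sketch (not my real types), the visitors look roughly like this; `Next` is my own control type for the traversal, `Token` is the token object produced by the moo lexer:

// sketch: one visitor per token type, receiving the current parent node and the token
type DslVisitor<N> = (parent: N, token: Token) => Next
type DslVisitors<N> = Record<string, DslVisitor<N>>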

My current problem comes from save/reset and the (apparent) impossibility of simply resuming where the lexer was.

When using the following logic, it goes into an endless loop:

const text = 'the-code with reference'
lexer.reset(text)

// ... iterating based on `lexer.next()` in the parser
const saved = lexer.save() // POS-A

// ... in between, lex some other code: lexer.reset('partial-code'); lexer.next(); ...

// resume from `POS-A`
lexer.reset(text, saved)
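
For comparison, my understanding from the moo README is that the second argument of `reset()` is meant for feeding the *next* chunk of input while keeping line/column counting intact, not for re-feeding the same text. A minimal sketch of that (rules are placeholders, expected values untested):

const lexer = moo.compile({ word: /[a-z-]+/, ws: / +/ })

lexer.reset('the-code ')
lexer.next() // I'd expect: { type: 'word', value: 'the-code', line: 1, col: 1, ... }
lexer.next() // { type: 'ws', ... }

// continue with the NEXT chunk, carrying over line/col and the lexer state
lexer.reset('with reference', lexer.save())
lexer.next() // I'd expect: { type: 'word', value: 'with', line: 1, col: 10, ... }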

So, following #89, I've used the slice strategy and wrapped lexer.next() to accumulate the position from all parsed tokens, building my own index like mentioned in #142 - leading to "I don't use save at all" (`Lexer` / `LexerState` below are moo's types):

export class ModelLangParser<N extends DslNodeBase> {
    protected readonly lexer: Lexer
    protected readonly visitors: DslVisitors<N>
    protected readonly saved: [string, LexerState][] = []
    protected readonly text: string
    protected position: number = 0

    constructor(
        lexer: Lexer,
        visitors: (parser: ModelLangParser<N>) => DslVisitors<N>,
        text: string,
    ) {
        this.lexer = lexer
        this.visitors = visitors(this)
        this.text = text
    }

    save() {
        // failed experiment: kept in case resume can be built correctly with `save()`
        const saved: [string, LexerState] = [this.text, this.lexer.save()]
        this.saved.push(saved)
    }

    resume() {
        this.lexer.reset(this.text.slice(this.position))

        /* const toResume = this.saved.pop()
        if(typeof toResume === 'undefined') {
            throw new Error('can not resume, nothing was saved')
        }
        this.lexer.reset(toResume[0], toResume[1]) */
    }

    parse(parent: N): N {
        this.position = 0
        this.lexer.reset(this.text)

        // here is the parser... (omitted for readability)
        // do { parsing } while(typeof next !== 'undefined')

        return parent
    }

    protected lexNext() {
        const next = this.lexer.next()
        if(next) {
            this.position += next.text.length
        }
        return next
    }
}

Example usage:

const visitorNamed: (parser: ModelLangParser) => DslVisitor = (parser) => (parent, token) => {
    // ... omitted registering new `Node`

    // parser.save() // not necessary with "position"
    
    // resolve some other code, e.g. by ID, possibly starting another parser from within
    const nestedAst = SomeLogic.resolveAndParse(token.value)
    parser.resume()

    // ... omitted injecting `nestedAst` as `children` to current `Node`

    return Next.close()
}

Now the resume works nicely from inside the visitors, but it destroys all "meta" info, like the correct line number / column.

Is the endless loop caused by a wrong implementation on my side? Can resume be built with save/reset (while keeping correct line and column numbers)?

If my own index is required for such a scenario, I would additionally need to keep track of lines and columns myself to rebuild the Token the visitor receives.

Yes, it can be implemented in userland, but IMHO this may be easier and cleaner to implement with some more help from "inside moo".
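
For reference, this is roughly the userland version I have in mind, shown as an untested sketch extending the class above (the subclass name is just for illustration): record `lexer.save()` right after each consumed token, so that `resume()` can re-lex only the remaining slice while passing the recorded line/column/state back in as the second argument of `reset()`:

// untested sketch: a variant of ModelLangParser that keeps line/col across resume()
export class ResumableModelLangParser<N extends DslNodeBase> extends ModelLangParser<N> {
    protected lastState: LexerState | undefined

    protected lexNext() {
        const next = this.lexer.next()
        if (next) {
            this.position += next.text.length
            // line/col (and state) right *after* this token, as reported by moo
            this.lastState = this.lexer.save()
        }
        return next
    }

    resume() {
        // re-lex only the remaining input, but carry over line/col and lexer state,
        // so the tokens handed to the visitors keep their original positions
        this.lexer.reset(this.text.slice(this.position), this.lastState)
    }
}

The remaining gap, as far as I can see, would be `token.offset`, which after such a reset is counted from the start of the slice rather than from the start of the original text.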

Knowledge Base

Maybe I've got some basic understanding wrong.

I've understood that... (a minimal sketch of this lifecycle follows the list)

  • ... a `moo.compile` is done one time
  • ... the same lexer instance is used for every parsing
  • ... the lexer works with internal state so that nothing is lexed multiple times, thus parsers can be built in a streaming fashion without unnecessary iterations
  • ... the tokens from the lexer contain all information about their position in the code, so that e.g. a linter can be implemented on top of them
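
As a minimal sketch, this is the lifecycle I'm assuming (rule names are placeholders):

const lexer = moo.compile({
    ws: / +/,
    word: /[a-z-]+/,
    nl: { match: /\n/, lineBreaks: true },
})

// compiled once, then reused: reset for every new input, iterate in a streaming fashion
lexer.reset('the-code with reference')
for (const token of lexer) {
    // each token carries its position info
    console.log(token.type, token.value, token.line, token.col, token.offset)
}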