Add ability to type operation parameters & return values #450

Open
TheOnlyTails opened this issue May 1, 2023 · 11 comments

@TheOnlyTails

Currently, using any semantic operations/attributes results in an any, which introduces a major risk of mistakes and typos into the codebase.

I suggest moving the semantic operation/attribute creation process into the createSemantics function, and extending the Node type off of that, instead of using [index: string]: any.

I'd be happy to help create a prototype/work alongside someone on this, since this seems like quite a big pain point for such an important feature.
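
To make the risk concrete, here is a minimal sketch of how a catch-all index signature hides a misspelled operation name. `LooseNode` and `StrictNode` are hypothetical stand-ins written for this illustration, not Ohm's actual declarations.

```typescript
// Hypothetical stand-in for a node type with a catch-all index signature.
interface LooseNode {
  sourceString: string;
  [index: string]: any;
}

// Hypothetical stricter alternative that names its operations explicitly.
interface StrictNode {
  sourceString: string;
  eval(): number;
}

const impl = { sourceString: '1 + 2', eval: () => 3 };
const loose: LooseNode = impl;
const strict: StrictNode = impl;

// The misspelling `evl` compiles on the loose type and only fails at run time.
let typoFailedAtRuntime = false;
try {
  loose.evl();
} catch {
  typoFailedAtRuntime = true;
}

// On the strict type the same typo is rejected by the compiler:
// strict.evl(); // error: Property 'evl' does not exist on type 'StrictNode'
const result: number = strict.eval();
```

With the index signature in place, every property access is typed as any, so neither the name nor the return type of an operation is checked.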

@pdubroy
Contributor

pdubroy commented May 4, 2023

@TheOnlyTails This is already supported — please see the documentation on Using Ohm with TypeScript.

If there's something missing from that, or you find any problems, please let me know!

@pdubroy pdubroy closed this as completed May 4, 2023
@TheOnlyTails
Author

I read through the docs again, and I think you misunderstood what I meant: the type-checking when declaring semantic operations is great - the problem arises when it comes to accessing them - they're all typed as any. I'd like something like this:

grammar.createSemantics({
  // this object allows declaring semantic actions while type-checking their params and return type when called
  eval: (semanticParam: string) => ({
    Expression: (content) => //etc
  })
})

@pdubroy
Contributor

pdubroy commented May 4, 2023

Sorry for the confusion, guess I was a bit distracted when I read this and didn't fully understand.

To confirm I understand what you're asking about, let's take this example:

const s = g.createSemantics().addOperation('myOp(a, b)', {
  ...
})

You want to be able to specify both (1) the return type of the myOp operation, and (2) the type(s) of the parameters (a, b)?

It's already possible to specify the return type, see arithmetic.ts in the TypeScript example.

You're right that we're missing an ability to type the parameters — I'll repurpose this issue for that.

And thanks for the offer to help and for your suggested fix. I'll have to think a bit more about this and how we can best solve it.

@pdubroy pdubroy reopened this May 4, 2023
@pdubroy pdubroy changed the title Add type-safe semantic operations and attributes Add ability to type operation parameters May 4, 2023
@TheOnlyTails
Author

While parameters are one big missing thing about the types, I'm talking mostly about the typing of actions being called. For example:

semantics.addOperation<Expression>("eval()", {
  Expression_parens: (_, expr, _1) => expr.eval() // currently returns any, should return Expression
})

While this still compiles because of the any, it's still vulnerable to misspellings and confusing one operation with another.

My proposed solution would allow TypeScript to infer the types of declared operations/attributes directly from their declarations, allowing for safer, less error-prone code.
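
One way such inference could work is sketched below, assuming each operation is declared as a function from its parameters to an action dictionary. Every name here (`ActionDict`, `OpDecls`, `OpsOf`, `bindOperations`) is hypothetical and written for this illustration; none of it is part of Ohm's API, and the "node" is a toy that dispatches to a single rule.

```typescript
// Hypothetical action dictionary: one optional action per grammar rule.
interface ActionDict<T> {
  Number?: () => T;
  Add?: () => T;
}

// Each operation is declared as a function of its parameters
// that returns an action dictionary.
type OpDecls = { [name: string]: (...args: any[]) => ActionDict<any> };

// Derive the callable signature of every operation from its declaration.
type OpsOf<D extends OpDecls> = {
  [K in keyof D]: D[K] extends (...args: infer A) => ActionDict<infer R>
    ? (...args: A) => R
    : never;
};

// Toy binder: builds an object whose operations dispatch to the given rule.
function bindOperations<D extends OpDecls>(
  decls: D,
  rule: keyof ActionDict<unknown>,
): OpsOf<D> {
  const table: OpDecls = decls;
  const bound: Record<string, unknown> = {};
  for (const name of Object.keys(table)) {
    bound[name] = (...args: unknown[]) => {
      const action = table[name](...args)[rule];
      if (!action) throw new Error(`no action for ${rule} in ${name}`);
      return action();
    };
  }
  return bound as unknown as OpsOf<D>;
}

const node = bindOperations(
  {
    eval: (scale: number) => ({ Number: () => 21 * scale }),
    show: () => ({ Number: () => 'twenty-one' }),
  },
  'Number',
);

const doubled: number = node.eval(2); // parameter and return type are inferred
const label: string = node.show();
// node.evl(2);    // compile error: no such operation
// node.eval('2'); // compile error: wrong parameter type
```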

@pdubroy pdubroy changed the title Add ability to type operation parameters Add ability to type operation parameters & return values May 5, 2023
@pdubroy
Contributor

pdubroy commented May 5, 2023

Oh, I see! Yes, that's a very good point. I was missing the distinction between the return values of the semantic actions themselves (which ultimately get used as the result of the operation) and the result of the operation.

@TheOnlyTails
Author

TheOnlyTails commented May 5, 2023

I've started working on this issue myself, diving head-first into the codebase, and so far these are my plans:
The most important goal is to avoid breaking the old way of defining semantic ops/attrs, so this will mostly make changes to generated types.

Here are some prototypes I've come up with:

// a utility type that gets rid of the [index: string]: any on the original Node type
type NoIndexNode = {
  [K in keyof Node as string extends K ? never : K]: Node[K];
};

// a custom node type that adds all of the attributes and operations to the original Node, given action dicts for each.
export type TestNode<Ops, Attrs> = NoIndexNode & {
  [op in keyof Ops]: Ops[op] extends (
    ...args: infer Args
  ) => TestActionDict<infer Ret>
    ? (...args: Args) => Ret
    : never;
} & {
  [attr in keyof Attrs]: Attrs[attr] extends TestActionDict<infer AttrType>
    ? AttrType
    : never;
};

export interface TestGrammar<
  Ops = { [index: string]: (...args: any[]) => TestActionDict<any> },
  Attrs = { [index: string]: TestActionDict<any> }
> extends Grammar<Ops, Attrs> {
  createSemantics(operations?: Ops, attributes?: Attrs): TestSemantics;
  extendSemantics(
    superSemantics: TestSemantics,
    operations?: Ops,
    attributes?: Attrs
  ): TestSemantics;
}

The main concept here is passing around the ops and attrs objects from createSemantics, then constructing a Node type from it and re-using it in the action dictionary.

To preserve back-compat, we could check if both the attrs and ops objects passed to createSemantics are undefined, and then use the regular Node to allow for the old system to still be used.
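
The index-signature removal used by NoIndexNode can be tried in isolation. The sketch below applies the same key-remapping trick to a hypothetical `LooseNode` stand-in rather than Ohm's real Node type; it needs TypeScript 4.1 or later for the `as` clause.

```typescript
// Hypothetical stand-in for a node type with a catch-all index signature.
interface LooseNode {
  sourceString: string;
  numChildren: number;
  [index: string]: any;
}

// The index signature's key is the wide type `string`, so `string extends K`
// holds only for it; literal keys such as 'sourceString' are kept.
type NoIndex<T> = {
  [K in keyof T as string extends K ? never : K]: T[K];
};

type StrictNode = NoIndex<LooseNode>;

const node: StrictNode = { sourceString: 'abc', numChildren: 0 };
// node.evl; // compile error: Property 'evl' does not exist

// Only the declared keys remain.
type Keys = keyof StrictNode;
const keys: Keys[] = ['sourceString', 'numChildren'];
```

Note that this only filters a string index signature; a number index signature would need an additional `number extends K` check.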

@TheOnlyTails
Author

I've been working on and off on this for a while, and unfortunately, I think there are only three options for this:

  1. Keeping things the way they are now, which is not ideal;
  2. We could provide back-compat for the old system, but that would make the DX much worse for users of both the new and old systems;
  3. Completely remove the old system, and only allow using the new one, which would be a major breaking change.

IMHO, if this is still something that's worth doing, only option no. 3 is worth it.
Trying to keep compatibility with the current way of doing things would mean too much work maintaining a flawed system that essentially opts out of typechecking.

@pdubroy
Contributor

pdubroy commented Aug 1, 2023

@TheOnlyTails Sorry for the late reply — thanks very much for looking into this! I'd be happy to try to fix this for the next major release. Before committing to anything, I need to find time to think about this more deeply, and would like to get some other eyes on it too.

@mrshll

mrshll commented Dec 27, 2023

I'm interested in this as well. Defining the types on the operation's parameters seems necessary in order to use them, unless there's some other workaround to shim the type into this.args
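
One possible shim, sketched under the assumption that the action's `this` exposes a loosely typed `args` member, is to funnel the cast through a single helper. `LooseThis`, `EvalArgs` and `typedArgs` are hypothetical names for this illustration; the cast is unchecked, so it documents intent rather than providing real type safety.

```typescript
// Hypothetical stand-in for the loosely typed `this` of a semantic action.
interface LooseThis {
  args: any;
}

// The parameters the operation is expected to receive.
interface EvalArgs {
  scale: number;
}

// Single place where the unchecked cast happens.
function typedArgs<A>(self: LooseThis): A {
  return self.args as A;
}

// An action body that reads its operation parameters through the helper.
function numberAction(this: LooseThis): number {
  const { scale } = typedArgs<EvalArgs>(this);
  return 21 * scale;
}

const value = numberAction.call({ args: { scale: 2 } });
```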

@rrthomas

I've done some work on this without looking at either this issue (oops!) or the Ohm source code, but just patching its auto-generated types in my project. I think it might be useful in any case as an example of how far one can get without dramatic surgery to Ohm itself, and some modest changes to how it's used in TypeScript.

In outline, I made the following changes:

  1. Node, IterationNode and NonterminalNode are made generic on a type Operations, which is the type of the operations offered by the semantics.
  2. I added a type ThisNode<Args, Operations>, which is used specifically for the this argument to semantic actions, capturing the fact that only this arguments have the args member. The Args type parameter is of course the type of the arguments object. I then changed the type of each action's this argument to ThisNode.
  3. I added type parameters to the generated FooSemantics type/interface, one for each new Node type, and one for the Operations type. If Ohm's type declarations were changed, only Operations would be needed.
  4. I added similar type parameters to the functions defined in the exported FooGrammar interface.

Here's an example of the resulting type declaration file (with most of the actions elided):

// AUTOGENERATED FILE
// This file was generated from ursa.ohm by `ohm generateBundles`.

import {
  BaseActionDict,
  Grammar,
  MatchResult,
  Node as NodeBase,
  NonterminalNode as NonterminalNodeBase,
  Semantics,
  TerminalNode
} from 'ohm-js';

interface NodeI<Operations> extends NodeBase {
  child(idx: number): Node<Operations>;
  children: Node<Operations>[];
  asIteration(): IterationNode<Operations>;
}

export type Node<Operations> = NodeI<Operations> & Operations;

export type IterationNode<Operations> = Node<Operations>;

export type NonterminalNode<Operations> = Node<Operations>;

export type ThisNode<Args, Operations> = Node<Operations> & {
  // Only the `this` of semantics action routines has this member.
  args: Args;
};

export interface UrsaActionDict<T, Node, NonterminalNode, IterationNode, ThisNode> extends BaseActionDict<T> {
  _terminal?: (this: ThisNode) => T;
  _nonterminal?: (this: ThisNode, ...children: NonterminalNode[]) => T;
  _iter?: (this: ThisNode, ...children: NonterminalNode[]) => T;
  Sequence?: (this: ThisNode, arg0: NonterminalNode, arg1: NonterminalNode) => T;
  
}

interface UrsaSemanticsI<Node, NonterminalNode, IterationNode, ThisNode, Operations> extends Semantics {
  (match: MatchResult): Operations;
  addOperation<T>(name: string, actionDict: UrsaActionDict<T, Node, NonterminalNode, IterationNode, ThisNode>): this;
  extendOperation<T>(name: string, actionDict: UrsaActionDict<T, Node, NonterminalNode, IterationNode, ThisNode>): this;
  addAttribute<T>(name: string, actionDict: UrsaActionDict<T, Node, NonterminalNode, IterationNode, ThisNode>): this;
  extendAttribute<T>(name: string, actionDict: UrsaActionDict<T, Node, NonterminalNode, IterationNode, ThisNode>): this;
}
export type UrsaSemantics<Node, NonterminalNode, IterationNode, ThisNode, Operations> = UrsaSemanticsI<Node, NonterminalNode, IterationNode, ThisNode, Operations> & Operations;

export interface UrsaGrammar extends Grammar {
  createSemantics<Node, NonterminalNode, IterationNode, ThisNode, Operations>(): UrsaSemantics<Node, NonterminalNode, IterationNode, ThisNode, Operations>;
  extendSemantics<Node, NonterminalNode, IterationNode, ThisNode, Operations>(superSemantics: UrsaSemantics<Node, NonterminalNode, IterationNode, ThisNode, Operations>): UrsaSemantics<Node, NonterminalNode, IterationNode, ThisNode, Operations>;
}

declare const grammar: UrsaGrammar;
export default grammar;

With these types, I was able to remove all the type assertions in my code. A typical usage looks like this:

import grammar, {
  Node, NonterminalNode, IterationNode, ThisNode,
} from '../grammar/ursa.ohm-bundle.js'


type FormatterOperations = {
  fmt(a: FormatterArgs): Span
  hfmt(a: FormatterArgs): Span
}

type FormatterArgs = {
  maxWidth: number
  indentString: string
  simpleExpDepth: number
}

type FormatterNode = Node<FormatterOperations>
type FormatterNonterminalNode = NonterminalNode<FormatterOperations>
type FormatterIterationNode = IterationNode<FormatterOperations>
type FormatterThisNode = ThisNode<{a: FormatterArgs}, FormatterOperations>

export const semantics = grammar.createSemantics<FormatterNode, FormatterNonterminalNode, FormatterIterationNode, FormatterThisNode, FormatterOperations>()

One obviously debatable choice I've made here is to group my semantic operations in families which share the same arguments. That seems implicit in the way that Ohm works, and makes sense for me: in my project, I have a group of compiler actions and a group of formatter actions.

The examples above are taken from https://github.com/ursalang/ursa/tree/main/src/
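
As a self-contained illustration of what the generic node types buy, the sketch below rebuilds the Node and ThisNode shapes on top of a hypothetical `NodeBase` stand-in instead of importing ohm-js, and constructs nodes by hand. The simplified `FormatterArgs`, the `leaf` helper and `sequenceAction` are assumptions made for this example and are not taken from the Ursa sources.

```typescript
// Hypothetical stand-in for Ohm's base node type.
interface NodeBase {
  sourceString: string;
  children: NodeBase[];
}

// Same shape as the patched declarations: nodes are generic on Operations.
interface NodeI<Operations> extends NodeBase {
  children: Node<Operations>[];
}
type Node<Operations> = NodeI<Operations> & Operations;
type ThisNode<Args, Operations> = Node<Operations> & { args: Args };

type FormatterArgs = { indentString: string };
type FormatterOperations = { fmt(a: FormatterArgs): string };
type FormatterNode = Node<FormatterOperations>;
type FormatterThisNode = ThisNode<{ a: FormatterArgs }, FormatterOperations>;

// Hand-built leaf node, standing in for what a semantics would produce.
function leaf(text: string): FormatterNode {
  return {
    sourceString: text,
    children: [],
    fmt: (a: FormatterArgs) => a.indentString + text,
  };
}

// An action body: `this.args.a` and `child.fmt` are both fully typed.
function sequenceAction(this: FormatterThisNode): string {
  return this.children.map((child) => child.fmt(this.args.a)).join('\n');
  // child.fmtt(this.args.a); // compile error: misspelled operation
}

const self: FormatterThisNode = {
  sourceString: 'x y',
  children: [leaf('x'), leaf('y')],
  fmt: (a: FormatterArgs) => a.indentString + 'x y',
  args: { a: { indentString: '  ' } },
};

const formatted = sequenceAction.call(self);
```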

Thanks very much for Ohm, it's amazing! It has been fun and quick to use. Nevertheless, I look forward to a new version with more thorough typing: with the above changes I was able to remove hundreds of type assertions, non-null assertions, and comments to myself where types were too lax. The Ursa compiler is now about ⅓ shorter, and much easier to read, so even with this rather heavy-handed low-tech intervention of patching Ohm's output, it feels worth it.

@rrthomas

P.S., a corollary to my willingness to hack around with Ohm's output is that I'd be very happy with a breaking change to the API. Ohm is nice and stable as-is, so there's no pressure on me to upgrade at any particular moment, while a new API would be a great improvement on my current hack, and I wouldn't anticipate moving to it involving much more than rewriting "impedance matching" code of the sort I've exhibited above.


4 participants