Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: Parameter type inference from function body #17715

Closed
wants to merge 20 commits into from

Conversation

masaeedu
Copy link
Contributor

@masaeedu masaeedu commented Aug 10, 2017

This PR is not intended to be merged as-is, it is an attempt at demonstrating the feasibility of #15114, and is a request for revisiting this topic.

The change facilitates inference of parameter types from usage as arguments within the function body. Some motivating examples and the inferred types are shown below:

// CASE 1
function foo(s) {
    Math.sqrt(s)
}
// => function foo(s: number): void

// CASE 2
declare function swapNumberString(n: string): number;
declare function swapNumberString(n: number): string;

function subs(s) {
  return swapNumberString(s);
}
// => function subs(s: string): number

// NOTE: Still broken, needs to deal with overloads. Should have been inferred as:
// => (s: string) => number & (s: number) => string

// CASE 3
function f(x: number){
   return x;
}

function g(x){ return f(x); };
// => function g(x: number): number

@masaeedu masaeedu changed the title DISCUSSION: Parameter type inference from function body Discussion: Parameter type inference from function body Aug 10, 2017
.map((arg, i) => ({ arg, symbol: getSymbolAtLocation(arg), i }))
.filter(({ symbol }) => symbol && symbol.valueDeclaration === parameter)
if (!usages.length) return
const funcSymbol = getSymbolAtLocation(invocation.expression)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to find robust mechanism for resolving the node corresponding to the invoked function to a function signature. Right now things like interface Foo{ (bar: Bar): Baz } can mess it up.

@masaeedu
Copy link
Contributor Author

Down to one failed testcase now.

The mechanism used to match up arguments and parameters here is very brittle, and can easily be broken by e.g. rest parameters. For purposes of the POC we deal with this by simply discarding such a usage site.
@masaeedu
Copy link
Contributor Author

All tests should be passing now. In terms of implementation details, I still need to figure out how to robustly:

  • Retrieve the parameter symbol associated with a given argument in a CallExpression
  • Obtain a workable signature from things like interface Foo { (bar: Bar): Baz } so I can retrieve the parameters associated with arguments

That aside, I think it would be good to start looking at examples of tricky cases where either the implementation or the proposed semantics blow up.

if (!func.body || isRestParameter(parameter)) return;

const types = checkAndAggregateParameterExpressionTypes(parameter);
return types ? getWidenedType(getIntersectionType(types)) : undefined;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getIntersectionType cannot handle flow sensitive typing. For example,

function test(a) {
  if (someCond) roundNumber(a)
  else trimString(a)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HerringtonDarkholme Could you provide a concrete example of this? For which reference is someCond affecting the type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HerringtonDarkholme I understand what you mean now; in the case you highlighted a should be inferred as number | string instead of number & string. number & string is an excessively conservative inference, but it is a sound one: number & string is assignable to number | string.

Some options are to bail and infer any for parameters used in branched code, keep the existing behavior and expect the user to manually supply typings when they realize number & string is unsatisfiable, or to actually analyze the flow graph and generate the appropriate union type.

if (!usages.length) return;
const funcSymbol = getSymbolAtLocation(invocation.expression);
if (!funcSymbol || !isFunctionLike(funcSymbol.valueDeclaration)) return;
const sig = getSignatureFromDeclaration(funcSymbol.valueDeclaration);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work with recursive function? I guess it need some guard statement.

Copy link
Contributor Author

@masaeedu masaeedu Aug 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getTypeOfParameter calls getTypeOfSymbol, which internally has a stack checking for circularity and ejecting with "unknown". I believe we can rely on this, but I need to add tests.

const funcSymbol = getSymbolAtLocation(invocation.expression);
if (!funcSymbol || !isFunctionLike(funcSymbol.valueDeclaration)) return;
const sig = getSignatureFromDeclaration(funcSymbol.valueDeclaration);
const parameterTypes = sig.parameters.map(getTypeOfParameter);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the parameterType is a generic type parameter? Should it be propagated to the inferred function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, probably it should be ensured that the function currently under inference does not already have an explicit type parameter list, and a type parameter to add polymorphism wrt the parameter under inference. If there is already a type parameter list we should just bail.

function checkAndAggregateParameterExpressionTypes(parameter: ParameterDeclaration): Type[] {
const func = <FunctionLikeDeclaration>parameter.parent;
const usageTypes: Type[] = [];
forEachInvocation(<Block>func.body, invocation => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearly, only collecting invocation isn't enough. But I wonder how other expressions can be handled. For example, how arg.assigment = 123 is handled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HerringtonDarkholme I've put out some initial thoughts in #15114 (comment). We can exploit co and contravariance to get sensible rules here. If something assigns to a parameter, the sensible thing is for the parameter to be inferred as having a supertype of said assignment. Unfortunately TypeScript can't represent this.

This means we should restrict ourselves to inference in covariant usages of the parameter variable, i.e. wherever it is used as a source of information (assignment of the parameter to well-typed reference, invocation of the parameter with well-typed argument list, accessing properties on parameter, etc.)

Copy link
Contributor

@HerringtonDarkholme HerringtonDarkholme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pull request!

But I feel the quality of parameter inference doesn't match other inference already in TS.

However, it's another problem whether implementing full inference is worthy

@masaeedu
Copy link
Contributor Author

@HerringtonDarkholme Thanks for the feedback! This is not intended to be a mergeable implementation of parameter inference; the goal is to demonstrate that it is possible to sensibly infer parameter types recursively without breaking everything in the compiler, and that it's possible to do so incrementally, reverting to noImplicitAny when a sensible inference cannot be made.

This is useful because when converting a new project to TypeScript, you often end up either disabling noImplicitAny and getting any everywhere, or manually propagating types up the call graph by adding parameter types to every function expression. Some kind of middle-ground would be useful, especially if this allows you to automatically generate explicit types for top-level APIs more accurately.

I covered only invocation expressions here because these seemed easiest to implement for a POC (and mirror the example from #15114), but the same concept extends to assignment from parameter, operator usage, etc.

@mhegazy
Copy link
Contributor

mhegazy commented Nov 8, 2017

I have recently worked on something similar to this in #14786. I started off with inferring from function body, but putting it to usage, changed my mind. most functions do not just use the parameter as is, they usually look at one ore more properties from it.. the result of the inference from the function body is always more of a constraint than a type really.. you end up with things like { kind: number } instead of Node, or { length: number } instead of Array which correct.. but not what a human would have thought about.. it gets complicated as the type gets more complex, and then you have these anonymous types that really mean nothing to the developer..

The other problem is this disables checking on the parameter all together. true that an implicit any inference does that as well, but at least you have a way to flag these as errors in --noImplicitAny.

I think the right place for this work is quick fix/ refactoring to help the user migrate to TS. The bar is low here for accuracy, since the user is doing an explicit action and will be reviewing the result anyways. Feel free to look at #14786, and contribute changes.

@mhegazy mhegazy closed this Nov 8, 2017
@masaeedu
Copy link
Contributor Author

masaeedu commented Nov 8, 2017

@mhegazy The overall goal of this and #17428 is to have the type <T extends { length }>(a: T) => T["length"] ascribed to a => a.length. This type means nothing to the developer when they really meant to talk about Array#length, but this is fine: the inferred type is sufficiently polymorphic that applying it to [1, 2, 3] will correctly produce number, and incorrectly applying it will produce an error.

Inferring things in this way ensures the function demands no more specificity in its parameters, and promises no less specificity in its return type, than what its implementation dictates.

It is true that inferring from the body will defer discovery of any type errors in the function implementation to the point where the function is invoked, but this is a worthwhile tradeoff for the significant decrease in verbosity and the increased ease of converting existing codebases to TypeScript.

As an example, if you do inference in this manner, it becomes more feasible for the typechecker to interact directly with plain JS modules that don't have any explicit typings.

@mhegazy
Copy link
Contributor

mhegazy commented Nov 8, 2017

#17428 is a breaking change.. a big one mind you. inferring constraints from the body of functions makes it less so, but still not completely. I would like to see a full proposal for this if you have one.

@masaeedu
Copy link
Contributor Author

masaeedu commented Nov 8, 2017

@mhegazy I'm going to be on vacation in a bit, and will spend some time to see if can come up with a proposal/reference implementation. That way you'll have a concrete list of thousands of broken testcases to point to when you reject the proposal 😉

@mhegazy
Copy link
Contributor

mhegazy commented Nov 8, 2017

That way you'll have a concrete list of thousands of broken testcases to point to when you reject the proposal

This is not the intention of my earlier comment.. I think if we are going to consider a major change, we should have the full picture, rather than having it spread on multiple smaller issues. In general this has a better chance of a positive outcome.

It is also important to acknowledge the proposal in #17428 is a breaking change, and a viable proposal needs to consider how it can be added/phased in with limited disruption to the system. Assuming we can make a fundamental change of behavior like this is with no mitigation is not realistic.. we have experience with similar changes with limiting any type evolution to under --noImplicitAny, and we have experimented with doing out-of-band inference in #14786. So having a proposal with this in mind would be very helpful.

@masaeedu
Copy link
Contributor Author

masaeedu commented Nov 8, 2017

@mhegazy I totally agree with you regarding the necessity of a proper proposal, my last comment was in jest. Exactly how much of a breaking change it would be, and whether it is possible to come up with a compromise that doesn't break 90% of existing TypeScript code is something I'm interested in myself.

@microsoft microsoft locked and limited conversation to collaborators Jun 14, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants