Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error message for function application on a datatype #442

Open
chtenb opened this issue Jan 19, 2024 · 12 comments
Open

Improve error message for function application on a datatype #442

chtenb opened this issue Jan 19, 2024 · 12 comments

Comments

@chtenb
Copy link
Contributor

chtenb commented Jan 19, 2024

Minimal (Contrived) Reproduction

ref struct model
  value : int

fun bar(model : model) : model
  // model parameter passed in for other reasons
  // Now constructor cannot be called
  model(1)

Error message

repro.kk(10, 9): error: types do not match
 context      :   model(1)
 term         :         1
 inferred type: int
 expected type: model

Expected behavior
I believe there is no ambiguity in the code above, so I think it should be able to compile.

@TimWhiting
Copy link
Collaborator

TimWhiting commented Jan 19, 2024

Constructors are PascalCase. Try Model(1). The type is lower case, but creates a constructor with PascalCase.

However, in general, the type inference does require lexically scoped variables to unambiguously resolve. This makes it easier to look at a variable and see what the type is by looking at it's definition or parameter type. It gets weird if you use a variable and it looks to have a defined type in the parameter's type signature, but the usage in the body doesn't match that expectation.

Implicit parameters participate in higher order overloading, and thus are allowed to be qualified. The syntax ?show is actually a @implicit/show. So when you refer to show in the body, there is no full match in the lexical scope and it gets resolved as an overloaded identifier. Only implicits are allowed to have qualifiers though.

@chtenb
Copy link
Contributor Author

chtenb commented Jan 19, 2024

Whoops, looks like I simplified the repro one step too far. This is the correct example:

ref struct model
  value : int

fun model(b : bool) : model
  Model(if b then 1 else 0)

fun bar(model : model) : model
  // model parameter passed in for other reasons
  // Now constructor cannot be called
  model(True)

Error:

repro.kk(10, 9): error: types do not match
 context      :   model(True)
 term         :         True
 inferred type: bool
 expected type: model

@TimWhiting
Copy link
Collaborator

TimWhiting commented Jan 19, 2024

This is actually a perfect example why the current behavior is the correct one.

What you wrote looks exactly like you are trying to use the copy function of model:
https://koka-lang.github.io/koka/doc/book.html#sec-copying

Users would be highly confused when reading bar, if it wasn't using the copy function.

The recommended solution is to qualify your model function:

fun bool/model(b : bool) : model
  Model(if b then 1 else 0)

fun bar(model : model) : model
  // model parameter passed in for other reasons
  bool/model(True)

You could qualify it with new or any number of descriptive ways to make it clear that it creates a new model based off of a boolean.

My preferred solution though is to keep variable names short. i.e. m for your model parameter. Here is my reasoning: when reading a function you care more about understanding the functionality of the function. i.e. more about what it does than what it uses.

  1. You can already clearly see what it uses in the function signature and their types, no need to repeat long names throughout. Especially if it is a short function.
  2. Function names should be descriptive however so you understand functionality.
  3. If your function is large, you might lose track of a bunch of short names, in that case, I would use longer more descriptive names or split the function into smaller functions, preferring splitting it whenever reasonable.
  4. Additionally, this practice encourages comments in natural language to explain rather than cluttering up your code with long variable names (which have to be repeated multiple times throughout the body).

Again, this is just my opinion.

Regardless on your opinion about naming, I don't think this is a bug in the current implementation, at least not how Daan and I designed it. But I do recognize that it could use more documentation on the reasons why lexical names do follow regular shadowing principles, but implicit parameters and other identifiers do not.

@chtenb
Copy link
Contributor Author

chtenb commented Jan 19, 2024

To be honest, I didn't figure out from the error message it was the parameter name that was the problem, until I trimmed it al the way down to this minimal repro. Moreovier, previous versions of Koka were very lenient with resolving identifiers based on types, and you could seemingly overload an identifier in any number of ways without causing problems. These two reasons made it look like an unintended regression to me.

@TimWhiting
Copy link
Collaborator

TimWhiting commented Jan 19, 2024

I would actually say that Koka is more lenient with resolving identifiers based on types currently, based on what I know Daan did to support implicits. I don't think it ever supported overloading shadowed locals, and if it did then that was unintentional. Daan has personally told me that he does not want overloaded locals when it came up as we designed implicits because implicits had to be an exception to the rule. The error could probably be improved though like you said. In particular the context should also refer to where the identifier was defined, and I'm not sure why it thinks the expected type should be model.

@chtenb
Copy link
Contributor Author

chtenb commented Jan 20, 2024

The rules about shadowing make sense to me in isolation.
But the combination with existing naming conventions makes it akward in some cases.

  1. In larger functions, using single-letter variable names can hinder readability and clarity. This issue arises especially when multiple types share the same initial letter.
    In such cases, it seems intuitive to name a variable after its type, particularly when there's no unique characteristic that warrants a different name. In other languages and situations I've often seen people work around naming limitations by using names like mdl, the_model, model_, etc.
    Ideally, a somewhat elegant standard naming convention should be established for these situations.
  2. In some programming languages, this conflict is mitigated by distinct naming conventions for variables, types, and functions.
    For instance, in C#, functions and types usually begin with an uppercase letter, while local variables start with lowercase.
    In contrast, Koka employs lowercase for all these elements and typically names 'smart constructors'/'factory functions' after the type itself.

The combination of these two things makes it so that this particular naming conflict, i.e. local variables and functions sharing the same name, arises quite naturally.
I've been trying to come up with a naming convention for locals or factory functions as a solution to avoid this issue, but I don't have one so far.
I've currently resorted to naming my factory functions like model' until I find something better :)

@TimWhiting
Copy link
Collaborator

I'm curious why you need a factory function that is named the same as the data type? You can use default expressions on the data type definition if there is some initialization that needs to happen for the data members. If it is for initializing from a different type, or non-standard set of parameters, I would expect the locally qualified identifier solution to be more what you want than renaming to model', unless you are doing that for brevity.

@chtenb
Copy link
Contributor Author

chtenb commented Jan 20, 2024

That's fair. The following works fine and is readable too

ref struct model
  value : int

fun bool/model(b : bool) : model
  Model(if b then 1 else 0)

fun bar(model : model) : model
  // model parameter passed in for other reasons
  // Now constructor cannot be called
  bool/model(True)

You can use default expressions on the data type definition if there is some initialization that needs to happen for the data members.

What do you mean by that?

@TimWhiting
Copy link
Collaborator

TimWhiting commented Jan 20, 2024

https://koka-lang.github.io/koka/doc/book.html#sec-structs

The example shows initializing realname with name, but it is just any arbitrary expression (provided it doesn't use effects I think? Not sure about that)

@chtenb
Copy link
Contributor Author

chtenb commented Jan 20, 2024

Ah, I had completely overlooked that detail (the .. = name part) when reading it. Interesting, that does cover a lot of basic usecases of factory functions indeed.

@daanx
Copy link
Member

daanx commented Jan 21, 2024

I just skimmed the discussion so apologies if I am rehashing :-), but:

module foo

ref struct model
  value : int

fun model(b : bool) : model
  Model(if b then 1 else 0)

fun bar(model : model) : model
  // model parameter passed in for other reasons
  // this fails
  model(True)

This is as intended; if we would resolve model taking types and globals into account we would violate the regular scoping rules where we would reach over the model parameter into the global scope. I believe we want an error here in particular if something in the global scope could accidentally make this check.

The solution is indeed to qualify explicitly, either as foo/model or foo/Model for the global scope, or renaming the local parameter to not clash, or locally qualifying the global model factory function. As remarked above, we can also give default values to fields (either in a struct or constructor), like:

struct model 
  value : int = 0

and use Model(1) or Model() to construct them.

Finally, with implicit parameters, the local name is locally qualified ,where an implicit parameter ?show is really @implicit/show. So, when we write show inside such function, the compiler can consider global definitions as well since no local has the exact name -- we need to write ?show to be explicitly referring to the local. Hope this helps. Thanks.

btw. This is still experimental so it may still need tuning or changing rules so any examples that help evaluate the design are very welcome.

@daanx daanx closed this as completed Jan 21, 2024
@TimWhiting
Copy link
Collaborator

@daanx I was leaving this open, particularly because the error message that @chtenb ran into is particularly unintuitive.

@TimWhiting TimWhiting reopened this Jan 21, 2024
@TimWhiting TimWhiting changed the title Parameter name prevents ability to call function with same name Improve error message for type errors for function application on a datatype Jan 26, 2024
@TimWhiting TimWhiting changed the title Improve error message for type errors for function application on a datatype Improve error message for function application on a datatype Jan 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants