Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compile-time dependencies #162

Open
marcandre opened this issue Oct 17, 2022 · 8 comments
Open

Compile-time dependencies #162

marcandre opened this issue Oct 17, 2022 · 8 comments
Labels
help wanted Extra attention is needed performance (too slow) Usage of TypeCheck leading to unusably slow compile times or running time

Comments

@marcandre
Copy link
Contributor

marcandre commented Oct 17, 2022

The error in #160 made me suspect that type_check might introduce compile-time dependencies just about everywhere.

To double check this:

# empty_mix/example.ex
defmodule EmptyMix.Example do
  use TypeCheck, enable_runtime_checks: Mix.env() != :prod
  @type! t :: :ok
end

# empty_mix.ex
defmodule EmptyMix do
  defstruct [:name]
  use TypeCheck, enable_runtime_checks: Mix.env() != :prod

  @spec! hello(%{String.t() => any}) :: EmptyMix.Example.t()
  def hello(_struct), do: :ok
end

This compiles, but also introduces a compile-time dependency that should not be there:

$ mix xref graph
lib/empty_mix.ex
└── lib/empty_mix/example.ex (compile)
lib/empty_mix/example.ex

I understand that type_check needs to generate code at compile time to check that inputs and outputs corresponds to the specified specs, but hopefully there's way for that generation to not rely on the actual spec in EmptyMix.Example and only generate calls to that module. This would only introduce a runtime dependency on Example and not a compile-time one.

Compile-time dependencies can easily lead to many transitive dependencies and make it necessary to recompile most of a project whenever just about anything changes (see e.g. https://dashbit.co/blog/speeding-up-re-compilation-of-elixir-projects)

@dvic
Copy link
Contributor

dvic commented Oct 17, 2022

We had a similar problem with typed_ecto_schema. The fix there was to Macro.escape the unquote parts where types were inserted into the AST. The fix was merged in bamorim/typed_ecto_schema#18 (details are in bamorim/typed_ecto_schema#15).

@Qqwy
Copy link
Owner

Qqwy commented Oct 26, 2022

I am not sure whether this situation can be improved by the library.
Or at least not without reducing runtime performance.

It is correct that a module compile-time-depends on the module(s) defining the types they use, because the checks are inserted as local code snippets at compile-time as well. The checks are inserted as local code snippets, which allows optimization between the code in the checks and the code in the function bodies.
But as such, this is not an 'accidental' compile-time dependency, but really a required one.

Ways to minimize recompilation today

The easiest and most flexible approach is to move the type(s) you use in multiple places to a single module which effectively only contains those core types. Since the only reason for change of this module is when the types themselves change, modules depending on this module will not frequently need recompilation.

Alternatively you might get some mileage --but with clear trade-offs-- from using the type overrides (then you only depend on the module with the overrides rather than the original module, however you will need to keep the types between the two in sync manually) or the automatically-extracted types (similarily you won't depend on the original module any more, but you will not be able to use the more advanced features of TypeCheck).
As such, I recommend just extracting the types to a dedicated module over these approaches.

Future

Indeed it is at least theoretically possible to move the type checking code to the original module that defines the type and re-use it.
Advantage: a big increase in compile speed. Disadvantage: a non-neglegible runtime overhead, and less possibilities for compile-time optimization.

One stepping stone for allowing this feature is #132, which already is quite challenging on its own 😅.

It definitely is something that would be a great improvement to the library. But it also is a large amount of work, and as such not the first thing on the list of priorities right now. (First, the current open bugs and quirks in today's usage should be fixed, and error reporting should be made configurable.)

@Qqwy Qqwy added help wanted Extra attention is needed performance (too slow) Usage of TypeCheck leading to unusably slow compile times or running time labels Oct 26, 2022
@Qqwy Qqwy added this to the Some future version milestone Oct 26, 2022
@marcandre
Copy link
Contributor Author

marcandre commented Oct 26, 2022

It is correct that a module compile-time-depends on the module(s) defining the types they use, because the checks are inserted as local code snippets at compile-time as well. The checks are inserted as local code snippets, which allows optimization between the code in the checks and the code in the function bodies. But as such, this is not an 'accidental' compile-time dependency, but really a required one.

I am not sure I follow.

Excuse my ignorance, I don't know exactly how type_check manages to do what it does, but I assume that it somehow generates code like:

def check_output_signature_for_hello(result) do
  if conforms(EmptyMix.Example, :t, result), do: :ok, else: raise SomeError
  # or 
  if EmptyMix.Example.t_check?(result), do: :ok, else: raise SomeError
end

This code introduces a runtime dependency and not a compile-time dependency.

As long as type_check does not need to call functions of EmptyMix.Example when generating the code, but only wants to copy-paste it in the body of some functions definitions, it should be totally doable.

Ways to minimize recompilation today

The easiest and most flexible approach is to move the type(s) you use in multiple places to a single module which effectively only contains those core types. Since the only reason for change of this module is when the types themselves change, modules depending on this module will not frequently need recompilation.

I'm afraid this is not a viable solution for us. We've had to do this for behaviours which wasn't so bad, but that won't be necessary in the next version of Elixir.

It definitely is something that would be a great improvement to the library. But it also is a large amount of work, and as such not the first thing on the list of priorities right now. (First, the current open bugs and quirks in today's usage should be fixed, and error reporting should be made configurable.)

I understand. If you remember, please ping me if ever you get to it. I sadly won't be looking into using type_check until then as I consider this a showstopper. At least my bug reports will stop for now 😅.

@dvic
Copy link
Contributor

dvic commented Oct 27, 2022

I am not sure whether this situation can be improved by the library.
Or at least not without reducing runtime performance.

@Qqwy It depends, does the @type! or spec! macro eventually result in some call to a function (not macro, because that doesn't introduce compile-time deps) where the module is passed? If so, then this is 'easily' fixed by replace it with the AST of the module reference itself, in typed_ecto_schema this fix was something like:

image

I sadly won't be looking into using type_check until then as I consider this a showstopper.

@marcandre I thought this would also be a showstopper for us, but in practice the kind of compile-time dependencies introduced are not that bad, as in, in our case the referenced modules don't contain many (if any) runtime dependencies, and therefore we don't get any additional compile-connected dependencies, which are really bad.

@Qqwy
Copy link
Owner

Qqwy commented Oct 27, 2022

Excuse my ignorance, I don't know exactly how type_check manages to do what it does, but I assume that it somehow generates code like:

def check_output_signature_for_hello(result) do
  if conforms(EmptyMix.Example, :t, result), do: :ok, else: raise SomeError
  # or 
  if EmptyMix.Example.t_check?(result), do: :ok, else: raise SomeError
end

This code introduces a runtime dependency and not a compile-time dependency.

No worries!
Some info about this is included in the talk I gave at ElixirConf.EU but the video of that talk is not yet online.
The documentation could/should be improved to be more clear about this. 😊


Currently, the code that is generated, contains the check inline (And as such, it needs to call functions in the original module at compile-time). This is also true for conforms and co. (and the difference between them and the dynamic_conforms variants).

If you have, for instance, a type

@type! foo :: integer()

And (possibly in another module) a function with a spec:

@spec! compute_a_square(foo) :: foo
def compute_a_square(val) do
  val * val
end

Then at compile-time:

  1. We evaluate TheModuleThatDefinesTheType.foo() to obtain a datastructure representing the type. (This is why the compile-time dependency is necessary)
  2. This datastructure is passed to TypeCheck.Protocols.ToCheck.to_check. This returns the Elixir AST to check for this type.
  3. This AST is placed inline in your function. (When using @spec!, this is done by wrapping the original function using defoverridable.)

In other words, the code that is generated for this example function will become:

def compute_a_square(val) do
  case val do
    x when is_integer(x) ->
      result = val * val
      case result do
        y when is_integer(y) ->
          y
        _ ->
        raise TypeCheck.TypeError "compute_a_square did not return a value of type `foo` etc. etc. etc."
    _ ->
      raise TypeCheck.TypeError "compute_a_square expected first parameter to be of type `foo` etc. etc. etc."
end

(This is still slightly simplified. If you want to see the exact output, you can write use TypeCheck, debug: true and it will be logged at compile-time.)

The advantage here is that the BEAM is smart enough to see that e.g. we already check whether val is an integer before doing the multiplication, so instead of compiling lhs * rhs down to (the low-level bytecode equivalent of):

if is_integer(lhs) && is_integer(rhs) do
    low_level_integer_multiplication(lhs, rhs)
else
    raise ArithmeticError
end

it is compiled down to only

low_level_integer_multiplication(lhs, rhs)

directly.

@Qqwy
Copy link
Owner

Qqwy commented Oct 27, 2022

This trickery would not be necessary if Erlang were to do cross-module code inlining. But currently Erlang does not.

Moving to an approach where we do not inline the checks but instead generate code very similar in style to what you shared:

def compute_a_square(val) do
  RemoteModule.foo_check(val)
  result = val * val
  RemoteModule.foo_check(result)
  result
end

is something that is possible, but:

  • while improving compile-times, it has very clear runtime overhead.
  • it would require quite some changes to the guts of the library.

What might of course make sense, for instance, is to use the 'faster compile-time but slower runtime' variant for dev and test environments, and the 'slower compile-time but faster runtime' variant for production environments. (If checks are turned on at all for production, of course.)

I'm afraid this is not a viable solution for us. We've had to do this for behaviours which wasn't so bad, but that won't be necessary in the next version of Elixir.
[...]
I understand. If you remember, please ping me if ever you get to it. I sadly won't be looking into using type_check until then as I consider this a showstopper. At least my bug reports will stop for now sweat_smile.

Thanks a lot for sharing your concerns! Dialogue like this is very useful to improve the library. It should become something that is useful to everyone (or at least a large group) in the Elixir community.
Your bug reports are very welcome and helpful as well 😜 .
I totally understand that the current approach can be a show-stopper. If you have a large project, the compile-times can add up.

I will definitely ping you when we get around to implementing this feature! 💚

@marcandre
Copy link
Contributor Author

Thank you for the detailed explanation, makes it clear what is going on. ❤️

I sincerely do hope that the implementation will be changed to avoid this, even if the resulting code is slower. My guess is that the performance difference would be negligible, but I might be wrong; without measuring things directly it's very hard to know.

@dvic
Copy link
Contributor

dvic commented Oct 28, 2022

@Qqwy That indeed makes it very clear what's going on, thanks for that!

One possible approach (but it would probably require quite a major overhaul) is to use a separate mix compiler, like Domo does. Then these TheModuleThatDefinesTheType.foo() can be TheModuleThatDefinesTheType.TypeCheck.foo() and the compile-time dependency on TheModuleThatDefinesTheType is gone (for the mix compiler). Of course, you still have the dependency in the 'type_check' compiler but this does not involve compile-connected dependencies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed performance (too slow) Usage of TypeCheck leading to unusably slow compile times or running time
Projects
None yet
Development

No branches or pull requests

3 participants