Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal Representation #16

Open
OvermindDL1 opened this issue Jul 5, 2017 · 7 comments
Open

Internal Representation #16

OvermindDL1 opened this issue Jul 5, 2017 · 7 comments

Comments

@OvermindDL1
Copy link

Right now the internal representation of prod/sum types are structs, however these are not as efficiently matched in the BEAM as raw tuples are (especially since OTP 20 made matching on an atom in the first position of a tuple significantly faster). Have you thought about having the internal representation of sum/prod types be records (I.E. tagged tuples), so something like:

defmodule Light do
  defsum do
    defdata Red    :: none()
    defdata Yellow :: none()
    defdata Green  :: none()
    defdata Custom do
      red   :: non_neg_integer()
      green :: non_neg_integer()
      blue  :: non_neg_integer()
    end
  end
end

Would become basically:

:Red
:Yellow
:Green
{:Custom, red, green, blue}

Or maybe even (slightly slower and more GC work but barely):

{:Red}
{:Yellow}
{:Green}
{:Custom, red, green, blue}

Or perhaps an option on the defsum/defdata to choose the internal representation?

A big bonus with this format is that you could then create guards, so you could have something like:

def myfunc(light) when Light.is(light), do: :blah

def is_specific_light(light) when Light.is(light, [Red, Yellow, Green]), do: true
def is_specific_light(light) when Light.is(light, Custom), do: false

And other such things? Currently it is not possible to do dynamic matching like that on a map/struct without controlling both the matcher and the guard (the best you can do is matching if it is a 'single' type, not one of a set of types, if you have the internal representation as a map/struct), since you can use elem/2 (and tuple_size/1) in a guard to test (make sure to test non-tuple representations first then test via elem as later guards will not run if elem is used when it is not a tuple, or it does not matter if all internal representations are as a tuple. The BEAM is tuned for speed on matching tuples, not maps, and that includes what can be tested in guards as well. :-)

@OvermindDL1
Copy link
Author

More detailed control would be useful in cases when each 'branch' of a sum is unique based on what it contains as well, like:

defmodule Mine do
  defsum do
    defdata Float  :: float()
    defdata Int    :: non_neg_integer()
    defdata String :: String.t
  end
end

Could just be the raw types of 6.28, 42, and "Hello world" and so forth without any kind of wrapper at all. Guards for testing things like Mine.is(Float, f) would be doable for many types (depending on what they are).

As well as for just plain defdata so defdata non_neg_integer() would just 'be' the integer with no wrapping at all and multiple keys in a data could be like a record where it just indexes into a tuple (although structs are nice for built-in elixir access though unless it were in a macro block you could control...).

If only Elixir had the functionality that my defguard library has, though I guess you could always integrate it and make a defex for an enhanced def or so?

@expede
Copy link
Member

expede commented Jul 5, 2017

{:Custom, red, green, blue}

The purpose of using structs is to enable protocols, and thus type classes for Witchcraft. I'm not aware of a way to implement a protocol on arbitrary tuples... is there a way that I've missed along the way?

@OvermindDL1
Copy link
Author

The purpose of using structs is to enable protocols, and thus type classes for Witchcraft. I'm not aware of a way to implement a protocol on arbitrary tuples... is there a way that I've missed along the way?

Not as Elixir does them now, protocols are very limited and have built-in very artificial limitations (WhyTF does it not allow for arbitrary match's as its implementation allows for?!?). You don't need to use protocols though, could build a new compiler plugin to make protocols actually 'correct' (and consequently you could make your type_class library significantly more powerful by being able to implement based on arbitrary match's instead of protocol's limitations ^.^).

But yeah, I don't get why protocols are limited to matching only on a base type or a map with a __struct__ key, seems very very arbitrary to me.

Hmm, I wonder if Elixir would accept a PR to allow for tuple-matching based on first element (as the BEAM is optimized for that case more so than almost any other, hmm...).

@OvermindDL1
Copy link
Author

I looked in Elixir's source, all it does for a protocol (when you defprotocol) is just make a module with some callbacks and such on it. When you defimpl a protocol it takes the named type you are defining and puts it after the protocol name, so implementing, say, Integer on a protocol puts .Integer after the protocol module name, thus:

iex(1)> defprotocol Blah do
...(1)>   def blah(blah)
...(1)> end
{:module, Blah,
 <<70, 79, 82, 49, 0, 0, 18, 60, 66, 69, 65, 77, 69, 120, 68, 99, 0, 0, 1, 132,
   131, 104, 2, 100, 0, 14, 101, 108, 105, 120, 105, 114, 95, 100, 111, 99, 115,
   95, 118, 49, 108, 0, 0, 0, 4, 104, 2, ...>>, {:__protocol__, 1}}
iex(2)> defimpl Blah, for: Integer do
...(2)>   def blah(blah), do: blah * 2
...(2)> end
{:module, Blah.Integer,
 <<70, 79, 82, 49, 0, 0, 6, 140, 66, 69, 65, 77, 69, 120, 68, 99, 0, 0, 0, 206,
   131, 104, 2, 100, 0, 14, 101, 108, 105, 120, 105, 114, 95, 100, 111, 99, 115,
   95, 118, 49, 108, 0, 0, 0, 4, 104, 2, ...>>, {:__impl__, 1}}

For a protocol Blah implementing Integer makes Blah.Integer.

And look, you can do it for unknown types with ease:

iex(3)> defimpl Blah, for: Bloop do
...(3)>   def blah(blah), do: "Bloop: #{inspect(blah)}"
...(3)> end
{:module, Blah.Bloop,
 <<70, 79, 82, 49, 0, 0, 7, 136, 66, 69, 65, 77, 69, 120, 68, 99, 0, 0, 0, 206,
   131, 104, 2, 100, 0, 14, 101, 108, 105, 120, 105, 114, 95, 100, 111, 99, 115,
   95, 118, 49, 108, 0, 0, 0, 4, 104, 2, ...>>, {:__impl__, 1}}

The reason why, based on looking prior in the code, it looks like it was originally built for records (tagged tuple) instead of maps (since maps did not exist when this was first created), and the relevant code is at: https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/protocol.ex#L459

Adding tuple matching would be as simple as adding this right after/before that struct set:

      Kernel.def impl_for(tagged_tuple) when :erlang.is_tuple(tagged_tuple) and :erlang.is_atom(:erlang.elem(tagged_tuple, 0)) do
        struct_impl_for(tagged_tuple)
end

In addition to probably another case clause somewhere for consolidation purposes.
Using struct_impl_for is the same implementation would work, otherwise one of those too.
But yeah, wonder if they'd take a PR then, hmm...

@expede
Copy link
Member

expede commented Jul 6, 2017

Yeah, I had played around with making changes similar to what you describe above in type_class, but I also need to get these libs out the door 😉 (They've been in development for a while). I do like the idea of having protocols look for structs first, and falling back to looking at the first field of a record. Could do it as another Hex lib if they don't take it for the standard lib 🤔

But yeah, I don't get why protocols are limited to matching only on a base type or a map with a struct key, seems very very arbitrary to me.

Agreed. The one nice thing about always using a map currently is that we get field polymorphism, but you're right that it's at the cost of some performance.

I'm not sure how records would play with polymorphic for: Tuple definitions, since Record.is_record {:a} == true and is_tuple({Foo, 1, 2}) == true. Maybe adding a __is_record__ function to the module and checking that the module implements it in a macro to avoid the runtime cost 🤔 A surmountable problem to be sure, but my goal right now is to get through to Witchcraft 1.0.0 (this pre-alpha state is frustrating a lot of people). If we upgrade TypeClass to handle records later, Witchcraft should stay fully compatible.

@OvermindDL1
Copy link
Author

I'm not sure how records would play with polymorphic for: Tuple definitions

I'd think that a generic for: Tuple implementation would only be used if no matching record first?

I looked more over protocols, the code seems very old, it has been in Elixir well before even macro's worked well, I think it could be done entirely in macro's now without a compiler plugin by making the defprotocol define just interface first (as a submodule), having the implementations require that, then making the main class compiled 'later' by calling the module definition low-level call directly, then you could optimize the calls a lot better, maybe even entirely removing the cross-modules call so the BEAM can perform inlining (the BEAM will never inline cross-modules), which could make protocols quite a bit faster than even the 'consolidated' version is now. I might play with that in a library sometime... Hmm...

If we upgrade TypeClass to handle records later, Witchcraft should stay fully compatible.

Very true yep, the internal implementation should not be rely'd on by users anyway (dialyzer @opaque it for sure).

@OvermindDL1
Copy link
Author

Yeah JoseValim basically will not add in record or tagged tuple support again because they cannot come up with a reliable implementation.

Sooo, I was curious and tried to make my own 'enhanced protocols', post is here: https://elixirforum.com/t/record-protocols/6655/8?u=overminddl1

In short, it is very easy, and mine works with arbitrary matchers and guards too not just some restricted base set of types (this could make your type_class library a lot more powerful). ^.^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants