Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do we want maps? #113

Open
timfel opened this issue Jan 11, 2016 · 11 comments
Open

Do we want maps? #113

timfel opened this issue Jan 11, 2016 · 11 comments

Comments

@timfel
Copy link
Member

timfel commented Jan 11, 2016

@krono and I were thinking about where to replace storage strategies (and maybe some shadows) by using maps. This issue is to track some discussion about what we want, where we need shadows.

where do we want storage strategies

  • arrays
  • dictionaries (?)
  • ... that's it
  • ... could we just move that into a "storage strategy node"?

where do we want shadows

class shadows

The kinds of attributes here change rarely, mostly new classes are created at runtime to replace old ones. So maybe we could get rid of class shadows and just use maps and transitions (with some special nodes? Like in topaz, were object flags are FlagNodes parallel to AttributeNodes) without generating more code.

The following things are stored, and could be represented with special map nodes:

  • instance_size
  • instance_kind
  • is_variable

The following two things can just live as attributes (with nice accessors):

  • superclass
  • methoddict

Do we need this? If so, it could also be a special map node

  • subclasses (?)

method dict shadow

Again, I'm not sure we need the shadow. Lookup could be optimized with special nodes when storing. We would search for the name (stored as symbol, elidable) and then return the index, and then lookup in the compiled methods array. one guard for the object map, one for the class map, then we have the symbol idx, then one pointer dereference to get the compiled method array, then a getarrayitem. Maybe we can leverage the fact that the compiled method array is never manipulated directly - rather, a new array is created and the new and old one are swapped using become.

  • array with compiled methods
  • indexed symbols

observee shadow

only for compiled method array, not required if we were to get rid of the method dict shadow

context part shadow / frame

These shadows double as the virtualizable frame objects. When ContextParts are loaded from the image, we could make them be special W_Objects (not generic W_PointersObject) that have a _shadow field. When they are dynamically created, only the shadow is created, and only a heap dump would call w_self(), and there we can decide what we create (instead of W_PointersObject).

krono added a commit that referenced this issue Jan 11, 2016
@j4yk
Copy link
Contributor

j4yk commented Jan 17, 2016

Probably, I am missing a part of the concept, but why do we want maps at all? If I understand them correctly as motivated in the PyPy blog, they help when the slot schema of an object is variable at runtime, but in practice it rarely changes. In Smalltalk, there are no instVarNames in the bytecode, so a mapping from attribute names to storage indices as shown in the blog posts is not necessary. When a class is changed in a notable manner, are not all subInstances of that class become:-ed to fresh instances anyway? So the slot schema of an allocated Smalltalk object does never change. Correct me, if I am mistaken.

If I understood @krono correctly, we want to get rid of the dozens of guards when accessing storage. Assuming we want to retain unboxed storage for Collections which happen to only store SmallIntegers etc. how do maps help here?

@timfel
Copy link
Member Author

timfel commented Jan 17, 2016

My thought is that we have a different implementation type for arrays. the array class is in the special objects table, so we would know early on while loading the image when to create a W_Array (which uses strategies) and when to use W_PointersObject (which will use maps).

@j4yk
Copy link
Contributor

j4yk commented Jan 17, 2016

But when the size of an object cannot change and all instance variables are accessed by index, why do you want to use maps for ordinary W_PointersObjects? Which guards can be avoided or what is more efficient with maps than without? Will the SmallIntegerInstVarAccessor (currently misspelled) allow unboxed integers in instance variables?

@krono
Copy link
Member

krono commented Jan 17, 2016

That's the plan. What's changing here is not the number of fields but their runtime types.

@timfel
Copy link
Member Author

timfel commented Jan 17, 2016

@j4yk we will have direct pointer fields and direct integer fields on each object, and fill them according to need. any variables beyond what has space in the directly accessed fields will go into a generic storage array that does not use strategies (or maybe it could, but I don't think it's necessary). The reason for maps is, that the mapping of small-integer and pointer fields on the object to instance variable indices is not fixed, so maps should be able to help us to use the direct fields effectively.

@j4yk
Copy link
Contributor

j4yk commented Jan 17, 2016

That makes sense, thank you both. But we will have an additional memory overhead from these fields for small objects with less instance variables, right?

@timfel
Copy link
Member Author

timfel commented May 12, 2016

This might still be nice to have, but we've added mutable integers in ordinary pointers objects now, which gets rid a few allocations for a number of cases. Not as much as we'd like, but good enough, maybe.

@timfel
Copy link
Member Author

timfel commented Feb 22, 2017

There is a maps implementation in tim/maps. A map is simply a kind of strategy that is used for fixed size pointers objects. The maps branch also adds inline fields for pointers objects, that then are used in the maps. This is kind of the strategy that @smarr uses in SOM, but it seems BitBlt gets slower with this, because there are many lazy initializers... :(

It makes some benchmarks better, and others worse. The ones that get worse are mostly because of instability in the maps. There are lots of lazy initializers and BitBltSimulator is megamorphic ... so, meh, not sure if that should still be merged and improved :(

@smarr
Copy link
Contributor

smarr commented Feb 22, 2017

What exactly do you mean with instability? Because the lazy initializers run too late and then cause code to be invalidated? I remember @cfbolz talking about improving over a basic object layout strategy for PyPy? Perhaps that would be applicable?

And, did you perhaps look at the Truffle Object Model paper?

There the strategy is more aggressively geared towards accounting for such uses, and, I would assume, PyPy's strategy might also be more lenient in a sense then what I got in SOM.
In SOM, I have only one map/shape/layout per class, which means lazy initialization leads to code invalidation.

However, if you take for instance the approach of Truffle's object model, each allocation site has potentially a separate tree of shapes. Of course, that has other issues like higher polymorphism, which might need to be solved separately. And, I am not sure, but I think @cfbolz was working on guards for PyPy that can be more complex than a simple comparison, which might also be helping with avoiding separate compilation for compatible shapes/maps.

@timfel
Copy link
Member Author

timfel commented Feb 22, 2017

@smarr Thanks I hadn't read that paper.

The issue we're having comes down to higher Polymorphism, I think: especially in BitBlt, the values of 5 or 6 different fields in the argument to the primitive lead to completely different initializations of a subset of the 64 instance variables of the BitBltSimulator class - by the time we get to the inner loops that move bits to the screen, we're looking at about 20 different possible shapes here. The allocation site in every instance is the same, they just go down different paths later on. I will take a look at the Truffle paper, thanks for the pointer 👍

@smarr
Copy link
Contributor

smarr commented Feb 22, 2017

Hm, yeah, sounds like that's also a problem with what Truffle does.
That would mean you get 20 different shapes, indeed.

Are those 20 shapes truly different, or could your group them? Things like initializing first a and then b, or first b and then a, ideally leads to the same shape.

Perhaps it is time to just fix BitBltSimulator ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants