Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[POC] emit indexing code directly #4489

Draft
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

geohotstan
Copy link
Contributor

@geohotstan geohotstan commented May 9, 2024

There are a bunch of rough edges here I haven't ironed out, but here's a POC.
I wrote a bunch of hacks just to get it working.

I've tried doing symbolic index tensor with a new symbolic object, but I found myself needing a new UOps to deal with the loading for the render_ops
I saw how setitem was doing the store part with ShapeTracker, so I thought on a conceptual level the advanced indexing stuff should also be apart of the ShapeTracker.

The basic idea is just creating a non-mergable special View (IndexedVIew) that contains the Tensor indices which is then global_loaded and used to determine the offsets of the Buffer that holds the IndexedView when this Buffer is loaded/stored. The kernel code just indexes directly instead of arange masking.

Currently things that work:

  • Embedding
  • Advanced indexing
  • Multi index advanced indexing

Things that don't work:

  • Backward pass :)
  • Advanced setitem (bufs are not being loaded but kernel code looks correct)
  • More complex operations where advanced indexing has operations before it or after it (like basic indexing + advanced indexing mix)
  • Image dtype indexing? (the codepath looks wrong but I haven't tested)
  • And more, lol

There are a lot of garbage hacks in this currently
The main thing that works is the index() function in linearizer that correctly determines the offset given buffers used for indexing (doesn't work in all cases yet... bunch of hacks in Tensor getitem to get tests passing)
It's gonna take some more time for me to get this to a point where it's mergeable since I still don't fully get schedule and shapetracker. Or idk is this pursuable?


@dataclass(frozen=True)
class IndexedView(View):
# TODO: Any is LazyBuffer but circular import bleh
Copy link
Contributor Author

@geohotstan geohotstan May 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

speaking of leaky abstractions
how's shoving a LB into a View 😎

Copy link
Contributor

Changes

Name                              Lines    Diff    Tokens/Line    Diff
------------------------------  -------  ------  -------------  ------
tinygrad/tensor.py                  802      +3           21.6    -0.1
tinygrad/codegen/linearizer.py      376     +39           18.9    -0.5
tinygrad/shape/view.py              260     +23           17.8    -0.4
tinygrad/engine/schedule.py         247      +4           14.0    -0.0
tinygrad/lazy.py                    160      +1           19.0    +0.0
tinygrad/nn/__init__.py              97      +1           19.3    -0.1
tinygrad/shape/shapetracker.py       95      +2           18.6    +0.3


total lines changes: +73

@chenyuxyz
Copy link
Collaborator

does this support backward for setitem?

@geohotstan
Copy link
Contributor Author

geohotstan commented May 18, 2024

does this support backward for setitem?

I haven't tried yet.
But if this were to support backward, I think I'll need to implement new functions in function.py for getitem (and maybe setitem too?). Though afterwards all indexing-based ops can use direct indexing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants