-
Notifications
You must be signed in to change notification settings - Fork 21.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JIT] linear scan memory planning strategy #64348
Conversation
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. [ghstack-poisoned]
🔗 Helpful links
💊 CI failures summary and remediationsAs of commit 6d69dcd (more details on the Dr. CI page):
🕵️ 2 new failures recognized by patternsThe following CI failures do not appear to be due to upstream breakages: pytorch_linux_xenial_py3_6_gcc5_4_test (1/2)Step: "Test" (full log | diagnosis details | 🔁 rerun)
|
Job | Step | Action |
---|---|---|
Lint / clang-format | Run clang-format | 🔁 rerun |
This comment was automatically generated by Dr. CI (expand for details).
Please report bugs/suggestions to the (internal) Dr. CI Users group.
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. [ghstack-poisoned]
@makslevental has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. Differential Revision: [D30769099](https://our.internmc.facebook.com/intern/diff/D30769099) [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. Differential Revision: [D30769099](https://our.internmc.facebook.com/intern/diff/D30769099) [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. Differential Revision: [D30769099](https://our.internmc.facebook.com/intern/diff/D30769099) [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. Differential Revision: [D30769099](https://our.internmc.facebook.com/intern/diff/D30769099) [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. Differential Revision: [D30769099](https://our.internmc.facebook.com/intern/diff/D30769099) [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. Differential Revision: [D30769099](https://our.internmc.facebook.com/intern/diff/D30769099) [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. Differential Revision: [D30769099](https://our.internmc.facebook.com/intern/diff/D30769099) [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. Differential Revision: [D30769099](https://our.internmc.facebook.com/intern/diff/D30769099) [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. Differential Revision: [D30769099](https://our.internmc.facebook.com/intern/diff/D30769099) [ghstack-poisoned]
Linear scan (a heuristic based on https://www.usenix.org/legacy/events/vee05/full_papers/p132-wimmer.pdf) memory planning strategy. Differential Revision: [D30769099](https://our.internmc.facebook.com/intern/diff/D30769099) [ghstack-poisoned]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please take a look at #64348 (comment) before landing this?
// find the "right" region; in order of preference: | ||
// 1. tightest fit free region i.e. smallest i.e. first match since | ||
// avail_regions is sorted by size | ||
// 2. swap with latest ending active live range that is big enough (spilling |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please elaborate how does 2. help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i mean it probably doesn't if the data is indicative but this is the heuristic from the original linear scan paper
http://web.cs.ucla.edu/~palsberg/course/cs132/linearscan.pdf bottom page 5
namespace jit { | ||
|
||
// join regions that are adjacent and free | ||
void coalesce_avail(std::multiset<MemRegion, regionSizeCmp>& avail_regions) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's make this helper static, so it stays private to this file.
SortedLiveRangeMap<MemRegion> allocated_ranges; | ||
|
||
size_t curr_end_offset = 0; | ||
auto allocate_inactive_ranges = [&](UniqueLiveRange curr_range) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if refactoring this helper out of the mainline code would add to the readability?
a) we would immediately see which structures the helper updates, uses.
b) there would be less noise in the mainline code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. will do
continue; | ||
} | ||
|
||
TORCH_INTERNAL_ASSERT( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lmao nuclear shelter proof 😸
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the wrong thing? should i use something else?
size_t candidate_size = candidate_reg.size; | ||
active[curr_range] = {candidate_offset, aligned_curr_size}; | ||
// split region (potentially) | ||
if (candidate_size > aligned_curr_size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like duplication of the logic above? Maybe we should refactor it into a helper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's true but it's mostly duplication of unpacking structs and such. only duplicated logic is the check which is more intelligible in context i think. not sure - could go either way (we should flip a coin)
curr_end_offset += aligned_curr_size; | ||
} | ||
|
||
// expire any remaining intervals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this comment is very helpful, but I struggle to come up with something better. The only side effect you care about here is moving remaining intervals to allocated_ranges
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you meant *not very helpful? agreed. will add a couple more words
std::vector<MemAllocation> allocations; | ||
allocations.reserve(allocated_ranges.size()); | ||
for (const auto& item : allocated_ranges) { | ||
allocations.push_back({item.first, item.second}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
err this looks fishy... you already reserved enough elements for allocated_ranges
but then you are pushing back those on top reserved ones?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
discussed offline
namespace torch { | ||
namespace jit { | ||
|
||
using EndSortedLiveRangeMap = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's only used in the cpp
file, why don't we move it there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i can't remember if it's used in any of the other diffs but it's a vaguely useful typedef? not sure. another coin flip.
{448, 448, TTP(((Vec{10, 10})), ((Vec{10, 1})))}, | ||
{448, 896, TTP(((Vec{10, 10})), ((Vec{10, 1})))}, | ||
}; | ||
testSmall(expected_storage, expected_allocs, Strategy::LINEAR_SCAN); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take it all strategies are tested with only these two?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct. longer/more complicated models would've inflated the verification spec data too much i think
Linear scan (a heuristic based on http://web.cs.ucla.edu/~palsberg/course/cs132/linearscan.pdf) memory planning strategy. Differential Revision: [D30769099](https://our.internmc.facebook.com/intern/diff/D30769099) [ghstack-poisoned]
CI Flow Status⚛️ CI FlowRuleset - Version:
You can add a comment to the PR and tag @pytorchbot with the following commands: # ciflow rerun, "ciflow/default" will always be added automatically
@pytorchbot ciflow rerun
# ciflow rerun with additional labels "-l <ciflow/label_name>", which is equivalent to adding these labels manually and trigger the rerun
@pytorchbot ciflow rerun -l ciflow/scheduled -l ciflow/slow For more information, please take a look at the CI Flow Wiki. |
@makslevental has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
Hi @makslevental! Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention. You currently have a record in our system, but the CLA is no longer valid, and will need to be resubmitted. ProcessIn order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks! |
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks! |
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as |
Stack from ghstack:
Linear scan (a heuristic based on http://web.cs.ucla.edu/~palsberg/course/cs132/linearscan.pdf) memory planning strategy.
Differential Revision: D30769099