(Closes #2499) scalarization transformation implementation #2563
base: master
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2563   +/-   ##
=======================================
  Coverage   99.86%   99.86%
=======================================
  Files         357      358     +1
  Lines       48332    48407    +75
=======================================
+ Hits        48266    48341    +75
  Misses         66       66
@hiker Quick question on how the ordering of statements goes with the dependency system - if you have an assignment such as |
@sergisiso this is ready for a first look. I'm waiting for hiker to clarify my question before I remove the code at lines 123 in the transformation, but I expect to remove those lines as I think they're unreachable.
Good initial implementation @LonelyCat124. I am just asking whether the code can be slightly refactored to be more readable, and I am wondering whether some of the logic should live inside the next_access methods. I am happy to discuss this one.
class ScalarizationTrans(LoopTrans):

    def _find_potential_scalarizable_array_symbols(self, node, var_accesses):
I found this method hard to understand until I read the apply method. I think its name is too generic and it does too many things, so I suggest breaking it up (see name suggestions below).
I also notice that the steps all iterate over the potential var_accesses and select/filter the ones that pass the conditions. Could we use Python's built-in filter functionality? If apply looked something like this:
var_accesses = VariablesAccessInfo(nodes=node.loop_body)
var_accesses = filter(self._is_local_array, var_accesses)
var_accesses = filter(self._have_same_unmodified_index, var_accesses)
var_accesses = filter(self._first_access_is_write, var_accesses)
var_accesses = filter(self._not_used_after_loop, var_accesses)
It would be very readable (so much so that I think it wouldn't need inline comments) and it would make each method simpler (no output array, no iteration, ...). Also, the methods don't use self or node (except the last one, but see the next comment), so should they be static methods or even just inner functions of apply?
Could you explore whether this would work?
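To make the suggestion concrete, here is a minimal, self-contained sketch of the chained-filter pattern. It does not use PSyclone: ToyAccess and its fields are hypothetical stand-ins for whatever VariablesAccessInfo provides, and the predicates are simplified versions of the names proposed above (the index check only tests that all accesses use the same index expression).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyAccess:
    """Hypothetical stand-in for the access information of one array."""
    name: str
    is_local: bool
    indices_per_access: tuple  # index expression used at each access
    first_access_is_write: bool
    used_after_loop: bool


def is_local_array(acc):
    return acc.is_local


def has_same_unmodified_index(acc):
    # Simplification: only checks that every access uses the same index.
    return len(set(acc.indices_per_access)) == 1


def first_access_is_write(acc):
    return acc.first_access_is_write


def not_used_after_loop(acc):
    return not acc.used_after_loop


def scalarizable(accesses):
    """Apply each predicate in turn. filter() is lazy, so nothing is
    evaluated until the list is built at the end."""
    candidates = iter(accesses)
    for predicate in (is_local_array, has_same_unmodified_index,
                      first_access_is_write, not_used_after_loop):
        candidates = filter(predicate, candidates)
    return [acc.name for acc in candidates]


EXAMPLE = [
    ToyAccess("tmp", True, ("i", "i"), True, False),   # passes every check
    ToyAccess("b", True, ("i", "i+1"), True, False),   # mixed indices
    ToyAccess("c", True, ("i",), False, False),        # first access is a read
    ToyAccess("d", True, ("i",), True, True),          # used after the loop
]
```

Each predicate takes one item and returns a bool, so none of them needs an output list or its own loop, which is the simplification being asked for.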
from psyclone.psyir.transformations.loop_trans import LoopTrans


class ScalarizationTrans(LoopTrans):
When you add the docstring, can you bring what you wrote in apply here and add a simple example, e.g. like src/psyclone/psyir/transformations/hoist_trans.py. So then we just need to list the autoclass under the available transformations section of doc/user_guide/transformations.rst and it can be tested with python -m doctest src/psyclone/psyir/transformations/scalarization_trans.py. The apply docstring can be a one-liner then.
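As an illustration of the layout being requested (a class-level docstring carrying a doctest example, with a one-line apply docstring), here is a small sketch. ToyScalarizer is a hypothetical toy that works on strings; it is not the PSyclone transformation and does not reflect its API.

```python
import doctest


class ToyScalarizer:
    """Toy stand-in for a transformation whose class docstring holds
    the description and a runnable example.

    For example:

    >>> trans = ToyScalarizer()
    >>> trans.apply(["tmp(i) = a(i)", "b(i) = tmp(i)"], "tmp(i)")
    ['tmp_scalar = a(i)', 'b(i) = tmp_scalar']

    """

    def apply(self, statements, array_ref):
        """Replace every occurrence of array_ref with a scalar name."""
        scalar = array_ref.split("(")[0] + "_scalar"
        return [stmt.replace(array_ref, scalar) for stmt in statements]


if __name__ == "__main__":
    # Similar in effect to running: python -m doctest <this file>
    print(doctest.testmod())
```

With this layout the example in the class docstring is what the documentation build picks up via autoclass, and the same text doubles as a test.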
# Find the last access of each signature
last_access = var_accesses[signature].all_accesses[-1].node
# Find the next access to this symbol
next_access = last_access.next_access()
# If we don't use this again then its valid
if next_access is None:
    potential_arrays.append(signature)
    continue
I was expecting the check for next accesses to be only this. Why do we need the rest? Is it because the next_access implementation is currently missing these cases? If we use next_access in other places, e.g. when deciding where it is safe to place a halo_exchange, wouldn't we need the same logic as below? In that case, shouldn't all of this be inside the next_access implementation?
Also, I see that next_access only returns one reference. Was this the right choice, given that there is potentially more than one?
a = 3
if condition:
    a = 1
else:
    a = 1
Also, the a in the inner loop has 2 potential next accesses:
do i=1, 10
  a = 1
  do j=1, 10
    a = 2
  enddo
enddo
a = 3
I deliberately kept the next_access function simple and tried to avoid complicated cases like this (relying on the dependency analysis tooling instead), and just assumed these are OK.
I can try to make a new issue and PR to improve that functionality if you think it's necessary/useful. I feel like there can always be an incredibly complex structure of nested loops that makes this difficult.
But wouldn't this complex logic be needed everywhere that we want to prove whether or not a symbol is going to be accessed next, not just in this transformation?
I deliberately kept the next_access function simple and tried to avoid the complicated cases (and rely on the dependency analysis tooling)
I am not sure if it's next_access or the VariablesAccessInfo tooling itself that needs to provide this.
@hiker and @JulienRemy, I am trying to understand this. Is this related to the logic you needed when building the DAG? That is, VariablesAccessInfo returns ALL following dependencies under a node (but does not look upwards, as we would need in my loop snippet above), while next_access returns just one, in code order?
But what we need (and maybe what next_access should be) is a method that returns any "directly reachable" accesses to the same symbol, which could be more than one, and would be equivalent to the arrows leaving a node in the DAG?
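One way to picture "directly reachable" accesses is as a search over a control-flow graph that stops along each path at the first node touching the symbol. The sketch below is only a toy model under that assumption: the graph is a hand-written dictionary for the if/else snippet earlier in this thread, and directly_reachable_accesses is a hypothetical helper, not PSyclone's next_access.

```python
from collections import deque


def directly_reachable_accesses(successors, accesses, start):
    """Return the nodes accessing the symbol that can be reached from
    `start` without passing through another access first."""
    found = []
    seen = {start}  # the start node itself is never reported
    queue = deque(successors.get(start, ()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        if node in accesses:
            # Stop here: this access hides anything that follows it.
            found.append(node)
        else:
            queue.extend(successors.get(node, ()))
    return found


# Hand-built control-flow graph for:
#   a = 3
#   if condition: a = 1
#   else:         a = 1
SUCCESSORS = {
    "a = 3": ["if condition"],
    "if condition": ["a = 1 (if branch)", "a = 1 (else branch)"],
    "a = 1 (if branch)": ["end"],
    "a = 1 (else branch)": ["end"],
}
ACCESSES = {"a = 3", "a = 1 (if branch)", "a = 1 (else branch)"}

NEXT = directly_reachable_accesses(SUCCESSORS, ACCESSES, "a = 3")
```

Here NEXT contains both branch assignments, which is the case where a single returned reference cannot describe what follows the first write.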
# If we do and the next_access has an ancestor IfBlock
# that isn't an ancestor of the loop then its not valid since
# we aren't tracking down what the condition-dependent next
# use really is.
if_ancestor = next_access.ancestor(IfBlock)

# If abs_position of if_ancestor is > node.abs_position
# its not an ancestor of us.
if (if_ancestor is not None and
        if_ancestor.abs_position > node.abs_position):
    # Not a valid next_access pattern.
    continue

# If next access is the LHS of an assignment, we need to
# check that it doesn't also appear on the RHS. If so its
# not a valid access
# I'm not sure this code is reachable
# if (isinstance(next_access.parent, Assignment) and
#     next_access.parent.lhs is next_access and
#     (next_access.next_access() is not None and
#      next_access.next_access().ancestor(Assignment) is
#      next_access.parent)):
#     continue

# If next access is the RHS of an assignment then we need to
# skip it
ancestor_assign = next_access.ancestor(Assignment)
if (ancestor_assign is not None and
        ancestor_assign.lhs is not next_access):
    continue

# If it has an ancestor that is a CodeBlock or Call or Kern
# then we can't guarantee anything, so we remove it.
if (next_access.ancestor((CodeBlock, Call, Kern))
        is not None):
    continue
I find it hard to see what each of these is checking (without going to the tests). Could you add a short code snippet showing the dependency that each one is looking for, including the commented-out one?
First implementation is here. I have unit tests for the private routines. I still need to test it properly with code examples that run the full transformation, but in theory it's implemented here for now.