(Closes #2499) scalarization transformation implementation #2563

Open
wants to merge 9 commits into
base: master
3 changes: 3 additions & 0 deletions src/psyclone/psyir/transformations/__init__.py
@@ -103,6 +103,8 @@
    ReplaceInductionVariablesTrans
from psyclone.psyir.transformations.reference2arrayrange_trans import \
    Reference2ArrayRangeTrans
from psyclone.psyir.transformations.scalarization_trans import \
    ScalarizationTrans


# For AutoAPI documentation generation
@@ -143,4 +145,5 @@
'Reference2ArrayRangeTrans',
'RegionTrans',
'ReplaceInductionVariablesTrans',
'ScalarizationTrans',
'TransformationError']
212 changes: 212 additions & 0 deletions src/psyclone/psyir/transformations/scalarization_trans.py
@@ -0,0 +1,212 @@
# -----------------------------------------------------------------------------
# BSD 3-Clause License
#
# Copyright (c) 2017-2024, Science and Technology Facilities Council.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# * Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# * Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# * Neither the name of the copyright holder nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
# FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
# COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
# INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
# -----------------------------------------------------------------------------
# Author: A. B. G. Chalk, STFC Daresbury Lab

'''This module provides the scalarization transformation class.'''

import itertools

from psyclone.core import VariablesAccessInfo
from psyclone.psyGen import Kern
from psyclone.psyir.nodes import Assignment, Call, CodeBlock, IfBlock, \
    Reference, Routine
from psyclone.psyir.symbols import DataSymbol
from psyclone.psyir.transformations.loop_trans import LoopTrans


class ScalarizationTrans(LoopTrans):
Collaborator

When you add the docstring, can you move what you wrote in apply here and add a simple example, e.g. like the one in src/psyclone/psyir/transformations/hoist_trans.py? Then we just need to list the autoclass under the available transformations section of doc/user_guide/transformations.rst, and it can be tested with python -m doctest src/psyclone/psyir/transformations/scalarization_trans.py. The apply docstring can then be a one-liner.


    def _find_potential_scalarizable_array_symbols(self, node, var_accesses):
Collaborator

I found this method hard to understand until I read the apply method. I think its name is too generic and it does too many things, so I suggest breaking it up (see name suggestions below).

I also notice that they all iterate over the potential var_accesses and select/filter the ones that pass the conditions. Could we use Python's built-in filter? If apply looked something like this:

var_accesses = VariablesAccessInfo(nodes=node.loop_body)
var_accesses = filter(self._is_local_array, var_accesses)
var_accesses = filter(self._have_same_unmodified_index, var_accesses)
var_accesses = filter(self._first_access_is_write, var_accesses)
var_accesses = filter(self._not_used_after_loop, var_accesses)

It would be very readable (so much so that I think it wouldn't need inline comments) and it would make each method simpler (no output array, no iteration, ...). Also, the methods don't use self or node (except the last one, but see the next comment), so should they be static methods or even just inner functions of apply?

Could you explore whether this would work?

Collaborator Author

It's doable, I think. I've moved them to be static methods (I don't like inner functions). They do still need var_accesses (and node, in the final function's case), so there is a lambda in the filter calls, but it seems to work. I will need to rewrite the tests for this and maybe expand the apply testing.
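For readers following the thread, here is a minimal sketch of the structure being discussed: static predicates that each judge one candidate, chained with filter, with a lambda supplying the shared context. It uses toy stand-ins rather than PSyclone classes. `ToyAccesses`, `ToyScalarization`, and every field and method name below are hypothetical and are not taken from the PR.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyAccesses:
    """Hypothetical summary of how one array is used inside a loop."""
    is_local_array: bool
    indices: List[Tuple[str, ...]]  # index expression of each access
    first_access_is_write: bool
    read_after_loop: bool


class ToyScalarization:
    """Toy class: each predicate inspects one name and returns a bool."""

    @staticmethod
    def _is_local_array(name, accesses):
        return accesses[name].is_local_array

    @staticmethod
    def _have_same_index(name, accesses):
        # Every access must use exactly the same index expression.
        return len(set(accesses[name].indices)) == 1

    @staticmethod
    def _first_access_is_write(name, accesses):
        return accesses[name].first_access_is_write

    @staticmethod
    def _not_used_after_loop(name, accesses):
        return not accesses[name].read_after_loop

    @classmethod
    def candidates(cls, accesses):
        names = accesses.keys()
        # Each lambda binds the shared context, so filter() only ever
        # sees a one-argument predicate.
        names = filter(lambda n: cls._is_local_array(n, accesses), names)
        names = filter(lambda n: cls._have_same_index(n, accesses), names)
        names = filter(
            lambda n: cls._first_access_is_write(n, accesses), names)
        names = filter(
            lambda n: cls._not_used_after_loop(n, accesses), names)
        return list(names)


accesses = {
    "tmp": ToyAccesses(True, [("i",), ("i",)], True, False),
    "out": ToyAccesses(False, [("i",)], True, True),
    "work": ToyAccesses(True, [("i",), ("i+1",)], True, False),
}
print(ToyScalarization.candidates(accesses))  # ['tmp']
```

Because filter is lazy, nothing is evaluated until the final list() call, and each candidate stops at the first predicate it fails.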


        potential_arrays = []
        signatures = var_accesses.all_signatures
        for signature in signatures:
            # Skip over non-arrays
            if not var_accesses[signature].is_array():
                continue
            # Skip over non-local symbols
            base_symbol = var_accesses[signature].all_accesses[0].node.symbol
            if not base_symbol.is_automatic:
                continue
            array_indices = None
            scalarizable = True
            for access in var_accesses[signature].all_accesses:
                if array_indices is None:
                    array_indices = access.component_indices
                # For some reason using == on the component_lists doesn't work
                elif array_indices[:] != access.component_indices[:]:
                    scalarizable = False
                    break
            # For each index, we need to check they're not written to in
            # the loop.
            flattened_indices = list(itertools.chain.from_iterable(
                    array_indices))
            for index in flattened_indices:
                sig, _ = index.get_signature_and_indices()
                if var_accesses[sig].is_written():
                    scalarizable = False
                    break
            if scalarizable:
                potential_arrays.append(signature)

        return potential_arrays

    def _check_first_access_is_write(self, node, var_accesses, potentials):
        potential_arrays = []

        for signature in potentials:
            if var_accesses[signature].is_written_first():
                potential_arrays.append(signature)

        return potential_arrays

    def _check_valid_following_access(self, node, var_accesses, potentials):
        potential_arrays = []

        for signature in potentials:
            # Find the last access of each signature
            last_access = var_accesses[signature].all_accesses[-1].node
            # Find the next access to this symbol
            next_access = last_access.next_access()
            # If we don't use this again then it's valid
            if next_access is None:
                potential_arrays.append(signature)
                continue
Collaborator

I was expecting the check for next accesses to be only this. Why do we need the rest? Is it because the next_access implementation is currently missing these cases? E.g. if we use next_access in other places, such as when deciding where it is safe to place a halo_exchange, wouldn't we need the same logic as below? In that case, shouldn't all of this be inside the next_access implementation?

Also, I see that next_access only returns one reference. Was this the right choice, given that there is potentially more than one?

a = 3
if condition:
    a = 1
else:
    a = 1

Also, the a in the inner loop has 2 potential next accesses:

do i=1, 10
     a = 1
     do j=1, 10
          a = 2
     enddo
enddo
a = 3

Collaborator Author

I deliberately kept the next_access function simple and tried to avoid complicated cases like this (relying on the dependency analysis tooling instead), and just assumed these are ok.

I can try to make a new issue and PR to improve that functionality if you think it's necessary/useful. I feel like there can always be an incredibly complex structure of nested loops that makes this difficult.

Collaborator

But wouldn't this complex logic be needed everywhere that we want to prove whether or not a symbol is going to be accessed next, not just for this transformation?

Collaborator

I deliberately kept the next_access function simple and tried to avoid the complicated cases (and rely on the dependency analysis tooling)

I am not sure whether it's next_access or the VariablesAccessInfo tooling itself that needs to provide this.

@hiker and @JulienRemy I am trying to understand this. Is this related to the logic you needed when building the DAG? That is, VariablesAccessInfo returns ALL following dependencies under a node (but without looking upwards, as we need in my loop snippet above), while next_access returns just one, in code order?

But what we need (and maybe what next_access should be) is a method that returns any "directly reachable" accesses to the same symbol, which could be more than one, and would be equivalent to the arrows leaving a node in the DAG?

Collaborator Author

I think this is a separate issue (though I'm not sure about the loop case; we might need to fail gracefully in some cases, and I'll try to remember to put a test in for that).

We otherwise ignore the if case here (which could be improved if next_access gave all options, either through the access info or otherwise) and return False instead of trying to work it out.

Collaborator Author

@sergisiso I think I could probably do most of this for next_access, but I'm not sure whether there is an easy way to tell if a node is inside an else if, an else, or an if. For example, take a case like this (this is how we represent if/else if behaviour in PSyclone):

a = 1
if cond1 then
  a = 2
else
  if cond2 then
    a = 3
    a = 6
  else
    if cond3 then
      a = 4
    end if
  end if
end if
a = 5

We would currently find the a=2 case, and the proposal is that we also need to find the a=3, a=4, and a=5 cases. The trick is knowing that a=3 and a=4 are inside an else if case, so that we continue looking for more "next accesses" that aren't inside the same branch of the if/else structure. We also need different behaviour for else if vs else (and all cases are complicated by the possibility that one branch doesn't contain a next access).
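To make the "directly reachable next accesses" idea from this thread concrete, here is a rough sketch on a toy statement tree. It is not PSyIR and not a proposed implementation of next_access: `Assign`, `If`, `Loop`, `first_accesses`, and `next_accesses` are all hypothetical names, every assignment counts as an access (reads and writes are not distinguished), and calls, code blocks, and array indices are ignored. Nested else/if blocks are modelled the same way as in the example above, as an `If` inside an else body.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Assign:
    """Toy statement `target = ...`; `label` is only used for reporting."""
    target: str
    label: str


@dataclass(eq=False)
class If:
    then_body: List = field(default_factory=list)
    else_body: List = field(default_factory=list)


@dataclass(eq=False)
class Loop:
    body: List = field(default_factory=list)


def first_accesses(stmts, name):
    """Scan `stmts` from the top. Return (hits, falls_through): the first
    access to `name` on each path, and whether some path reaches the end
    of `stmts` without touching `name`."""
    hits = []
    for stmt in stmts:
        if isinstance(stmt, Assign):
            if stmt.target == name:
                return hits + [stmt], False
        elif isinstance(stmt, If):
            then_hits, then_open = first_accesses(stmt.then_body, name)
            else_hits, else_open = first_accesses(stmt.else_body, name)
            hits += then_hits + else_hits
            if not (then_open or else_open):
                return hits, False
        elif isinstance(stmt, Loop):
            # The loop may run zero times, so control can fall through.
            hits += first_accesses(stmt.body, name)[0]
    return hits, True


def _after(stmts, target, name):
    """Return (found, hits, falls_through) for the code after `target`."""
    for pos, stmt in enumerate(stmts):
        if stmt is target:
            return (True,) + first_accesses(stmts[pos + 1:], name)
        if isinstance(stmt, Assign):
            continue
        if isinstance(stmt, Loop):
            bodies = [stmt.body]
        else:
            bodies = [stmt.then_body, stmt.else_body]
        for body in bodies:
            found, hits, is_open = _after(body, target, name)
            if not found:
                continue
            if is_open and isinstance(stmt, Loop):
                # Back edge: the next iteration restarts at the body's top.
                hits = hits + first_accesses(stmt.body, name)[0]
            if is_open:
                # Control leaves this block: keep scanning its siblings.
                more, is_open = first_accesses(stmts[pos + 1:], name)
                hits = hits + more
            return True, hits, is_open
    return False, [], True


def next_accesses(program, target):
    """Labels of every other access to the same name reachable next."""
    _, hits, _ = _after(program, target, target.target)
    unique = dict.fromkeys(h for h in hits if h is not target)
    return [h.label for h in unique]


# The nested if/else example from this comment.
start = Assign("a", "a = 1")
if_case = [
    start,
    If([Assign("a", "a = 2")],
       [If([Assign("a", "a = 3"), Assign("a", "a = 6")],
           [If([Assign("a", "a = 4")])])]),
    Assign("a", "a = 5"),
]
print(next_accesses(if_case, start))

# The nested loop example from earlier in the thread.
inner = Assign("a", "a = 2")
loop_case = [Loop([Assign("a", "a = 1"), Loop([inner])]),
             Assign("a", "a = 3")]
print(next_accesses(loop_case, inner))
```

In this toy model the first call reports a = 2, a = 3, a = 4, and a = 5 (a = 6 is hidden behind a = 3), and the second reports a = 1 and a = 3. The sketch tracks whether any path "falls through" a block without an access; that flag is what decides whether the search has to continue past the enclosing if or loop.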

            # If we do and the next_access has an ancestor IfBlock
            # that isn't an ancestor of the loop then it's not valid since
            # we aren't tracking down what the condition-dependent next
            # use really is.
            if_ancestor = next_access.ancestor(IfBlock)

            # If abs_position of if_ancestor is > node.abs_position
            # it's not an ancestor of us.
            if (if_ancestor is not None and
                    if_ancestor.abs_position > node.abs_position):
                # Not a valid next_access pattern.
                continue

            # If next access is the LHS of an assignment, we need to
            # check that it doesn't also appear on the RHS. If so it's
            # not a valid access
            # I'm not sure this code is reachable
            # if (isinstance(next_access.parent, Assignment) and
            #     next_access.parent.lhs is next_access and
            #     (next_access.next_access() is not None and
            #      next_access.next_access().ancestor(Assignment) is
            #      next_access.parent)):
            #     continue

            # If next access is the RHS of an assignment then we need to
            # skip it
            ancestor_assign = next_access.ancestor(Assignment)
            if (ancestor_assign is not None and
                    ancestor_assign.lhs is not next_access):
                continue

            # If it has an ancestor that is a CodeBlock or Call or Kern
            # then we can't guarantee anything, so we remove it.
            if (next_access.ancestor((CodeBlock, Call, Kern))
                    is not None):
                continue
Collaborator

I find it hard to see what each of these is checking (without going to the tests). Could you add a short code snippet showing the dependency that each one is looking for, including the commented-out one?

Collaborator Author

The commented-out one has been removed because it's not reachable, given how next_access and the variable access info report results for assignments.


            potential_arrays.append(signature)

        return potential_arrays

    def apply(self, node, options=None):
        '''Apply the scalarization transformation to a loop.
        All of the array accesses that are identified as being able to be
        scalarized will be transformed by this transformation.

        An array access will be scalarized if:
        1. All accesses to the array use the same indexing statement.
        2. All References contained in the indexing statement are not modified
           inside of the loop (loop variables are ok).
        3. The array symbol is either not accessed again or is written to
           as its next access. If the next access is inside a conditional
           that is not an ancestor of the input loop, then PSyclone will
           assume that we cannot scalarize that value instead of attempting to
           understand the control flow.
        4. TODO - The array symbol is a local variable.

        :param node: the supplied loop to apply scalarization to.
        :type node: :py:class:`psyclone.psyir.nodes.Loop`
        :param options: a dictionary with options for transformations.
        :type options: Optional[Dict[str, Any]]

        '''
        # For each array reference in the Loop:
        # Find every access to the same symbol in the loop
        # They all have to be accessed with the same index statement, and
        # that index needs to not be written to inside the loop body.
        # For each symbol that meets these criteria, we then need to check the
        # first access is a write
        # Then, for each symbol still meeting these criteria, we need to find
        # the next access outside of this loop. If it's inside an ifblock that
        # is not an ancestor of this loop then we refuse to scalarize for
        # simplicity. Otherwise if it's a read we can't scalarize safely.
        # If it's a write then this symbol can be scalarized.

        var_accesses = VariablesAccessInfo(nodes=node.loop_body)

        # Find all the arrays that are only accessed by a single index, and
        # that index is only read inside the loop.
        potential_targets = self._find_potential_scalarizable_array_symbols(
                node, var_accesses)

        # Now we need to check the first access is a write and remove those
        # that aren't.
        potential_targets = self._check_first_access_is_write(
                node, var_accesses, potential_targets)

        # Check the values written to these arrays are not used after this loop
        finalised_targets = self._check_valid_following_access(
                node, var_accesses, potential_targets)

        routine_table = node.ancestor(Routine).symbol_table
        # For each finalised target we can replace them with a scalarized
        # symbol
        for target in finalised_targets:
            target_accesses = var_accesses[target].all_accesses
            first_access = target_accesses[0].node
            symbol_type = first_access.symbol.datatype.datatype
            symbol_name = first_access.symbol.name
            scalar_symbol = routine_table.new_symbol(
                    root_name=f"{symbol_name}_scalar",
                    symbol_type=DataSymbol,
                    datatype=symbol_type)
            ref_to_copy = Reference(scalar_symbol)
            for access in target_accesses:
                node = access.node
                node.replace_with(ref_to_copy.copy())
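
As a hand-written illustration of the effect this transformation aims for, the sketch below shows the same loop before and after scalarization in plain Python. It is not PSyclone output and does not use PSyIR. The function names are hypothetical, and `tmp_scalar` only mirrors the `{symbol_name}_scalar` naming used in the code above.

```python
def loop_with_array_temp(a, b):
    """Before: `tmp` is a local array that is always indexed by `i`,
    written before it is read, and never used after the loop."""
    n = len(a)
    tmp = [0.0] * n
    out = [0.0] * n
    for i in range(n):
        tmp[i] = a[i] * b[i]      # first access in the loop is a write
        out[i] = tmp[i] + tmp[i]  # later reads use the same index
    return out


def loop_with_scalar_temp(a, b):
    """After: every `tmp[i]` has been replaced by one scalar."""
    n = len(a)
    out = [0.0] * n
    for i in range(n):
        tmp_scalar = a[i] * b[i]
        out[i] = tmp_scalar + tmp_scalar
    return out


a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
print(loop_with_array_temp(a, b) == loop_with_scalar_temp(a, b))  # True
```

The replacement is only safe because each value of `tmp` is produced and consumed within a single iteration, which is what the conditions listed in the apply docstring are meant to establish.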