Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Double index lookup in tasklet breaks under simplification #1281

Open
gronerl opened this issue Jun 21, 2023 · 6 comments · May be fixed by #1406
Open

Double index lookup in tasklet breaks under simplification #1281

gronerl opened this issue Jun 21, 2023 · 6 comments · May be fixed by #1406

Comments

@gronerl
Copy link
Collaborator

gronerl commented Jun 21, 2023

The following small graph with a indirection (double index lookup) in a tasklet breaks under simplify passes:

import dace

sdfg = dace.SDFG('test_')
state = sdfg.add_state()

sdfg.add_array('A', shape=(10,), dtype=dace.float64)
sdfg.add_array('table', shape=(10, 2), dtype=dace.int64)
sdfg.add_array('B', shape=(10,), dtype=dace.float64)
sdfg.add_scalar('idx', dace.int64, transient=True)
idx_node = state.add_access('idx')
set_tlet = state.add_tasklet('set_idx', code="_idx=0", inputs={}, outputs={"_idx"})
state.add_mapped_tasklet('map', map_ranges={'i':"0:10"}, inputs={'inp': dace.Memlet("A[0:10]"),'_idx': dace.Memlet('idx[0]'), 'indices': dace.Memlet('table[0:10, 0:2]')},code="out = inp[indices[i,_idx]]", outputs={'out': dace.Memlet("B[i]")},  external_edges=True,
                         input_nodes={'idx':idx_node})

state.add_edge(set_tlet, '_idx', idx_node, None, dace.Memlet('idx[0]'))

Namely, the indirection gets moved to an edge (the subset reading "A[indices(i, 0)]") by simplification which is not a valid subset.

@BenWeber42
Copy link
Contributor

I looked into this issue and I believe this is due to the following behavior:

from dace import symbolic
symbolic.pystr_to_symbolic("a[i]").free_symbols
# gives {i}
symbolic.pystr_to_symbolic("indices[i,_idx]").free_symbols
# gives {_idx, i}

With the given tasklet code:

out = inp[indices[i,_idx]]

This means, that DaCe thinks the expression indices[i,_idx] is independent of indices. Because i and _idx are given earlier, DaCe pushes the expression indices[i,_idx] (incorrectly) into the memlet.

I think we want a behavior such that:

symbolic.pystr_to_symbolic("a[i]").free_symbols
# gives {i, a}

I am currently looking into how to best implement this. There is a risk that changing this could break other parts of the codebase...

@tbennun
Copy link
Collaborator

tbennun commented Sep 29, 2023

Since the issue only happens during simplification, and the only pass that creates such memlets is ScalarToSymbolPromotion, shouldn’t it be there? Specifically TaskletIndirectionPromoter should avoid promoting expressions that will make invalid symbolic expressions, e.g., containing an ast.Subscript in this case (but there may be other invalid cases?).

@BenWeber42
Copy link
Contributor

What is considered a valid symbolic expression? And what is considered a valid expression for a memlet?

I guess for memlets the expressions are constrained to dace.properties.SubsetProperty?
For symbolic expressions, it's determined by dace.symbolic.pystr_to_symbolic?

Because both seem to allow the double index currently. Or at least neither give a validation error for a double index expression.

@tbennun
Copy link
Collaborator

tbennun commented Oct 2, 2023

A symbolic expression is a combination of symbols and operations on those symbols. The operations are arithmetic or logical unary/binary/ternary operators on those symbols, or a set of predefined math functions in SymPy (i.e., ones that have a set of identities that can be used to simplify them) on those symbolic expressions. Array accesses and attribute queries are not considered symbolic expressions (as arrays/data containers are not symbols).

A memlet must have a single data container it's accessing and one or two subsets (which are symbolic expressions as per the above definition).

Inter-state edges extend this definition lightly (because they can access data containers), and they can also involve array accesses (defined as the array name becoming a function call, e.g., A(i, 8*j+2)) as well as attributes (Attr(A, 'x')). The reason we did that was to allow us to simplify expressions in the inter-state edges. I am not entirely sure this is necessary at all.


For the last point, if validation doesn't fail on bad expressions, it should.

@BenWeber42
Copy link
Contributor

I am currently testing this solution: #1406

@BenWeber42 BenWeber42 linked a pull request Oct 31, 2023 that will close this issue
@BenWeber42
Copy link
Contributor

Properly fixing this issue requires changes/improvements for querying symbols in symbolic expressions. However, that is currently not working well and would likely require substantial work (see #1417).

Additionally, validation is missing a check for this bug. I am tracking this here: #1418

Splitting these tasks allows us to focus on a practical work-around for now (likely #1406).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants