Batched index update of sharded arrays without communication? #20883

inailuig · 2024-04-23T09:00:14Z

Assume I am given a 2-d array of data x and two 1-d index arrays i and j, each containing one column index per row of x, all sharded along the leading axis.
In each row of x, I would like to overwrite the element at column i with the element at column j.
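
To pin down the intended semantics, here is a minimal single-device reference in plain NumPy. It ignores sharding entirely; the function name and the demo arrays are hypothetical and only serve as an illustration.

```python
import numpy as np

def update_rows_reference(x, i, j):
    """For every row r, copy x[r, j[r]] into position (r, i[r])."""
    out = np.array(x, copy=True)  # leave the input untouched
    for r in range(out.shape[0]):
        out[r, i[r]] = x[r, j[r]]
    return out

# Hypothetical demo data: 2 rows, 3 columns
x_demo = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])
i_demo = np.array([0, 2])  # destination column in each row
j_demo = np.array([1, 0])  # source column in each row

print(update_rows_reference(x_demo, i_demo, j_demo))
# [[2. 2. 3.]
#  [4. 5. 4.]]
```

Each row only reads and writes its own elements, so in principle no data from other rows (or other shards) is needed.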

I implemented this using straightforward NumPy-style array indexing (f1 and f2).
When compiled, this performs an all-gather of the data and then indexes into it.

I was wondering if there is a good way to avoid the unnecessary collective communication without resorting to masks (f3) or shard_map (f4)?
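
For readers unfamiliar with the mask workaround (f3), the following is a rough batched sketch of the same idea in plain NumPy, written without vmap. The function name is hypothetical, and this is only meant to show why the approach needs nothing beyond elementwise comparisons and a per-row reduction.

```python
import numpy as np

def update_rows_masked(x, i, j):
    """Same update as x[r, i[r]] = x[r, j[r]], expressed with one-hot masks."""
    cols = np.arange(x.shape[1])
    mask_i = i[:, None] == cols  # True at the destination column of each row
    mask_j = j[:, None] == cols  # True at the source column of each row
    # Per-row reduction that picks out x[r, j[r]] without any indexing
    picked = np.sum(np.where(mask_j, x, 0), axis=1, keepdims=True)
    # Write the picked value at the destination, keep everything else
    return np.where(mask_i, picked, x)

# Hypothetical demo data
x_demo = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])
print(update_rows_masked(x_demo, np.array([0, 2]), np.array([1, 0])))
# [[2. 2. 3.]
#  [4. 5. 4.]]
```

The cost is that every row touches all of its columns, which is the overhead the question is trying to avoid.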

Example:

import os
os.environ["XLA_FLAGS"]="--xla_force_host_platform_device_count=2"

from functools import partial
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import PositionalSharding, Mesh, PartitionSpec as P
from jax.experimental.shard_map import shard_map

pos_sharding = PositionalSharding(jax.devices())
mesh = Mesh(jax.devices(), axis_names='i')

x = np.random.uniform(size=(jax.device_count() * 2, 3))
i = np.random.randint(0, x.shape[1], len(x))
j = np.random.randint(0, x.shape[1], len(x))
# place the inputs on the devices, sharded along the leading axis
x = jax.device_put(x, pos_sharding.reshape(-1, 1))
i = jax.device_put(i, pos_sharding)
j = jax.device_put(j, pos_sharding)

@jax.jit
@jax.vmap
def f1(x, i, j):
    return x.at[i].set(x[j])
    
@jax.jit
def f2(x, i, j):
    a = jnp.arange(len(x))
    return x.at[a, i].set(x[a, j])

# masks
@jax.jit
@jax.vmap
def f3(x, i, j):
    a = jnp.arange(len(x))
    maski = i == a
    maskj = j == a
    return jnp.where(maski, x @ maskj, x)

# shard_map
@jax.jit
@partial(shard_map, mesh=mesh, in_specs=(P('i'), P('i'), P('i')), out_specs=P('i'))
def f4(x, i, j):
    return f1(x, i, j)

collective_ops = ['all-reduce', 'collective-permute', 'all-gather', 'all-to-all', 'reduce-scatter']

def any_in(ops, txt):
    # True if any of the op names occurs in the compiled HLO text
    return any(op in txt for op in ops)

print(any_in(collective_ops, f1.lower(x, i, j).compile().as_text()))  # True
print(any_in(collective_ops, f2.lower(x, i, j).compile().as_text()))  # True
print(any_in(collective_ops, f3.lower(x, i, j).compile().as_text()))  # False
print(any_in(collective_ops, f4.lower(x, i, j).compile().as_text()))  # False
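
The boolean check above only says whether some collective is present. A small variant that reports which op names occur can make the compiled output easier to interpret. The helper name and the two sample strings below are hypothetical; the samples merely imitate HLO text so the snippet runs without any devices.

```python
COLLECTIVE_OPS = ("all-reduce", "collective-permute", "all-gather",
                  "all-to-all", "reduce-scatter")

def find_collectives(hlo_text, ops=COLLECTIVE_OPS):
    """Return the collective op names that occur as substrings of hlo_text."""
    return [op for op in ops if op in hlo_text]

# Hypothetical HLO-like lines, for demonstration only
sample_with = "%ag = f32[4,3] all-gather(f32[2,3] %p0), dimensions={0}"
sample_without = "%sel = f32[2,3] select(pred[2,3] %m, f32[2,3] %a, f32[2,3] %b)"

print(find_collectives(sample_with))     # ['all-gather']
print(find_collectives(sample_without))  # []
```

With the functions from the example, it would be called as find_collectives(f1.lower(x, i, j).compile().as_text()). Note that this is a plain substring match, so variants of an op whose names contain one of the listed names are reported under that name as well.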
