[converter] implement torch's aten::scaled_dot_product_attention operator #265

Open
mjamroz opened this issue Oct 28, 2023 · 2 comments
Labels: enhancement (New feature or request), work/medium (work that can be done within 1 day)


mjamroz commented Oct 28, 2023

Is there any chance of implementing support for torch's aten::scaled_dot_product_attention?

According to https://pytorch.org/docs/master/generated/torch.nn.functional.scaled_dot_product_attention.html, it is equivalent to the following:

# Efficient implementation equivalent to the following:
scale_factor = 1 / math.sqrt(Q.size(-1)) if scale is None else scale
attn_mask = torch.ones(L, S, dtype=torch.bool).tril(diagonal=0) if is_causal else attn_mask
# A boolean mask marks positions to keep; turn it into an additive bias (0 = keep, -inf = drop)
attn_mask = torch.zeros_like(attn_mask, dtype=Q.dtype).masked_fill(~attn_mask, -float('inf')) if attn_mask.dtype==torch.bool else attn_mask
attn_weight = torch.softmax((Q @ K.transpose(-2, -1) * scale_factor) + attn_mask, dim=-1)
attn_weight = torch.dropout(attn_weight, dropout_p, train=True)
return attn_weight @ V
mjamroz changed the title from "implement torch's aten::scaled_dot_product_attention operator" to "[converter] implement torch's aten::scaled_dot_product_attention operator" on Oct 28, 2023
peterjc123 added the enhancement and work/medium labels on Oct 28, 2023
peterjc123 (Collaborator) commented

Until it is implemented, the easiest workaround is probably to replace the F.scaled_dot_product_attention calls in your model with the reference implementation quoted above.
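A minimal sketch of that workaround, assuming PyTorch is installed. The names `sdpa_reference` and `TinyAttention` are hypothetical and exist only for illustration. The function follows the formula from the PyTorch documentation quoted in the issue and uses only basic ops (matmul, softmax, masked_fill). It does not cover arguments added in later PyTorch releases.

```python
import math

import torch


def sdpa_reference(query, key, value, attn_mask=None, dropout_p=0.0,
                   is_causal=False, scale=None):
    """Plain-op stand-in for F.scaled_dot_product_attention (illustrative only)."""
    L, S = query.size(-2), key.size(-2)
    scale_factor = 1.0 / math.sqrt(query.size(-1)) if scale is None else scale

    # Additive bias: 0 where attention is allowed, -inf where it is masked out.
    attn_bias = torch.zeros(L, S, dtype=query.dtype, device=query.device)
    if is_causal:
        causal = torch.ones(L, S, dtype=torch.bool, device=query.device).tril(diagonal=0)
        attn_bias = attn_bias.masked_fill(~causal, float("-inf"))
    if attn_mask is not None:
        if attn_mask.dtype == torch.bool:
            # In a boolean mask, True means "take part in attention".
            mask_bias = torch.zeros_like(attn_mask, dtype=query.dtype)
            attn_bias = attn_bias + mask_bias.masked_fill(~attn_mask, float("-inf"))
        else:
            attn_bias = attn_bias + attn_mask

    attn_weight = torch.softmax(
        query @ key.transpose(-2, -1) * scale_factor + attn_bias, dim=-1
    )
    if dropout_p > 0.0:
        attn_weight = torch.dropout(attn_weight, dropout_p, train=True)
    return attn_weight @ value


class TinyAttention(torch.nn.Module):
    """Hypothetical module showing where the call gets swapped."""

    def forward(self, q, k, v):
        # was: F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return sdpa_reference(q, k, v, is_causal=True)
```

Because the replacement is built from ordinary tensor ops, a traced graph contains matmul, softmax and similar nodes instead of a single aten::scaled_dot_product_attention node. Whether the converter handles every one of those nodes is not confirmed by this thread.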

peterjc123 (Collaborator) commented

To support it, we first need to support the new TorchScript schema. We should also figure out a better way to reuse the conversion logic of the ops that are already supported.
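To see which schema the tracer records for this op, something like the following sketch can help. It assumes PyTorch 2.0 or newer and uses a hypothetical `Attn` module. It only inspects the TorchScript graph and does not touch the converter itself. The exact schema string depends on the installed PyTorch version.

```python
import torch
import torch.nn.functional as F


class Attn(torch.nn.Module):
    """Hypothetical module whose only job is to call the fused op."""

    def forward(self, q, k, v):
        return F.scaled_dot_product_attention(q, k, v)


q = k = v = torch.randn(1, 2, 4, 8)
traced = torch.jit.trace(Attn(), (q, k, v))

# Collect the schema of every node the tracer recorded for the fused op.
graph = traced.inlined_graph
schemas = [
    node.schema()
    for node in graph.nodes()
    if node.kind() == "aten::scaled_dot_product_attention"
]
for schema in schemas:
    print(schema)
```

The printed schema lists the argument names and types a converter would have to map onto already-supported ops.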
