[Perf] Vectorize more dtype for int4mm #126512

malfet · 2024-05-17T05:51:09Z

int4mm used to be vectorized only for f16, but there is no reason not to do the same for bf16 and f32.

Spiritual follow-up of #125290

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10
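
For readers unfamiliar with int4mm, here is a minimal scalar reference sketch of an int4 weight-only dot product. It is not the kernel touched by this PR: the packing order (low nibble first), the dequantization formula `(q - 8) * scale + zero`, and the names `unpack_int4` and `int4_dot` are all assumptions made for illustration. The point it tries to show is that the inner loop only needs the activations to be convertible to float, so nothing in it is specific to f16.

```python
def unpack_int4(packed):
    """Split each byte into two 4-bit values (assumed order: low nibble first)."""
    out = []
    for byte in packed:
        out.append(byte & 0x0F)
        out.append(byte >> 4)
    return out


def int4_dot(activations, packed_weights, scales, zeros, group_size):
    """Dot product of float activations with one row of int4 weights.

    Each group of `group_size` weights shares one scale and one zero point.
    Assumed dequantization: w = (q - 8) * scale + zero.
    """
    q = unpack_int4(packed_weights)
    assert len(q) == len(activations)
    acc = 0.0  # accumulate in full precision regardless of the input dtype
    for k, (a, qk) in enumerate(zip(activations, q)):
        g = k // group_size
        acc += float(a) * ((qk - 8) * scales[g] + zeros[g])
    return acc


# Four weights [9, 7, 8, 10], packed two per byte, in a single group of size 4.
packed = bytes([0x79, 0xA8])
result = int4_dot([1.0, 2.0, 3.0, 4.0], packed,
                  scales=[0.5], zeros=[0.0], group_size=4)
```

A vectorized kernel would process several `k` values per iteration, but the arithmetic per element stays the same for f16, bf16, and f32 activations.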

pytorch-bot · 2024-05-17T05:51:12Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/126512


✅ You can merge normally! (1 Unrelated Failure)

As of commit 5fff689 with merge base e3c5d1b:

BROKEN TRUNK - The following job failed but was also present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

Lint / lintrunner-noclang / linux-job (gh) (trunk failure)
>>> Lint for torch/onnx/_internal/onnx_proto_utils.py:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot · 2024-05-17T15:19:02Z

Warning: Unknown label ciflow/aarch64.
Currently recognized labels are

ciflow/binaries
ciflow/binaries_conda
ciflow/binaries_libtorch
ciflow/binaries_wheel
ciflow/inductor
ciflow/inductor-perf-compare
ciflow/inductor-micro-benchmark
ciflow/linux-aarch64
ciflow/mps
ciflow/nightly
ciflow/periodic
ciflow/rocm
ciflow/slow
ciflow/trunk
ciflow/unstable
ciflow/xpu
ciflow/torchbench

Please add the new label to .github/pytorch-probot.yml

malfet · 2024-05-17T16:31:21Z

@pytorchbot merge -f "Lint , Mac test and aarch64 builds are green"

pytorchmergebot · 2024-05-17T16:34:10Z

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

It used to be vectorized only for f16, but no reason not to do the same for bf16 or f32

Spiritual followup of pytorch#125290

Pull Request resolved: pytorch#126512
Approved by: https://github.com/Skylion007

[Perf] Vectorize more dtype for int4mm

5fff689

pytorch-bot bot added the "module: cpu" label (CPU specific problem, e.g., perf, algorithm) on May 17, 2024

malfet requested review from mingfeima, mikekgfb and jgong5 May 17, 2024 06:20

Skylion007 approved these changes May 17, 2024

malfet added the "topic: improvements" (topic category), "release notes: performance_as_product" (release notes category), "ciflow/trunk" (Trigger trunk jobs on your pull request), and "ciflow/aarch64" labels on May 17, 2024

malfet added the "ciflow/linux-aarch64" label (linux aarch64 CI workflow) on May 17, 2024

pytorchmergebot added the merging label May 17, 2024

pytorchmergebot closed this in 7e9a037 May 17, 2024

pytorchmergebot added Merged and removed merging labels May 17, 2024
