Pull requests: google/XNNPACK
Pull requests list
Add weights packing which handles scales and biases also.
#6560
opened Jun 12, 2024 by copybara-service (bot)
Update test and benchmark generation for blockwise kr>2
#6557
opened Jun 12, 2024 by GregoryComer
AVX512 VNNI microkernels 14x16c8, 14x8c8, 28x16c4
#6554
opened Jun 12, 2024 by copybara-service (bot)
Add wrappers for the KleidiAI qp8-qc4w GEMM microkernels.
#6551
opened Jun 11, 2024 by copybara-service (bot)
Add copy_bias helper functions and use it to refactor packing.c.
#6550
opened Jun 11, 2024 by copybara-service (bot)
Enable AMX using pytorch/cpuinfo cpuinfo_has_x86_amx_int8()
#6543
opened Jun 10, 2024 by copybara-service (bot)
Add f32 rsum RVV implementation microkernels, tests and config changes for LMUL 2, 4 and 8
#6533
opened Jun 6, 2024 by KaustubhIMG
LLM decode benchmarks fill the cache with a predefined number of tokens before starting decoding.
#6531
opened Jun 6, 2024 by copybara-service (bot)
Add QS8_QC8W GEMM/IGEMM microkernels for Wasm Relaxed Unsigned and Signed …
#6505
opened May 30, 2024 by fanchenkong1
Avoid benchmark link errors in Bazel for platforms that don't have specializations for ~all kernels. Benchmarks should prefer to use the test/bench microkernels -- *not* the prod microkernels -- therefore they should only depend on test_mode dependencies for those that exist. Unfortunately, the Bazel build rules for benchmarks didn't really follow these guidelines, so new appropriate targets were added and depended on as appropriate.
#6490
opened May 28, 2024 by copybara-service (bot)
Extend the convert operator for the new qp8 packed per-row dynamic quantization.
#6479
opened May 27, 2024 by copybara-service (bot)
AVX2 qs8 rsum use vpmovsxbd to read bytes as ints
#6444
opened May 21, 2024 by copybara-service (bot)
Prototype to integrate SW optimizations for Arm® CPUs
#6436
opened May 17, 2024 by gmiodice
Use a better error bound for fp16 tests of the rsum microkernel.
#6431
opened May 16, 2024 by copybara-service (bot)
F32-RMINMAXSUM - add reduction sum to f32-rminmax
#6427
opened May 16, 2024 by copybara-service (bot)
Add a new x8-packq microkernel that packs and per-row dynamically quantizes fp32 to qp8.
#6424
opened May 15, 2024 by copybara-service (bot)