Falcon7b prefill remaining fixes and cleanup after enabling optimized version #8349

Open
1 of 8 tasks
s-jovic opened this issue May 10, 2024 · 2 comments

Comments

s-jovic commented May 10, 2024

  • Segfault for 1k and 2k prefill on multi-device (single run) and single device (when run in a loop): Seg fault in falcon7b prefil optimised attention #8644
  • Async mode (not all ops used in optimized prefill support async mode)
  • Make model initialization agnostic to sequence length
  • GS path - decide whether to use the optimized version for GS and, if so, make it work
  • Use an appropriate memory configuration (L1 sharded?)
  • Resolve LM head e2e perf impact
  • Push PCC to 0.99: Falcon7b prefill PCC below 0.99 #8487
  • Reduce memory usage of e2e tests - the 2k test uses up to 80 GB of RAM for some reason; try running the PyTorch and TT models sequentially (see the sketch after this list)
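
As a starting point for the last item, here is a minimal sketch of running the PyTorch reference and the TT model one after another instead of side by side, so only one of them is resident in host memory at a time. The loader and comparison callables are hypothetical stand-ins for the existing test code, not functions from the repo.

```python
import gc

import torch


def run_prefill_e2e(load_reference_model, load_tt_model, compare_pcc, seq_len, inputs):
    # Stage 1: reference pass; keep only the detached outputs and free the
    # reference model before the TT model is ever constructed.
    reference_model = load_reference_model()
    with torch.no_grad():
        reference_logits = reference_model(inputs).detach()
    del reference_model
    gc.collect()

    # Stage 2: device pass; the TT model is built only after the reference
    # model has been released, so the two never coexist in host memory.
    tt_model = load_tt_model(seq_len)
    tt_logits = tt_model(inputs)

    # Check correctness; 0.99 is the PCC target from the item above.
    assert compare_pcc(reference_logits, tt_logits, 0.99)
```
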
@pavlepopovic

  • Check that all matmuls (attention, MLP, LM head) have optimal settings (subblock_h/w, in0_block_w) following the di/dt fixes (see the sketch after this list)
  • Check if any more sharding is possible throughout the model
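
For the matmul settings, a small helper like the one below could be used to sanity-check each program config. It assumes the usual tt-metal constraints that out_subblock_h * out_subblock_w fits in the destination registers (8 tiles, or 4 with fp32 dest accumulation) and that the subblocks evenly divide the per-core output block; both assumptions should be verified against the current op implementations.

```python
def pick_out_subblocks(per_core_m, per_core_n, fp32_dest_acc=False):
    """Return the largest (out_subblock_h, out_subblock_w) in tiles for a
    per-core output block of per_core_m x per_core_n tiles, under the
    assumed constraints described above."""
    max_dest_tiles = 4 if fp32_dest_acc else 8
    best = (1, 1)
    for h in range(1, per_core_m + 1):
        if per_core_m % h:
            continue
        for w in range(1, per_core_n + 1):
            if per_core_n % w or h * w > max_dest_tiles:
                continue
            if h * w > best[0] * best[1]:
                best = (h, w)
    return best


# Example: an 8x4-tile per-core block gives 2x4 subblocks (8 dest tiles).
print(pick_out_subblocks(per_core_m=8, per_core_n=4))
```
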

s-jovic commented May 24, 2024

  • Remove persistent kernel cache usage
  • Put optimized attention on CI
  • Unify paths for 128/1k/2k and other sequence lengths
  • Perf breakdown
  • Update perf targets in CI tests
  • Add multi-chip 128, 1k, 2k prefill tests (see the sketch after this list)
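
For the multi-chip tests, a placeholder pytest parametrization along these lines could cover all three sequence lengths in one test. The device counts are illustrative and the body is a stub, since the actual harness, perf targets, and device configurations live in the existing CI tests.

```python
import pytest


@pytest.mark.parametrize("seq_len", [128, 1024, 2048])
@pytest.mark.parametrize("num_devices", [4, 8])
def test_falcon7b_multichip_prefill(seq_len, num_devices):
    # Stub body: the real test would run prefill for `seq_len` across
    # `num_devices` chips and check PCC and e2e perf against the CI targets.
    pytest.skip(f"placeholder: prefill seq_len={seq_len} on {num_devices} devices")
```
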
