linearizer: enable GROUP opts after TC #4408

flammit · 2024-05-03T19:49:26Z

fixes strides on group_for_reduce buffer to enable GROUP to be used correctly after TC. also reverses the local index order to be first ones on the left, new LOCALs on the right before self.first_reduce.

these tests are versus master 2024-05-05 at 7094100

for tiny red / HSA=1:

branch	settings	setup (mm:ss)	step time (ms)	flops (TF)
master	default	25:30	319ms	119TF
master	*large beam	47:59	308ms	123TF
enable_group_after_tc	default	25:52	314ms	121TF
enable_group_after_tc	*large beam	48:02	308ms	123TF

for tiny green / CUDA=1:

branch	settings	setup (mm:ss)	step time (ms)	flops (TF)
master	default	27:56	229ms	166TF
master	*large beam	76:23	223ms	170TF
enable_group_after_tc	default	26:35	233ms	163TF
enable_group_after_tc	*large beam	78:37	227ms	167TF

*large beam is: TRAIN_BEAM=8 IGNORE_JIT_FIRST_BEAM=1 BEAM_UOPS_MAX=2500 BEAM_UPCAST_MAX=128 BEAM_LOCAL_MAX=1024 BEAM_MIN_PROGRESS=5 BEAM_PADTO=0

flammit · 2024-05-03T22:01:27Z

I'm a little torn on this PR. I know it can help certain kernels, but it's making fast kernels just slightly faster at the cost of expanding the action space. It is a correct action though :/

Running a deeper beam to show potential performance gains and also a DEPTH=4 FUZZ_ALL_ACTIONS=1 FUZZ_N=1024 FUZZ_REQUIRE_TC=1 METAL=1 with fuzz_linearizer.

flammit · 2024-05-06T03:24:08Z

updating the timing for master. given that this makes CUDA slower, it probably needs to be held behind a flag :(

chenyuxyz · 2024-05-06T03:46:47Z

can you isolate the reversing local_idxs order part?

fixes strides on group_for_reduce buffer to enable GROUP to be used correctly after TC. also reverses the local index order to be first ones on the left. fix tests that depend on local order and remove duplicated test

github-actions · 2024-05-09T04:28:08Z

Changes

Name                              Lines    Diff    Tokens/Line    Diff
------------------------------  -------  ------  -------------  ------
tinygrad/codegen/kernel.py          463      +0           18.4    +0.1
tinygrad/codegen/linearizer.py      328      +3           19.6    +0.2


total lines changes: +3

chenyuxyz · 2024-05-09T05:12:28Z

fyi tried this on latest master, does not seem to be faster with current beam settings

flammit marked this pull request as draft May 3, 2024 19:49

flammit force-pushed the enable_group_after_tc branch 2 times, most recently from 8fb694c to 57fe812 Compare May 3, 2024 20:48

flammit marked this pull request as ready for review May 3, 2024 22:00

flammit changed the title ~~[DRAFT] linearizer: enable GROUP opts after TC~~ linearizer: enable GROUP opts after TC May 3, 2024

flammit marked this pull request as draft May 3, 2024 22:04

flammit force-pushed the enable_group_after_tc branch 4 times, most recently from cdf40ab to 29091bc Compare May 6, 2024 02:33

flammit marked this pull request as ready for review May 6, 2024 03:24

flammit marked this pull request as draft May 6, 2024 15:54

flammit force-pushed the enable_group_after_tc branch 4 times, most recently from 71a1471 to dd798c3 Compare May 7, 2024 20:15

linearizer: enable GROUP opts after TC

74d7c0d

fixes strides on group_for_reduce buffer to enable GROUP to be used correctly after TC. also reverses the local index order to be first ones on the left. fix tests that depend on local order and remove duplicated test

flammit force-pushed the enable_group_after_tc branch from dd798c3 to 74d7c0d Compare May 7, 2024 22:49

Merge branch 'master' into enable_group_after_tc

1ff6118

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

linearizer: enable GROUP opts after TC #4408

linearizer: enable GROUP opts after TC #4408

flammit commented May 3, 2024 •

edited

flammit commented May 3, 2024 •

edited

flammit commented May 6, 2024

chenyuxyz commented May 6, 2024

github-actions bot commented May 9, 2024

chenyuxyz commented May 9, 2024

linearizer: enable GROUP opts after TC #4408

Are you sure you want to change the base?

linearizer: enable GROUP opts after TC #4408

Conversation

flammit commented May 3, 2024 • edited

flammit commented May 3, 2024 • edited

flammit commented May 6, 2024

chenyuxyz commented May 6, 2024

github-actions bot commented May 9, 2024

Changes

chenyuxyz commented May 9, 2024

flammit commented May 3, 2024 •

edited

flammit commented May 3, 2024 •

edited