Releases: lucidrains/mixture-of-experts

0.2.3

21 Aug 22:04
revert, 0.2.3

0.2.2

21 Aug 19:44
weighting is already done when computing combine_tensor

0.2.1

19 Dec 19:39
0b6bf96
0.2.1

0.2.0

20 Oct 22:20
make sure moe works with reversible networks for routing transformer

0.1.1

17 Jul 23:29
fix initialization of experts

0.1.0

17 Jul 22:59
default to GELU activation

0.0.4

17 Jul 21:45
bump for release

0.0.3

17 Jul 21:33
add ability to pass in custom experts

0.0.2

17 Jul 20:54
complete first pass of hierarchical mixture of experts (2 levels) as in GShard paper

0.0.1

17 Jul 00:27
update readme