Multi Input/output transformers #236
Conversation
- working on multi input/output; created the base along with multi-output only so far (not tested yet)
- need to use memory tokens; will probably do that later
- reverting x_transformers to prevent conflicts (I had used DataSpell's auto-format, but it seems to have deleted some code); I also improved the naming of the attention layers
```python
    cache = None,
    **kwargs
):
    global intermediates_model, cache_pre_attn_layers, cache_model, cache_post_attn_layers, intermediates_pre_attn_layers, intermediates_pre_attn_layer, mem_packed_shape, mem_every
```
are you using a `global` here for the caches?
Yes, though I don't think it's completely necessary; I just did it because my IDE wanted me to make them global. It could probably be deleted without causing many issues.
i see, want to delete it from the PR?
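A minimal sketch of the alternative being discussed: keeping the caches as instance attributes instead of module-level globals, so two wrapper instances never share state. The class and method below are hypothetical stand-ins for illustration; only the cache variable names come from the diff above.

```python
from torch import nn

class MultiIOWrapper(nn.Module):  # hypothetical stand-in for the PR's wrapper
    def __init__(self, model: nn.Module):
        super().__init__()
        self.model = model
        # per-instance cache state, replacing the former globals
        self.cache_pre_attn_layers = None
        self.cache_model = None
        self.cache_post_attn_layers = None

    def forward(self, x, cache = None, **kwargs):
        # read from / write to `self.*` rather than globals
        if cache is not None:
            self.cache_model = cache
        return self.model(x, cache = self.cache_model, **kwargs)
```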
@RyanKim17920 hey Ryan, this looks like a good effort. are you using all the features present in this wrapper? i would suggest just removing everything except for the multi-io portion
I believe most of the features should be working properly, but I wasn't exactly sure how to implement the memory-based features, so those may not be implemented well.
@RyanKim17920 yea, you can just remove the memory-based features, as well as anything that isn't related to what you are working on. just let the multi-io logic shine through. could you also add an example execution, in the same style as the other examples in the readme?
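A self-contained sketch of what such an example could demonstrate: per-stream token embeddings feeding a shared trunk, with one projection head per output stream. This illustrates the multi input/output pattern only; every name below is an assumption, not the PR's actual interface, and the plain-PyTorch trunk stands in for the x_transformers attention layers.

```python
import torch
from torch import nn

class TinyMultiIO(nn.Module):  # hypothetical, for illustration only
    def __init__(self, vocab_sizes, dim = 64):
        super().__init__()
        # one embedding per input stream
        self.embeds = nn.ModuleList([nn.Embedding(v, dim) for v in vocab_sizes])
        # shared trunk standing in for the attention layers
        self.trunk = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model = dim, nhead = 4, batch_first = True),
            num_layers = 2,
        )
        # one logit head per output stream
        self.heads = nn.ModuleList([nn.Linear(dim, v) for v in vocab_sizes])

    def forward(self, xs):
        # sum the per-stream embeddings, run the shared trunk once,
        # then project back out to each stream's vocabulary
        h = sum(embed(x) for embed, x in zip(self.embeds, xs))
        h = self.trunk(h)
        return [head(h) for head in self.heads]

model = TinyMultiIO(vocab_sizes = [256, 1000])
x1 = torch.randint(0, 256, (2, 16))
x2 = torch.randint(0, 1000, (2, 16))
logits1, logits2 = model([x1, x2])  # one logit tensor per output stream
```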
@RyanKim17920 how well does it work for your project? have you trained anything with it?
Ah, I haven't worked on it in a while, so I haven't tested the model system yet. I was trying to generalize the transformers first so that the training process itself would go more smoothly.
- there are too many issues in the io wrapper currently; restarting from scratch (tested features too)
- memory system does not work and needs fixes, but basic training can function (without memory)
Fixed errors around this
- it was actually broken before; the earlier IO tests gave a false impression that it worked
padding applied for autoregressive XL; the loss is masked on padding values, since they would otherwise cause NaN loss
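A minimal sketch of that padding fix using standard PyTorch, assuming padded target positions are marked with a sentinel id: passing `ignore_index` to cross entropy makes it skip those positions entirely, so the loss stays finite even when a sequence is mostly padding. The sentinel value and shapes below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

PAD_ID = -100  # sentinel for padded target positions (assumption)

logits  = torch.randn(2, 8, 1000)        # (batch, seq, vocab)
targets = torch.randint(0, 1000, (2, 8))
targets[:, 5:] = PAD_ID                  # mark the padded tail

loss = F.cross_entropy(
    logits.transpose(1, 2),  # cross_entropy expects (batch, vocab, seq)
    targets,
    ignore_index = PAD_ID,   # padded positions contribute nothing to the loss
)
```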
Refer to #235; I fixed the naming conventions and all merge conflicts. I still haven't tested all the features completely, so there may be errors.