Is it really byte-level? #61

LuCeHe · 2023-09-28T12:25:38Z

From your paper, it seems that the byte-level classification decomposes each character, e.g. 'C', into its binary representation, something like 01000011. Your code, however, gives back 68, which I think is not what you intended, because that is simply a char-level representation: one integer id per character.

Am I wrong?

Your dataset would still fulfill its purpose of using very long sequences, but I think it's char-level rather than byte-level.
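To make the distinction concrete, here is a minimal sketch of the two representations I have in mind. The helper names `char_to_id` and `char_to_bits` are hypothetical and are not taken from the repository; they only illustrate the difference between one integer per character and the individual bits of each byte.

```python
from typing import List


def char_to_id(ch: str) -> int:
    """Char-level view: one integer per character (its Unicode code point)."""
    return ord(ch)


def char_to_bits(ch: str) -> List[int]:
    """Bit-level view: the 8 bits of each UTF-8 byte, most significant bit first."""
    bits: List[int] = []
    for byte in ch.encode("utf-8"):
        # Shift from bit 7 down to bit 0 and mask out a single bit each time.
        bits.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
    return bits


print(char_to_id("C"))    # 67
print(char_to_bits("C"))  # [0, 1, 0, 0, 0, 0, 1, 1]
```

Note that plain `ord("C")` is 67, so the 68 I see presumably includes an offset of 1 (for example, an id reserved for padding). That is my assumption; I have not confirmed it in the code.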
