Issues with Evaluation Scripts #7

Open · A-Ayerh opened this issue Jan 4, 2024 · 9 comments

A-Ayerh commented Jan 4, 2024

This issue is related to commit fbaf82d

After running the script bash ./scripts/eval_decoding.sh, the results came out to be:

corpus BLEU-1 score: 0
corpus BLEU-2 score: 0
corpus BLEU-3 score: 0
corpus BLEU-4 score: 0

{'rouge-1': {'r': 0.0960104371521744, 'p': 0.13671808632706614, 'f': 0.10633835733307583}, 'rouge-2': {'r': 0.011719396402741052, 'p': 0.013988694184239035, 'f': 0.01133032845861094}, 'rouge-l': {'r': 0.09090843088332022, 'p': 0.12862700453138184, 'f': 0.10046980133298505}}


Removing the .squeeze and .tolist may have some effect on the results...
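For reference, a tiny hypothetical illustration (not the repository's code; the id values are made up) of what .squeeze()/.tolist() do to a generated-id tensor, and why dropping them could change what downstream decoding/BLEU code receives:

```python
# Hypothetical illustration (not the repository's code): generation output is a
# [batch, seq_len] tensor of token ids, while downstream decoding/BLEU code
# often expects a flat Python list of ids per sample.
import torch

output_ids = torch.tensor([[2, 0, 50, 21, 2421, 11, 5, 121, 2]])  # shape [1, seq_len], values made up
flat_ids = output_ids.squeeze().tolist()                           # flat list of ints

print(type(output_ids), output_ids.shape)  # <class 'torch.Tensor'> torch.Size([1, 9])
print(type(flat_ids), flat_ids)            # <class 'list'> [2, 0, 50, 21, 2421, 11, 5, 121, 2]
```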

I'll be working on this as well @MikeWangWZHL, thanks for acting fast!

A-Ayerh changed the title from "Issues with Evaluation Scripts and Missing Files for EEG-to-Text Decoding Model" to "Issues with Evaluation Scripts" on Jan 8, 2024
A-Ayerh (Author) commented Jan 8, 2024

My predicted strings in the BrainTranslator-all_decoding_result.txt file are all the same, strangely.

Ex:

target string: Everything its title implies, a standard-issue crime drama spat out from the Tinseltown assembly line.
predicted string: </s><s><s><s>He was born in the United States and raised in the UK.</s>
################################################

target string: This odd, poetic road movie, spiked by jolts of pop music, pretty much takes place in Morton's ever-watchful gaze -- and it's a tribute to the actress, and to her inventive director, that the journey is such a mesmerizing one.
predicted string: </s><s><s><s>He was born in the United States and raised in the UK.</s>
################################################

Perhaps my terminal output before the BLEU scores is relevant:

[INFO]subjects: ALL
[INFO]eeg type: GD
[INFO]using bands: ['_t1', '_t2', '_a1', '_a2', '_b1', '_b2', '_g1', '_g2']
[INFO]using device cuda:1

[INFO]loading 3 task datasets
[INFO]using subjects: ['ZAB', 'ZDM', 'ZDN', 'ZGW', 'ZJM', 'ZJN', 'ZJS', 'ZKB', 'ZKH', 'ZKW', 'ZMG', 'ZPH']
train divider = 320
dev divider = 360
[INFO]initializing a test set...
++ adding task to dataset, now we have: 456
[INFO]using subjects: ['ZAB', 'ZDM', 'ZDN', 'ZGW', 'ZJM', 'ZJN', 'ZJS', 'ZKB', 'ZKH', 'ZKW', 'ZMG', 'ZPH']
train divider = 240
dev divider = 270
[INFO]initializing a test set...
discard length zero instance: He was the son of a blacksmith Timothy Bush, Jr. and Lydia Newcomb and was born in Penfield, Monroe Co., New York on January 28, 1797.
discard length zero instance: Mary Lilian Baels (November 28, 1916 - June 7, 2002) was best known as Princess de Ruthy, the controversial morganatic second wife of King Leopold III of the Belgians.
++ adding task to dataset, now we have: 806
[INFO]using subjects: ['YSD', 'YSL', 'YDG', 'YLS', 'YMS', 'YAC', 'YFS', 'YDR', 'YAG', 'YTL', 'YFR', 'YMD', 'YRK', 'YAK', 'YIS', 'YRH', 'YRP', 'YHS']
train divider = 279
dev divider = 313
[INFO]initializing a test set...
expect word eeg embedding dim to be 840, but got 0, return None
(the line above is repeated 33 times in the original log)
++ adding task to dataset, now we have: 1407
[INFO]input tensor size: torch.Size([56, 840])

[INFO]test_set size: 1407

underkongkong commented

I'm facing the same problem: all generated sentences are identical. It seems that the pre-trained encoder makes all the features similar.
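A quick, hypothetical way to check this (not code from this repo; the function name and the pooled-feature tensor are illustrative): if the pairwise cosine similarity between the encoder's features for different test sentences is close to 1.0, that supports the "collapsed features" hypothesis.

```python
# Hypothetical diagnostic (not from the repository): measure how similar the
# encoder's pooled features are across different test sentences.
import torch
import torch.nn.functional as F

def mean_pairwise_cosine(features: torch.Tensor) -> float:
    """features: [N, D] tensor, one pooled encoder feature vector per sentence."""
    normed = F.normalize(features, dim=-1)
    sims = normed @ normed.T                                   # [N, N] cosine similarities
    off_diag = sims[~torch.eye(len(sims), dtype=torch.bool)]   # drop self-similarities
    return off_diag.mean().item()

# Sanity check with random features: should be near 0, not near 1.
print(mean_pairwise_cosine(torch.randn(8, 840)))
# In practice, stack the model's pooled encoder outputs for a few test samples
# into an [N, D] tensor; a value near 1.0 means the features have collapsed.
```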

A-Ayerh (Author) commented Jan 8, 2024

@underkongkong Have you tried playing around with the config file parameters yet?

I wasn't sure if that would make a big difference.

aysrox commented Jan 14, 2024

In my case, the predicted string was always something like:
predicted string: He was born in the United States and studied at New York University.

Not sure how to fix this...

yanlirock commented

same here

underkongkong commented

Anyone solved this problem?

MikeWangWZHL (Owner) commented

Thanks for everyone's effort in the discussion; I haven't had time to test out the issue yet, but I will work on it later.
Until further notice, please STOP using the code for the purpose of reproducing the results in the paper; as mentioned in the README note, it will probably fail. Nevertheless, the overall idea is still valid for potential future work with stronger LLMs. Sorry again for the inconvenience!

girlsending0 commented

I found out how to fix this problem.

In the eval_decoding.py file, the line predictions=tokenizer.encode(predicted_string) should be changed to predictions=tokenizer.encode(predicted_string[0]).

predicted_string is a list, so we should pass only the string inside it. In our case the batch size is 1, so we change predicted_string to predicted_string[0].
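A minimal sketch of the fixed evaluation step (illustrative only, not the exact eval_decoding.py code; the BART tokenizer, the sample strings, and the NLTK corpus_bleu call are my assumptions):

```python
# Minimal sketch of the fix (illustrative, not the repository's exact code).
# With batch_size = 1, predicted_string is a one-element list, so the string
# itself has to be indexed out before calling tokenizer.encode().
from transformers import BartTokenizer
from nltk.translate.bleu_score import corpus_bleu

tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')

target_string = "Everything its title implies, a standard-issue crime drama."
predicted_string = ["He was born in the United States and raised in the UK."]  # batch of 1

# before: predictions = tokenizer.encode(predicted_string)       # encodes the LIST
predictions = tokenizer.encode(predicted_string[0])               # after: encode the string
references = tokenizer.encode(target_string)

pred_tokens = tokenizer.convert_ids_to_tokens(predictions, skip_special_tokens=True)
ref_tokens = tokenizer.convert_ids_to_tokens(references, skip_special_tokens=True)

print('corpus BLEU-1 score:',
      corpus_bleu([[ref_tokens]], [pred_tokens], weights=(1, 0, 0, 0)))
```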

This change improves the results from:
corpus BLEU-1 score: 0
corpus BLEU-2 score: 0
corpus BLEU-3 score: 0
corpus BLEU-4 score: 0

{'rouge-1': {'r': 0.0960104371521744, 'p': 0.13671808632706614, 'f': 0.10633835733307583}, 'rouge-2': {'r': 0.011719396402741052, 'p': 0.013988694184239035, 'f': 0.01133032845861094}, 'rouge-l': {'r': 0.09090843088332022, 'p': 0.12862700453138184, 'f': 0.10046980133298505}}

to (in my case):

corpus BLEU-1 score: 0.11137150833175373
corpus BLEU-2 score: 0.02308700455643944
corpus BLEU-3 score: 0.0057795258674805845
corpus BLEU-4 score: 0.0018112469683353798

But in my case, the BrainTranslator model still generates only one distinct sentence...

I am doing research based on the author's code and will post updates here if there are any further corrections.

Thanks to @MikeWangWZHL.

underkongkong commented

> In the eval_decoding.py file, the line predictions=tokenizer.encode(predicted_string) should be changed to predictions=tokenizer.encode(predicted_string[0]). [quoting @girlsending0's comment above]

I can't find this code in this project: predictions=tokenizer.encode(predicted_string)
