About Meteor metric #14

Open
DavidYanAnDe opened this issue Aug 12, 2023 · 3 comments
@DavidYanAnDe

When I run generate_report and compute_NLG_scores, the call `score, _ = scorer.compute_score(ref_sents_or_reports, gen_sents_or_reports)` fails inside `scores.append(float(self.meteor_p.stdout.readline().strip()))` with `ValueError: could not convert string to float: b'29.0 50.0 14.0 18.0 5.0 5.0 8.0 8.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 3.0 4.0 2.0 12.0 18.0 18.0'`. When I compute the METEOR score on a few short sentences, it works fine. Could this be caused by an outlier in the reports I generated? Have you encountered this problem during your experiments?
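A rough way to pin down which reference/generated pair triggers the ValueError (a minimal sketch with hypothetical names, assuming a pycocoevalcap-style `Meteor` wrapper whose `compute_score` takes `{id: [text]}` dicts; the import path may differ if the repo vendors its own copy):

```python
from pycocoevalcap.meteor.meteor import Meteor  # assumed import path; adjust to the wrapper actually used

def find_offending_pairs(ref_reports, gen_reports):
    """Score each (reference, generated) pair on its own so the report that
    breaks the METEOR wrapper can be identified and inspected."""
    bad_indices = []
    scorer = Meteor()
    for i, (ref, gen) in enumerate(zip(ref_reports, gen_reports)):
        try:
            scorer.compute_score({0: [ref]}, {0: [gen]})
        except ValueError:
            bad_indices.append(i)
            print(f"pair {i} breaks the scorer\nref: {ref!r}\ngen: {gen!r}")
            scorer = Meteor()  # restart: the Java subprocess stream can stay out of sync after a failure
    return bad_indices
```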

@ttanida
Owner

ttanida commented Aug 14, 2023

I haven't encountered this error. Can you post the exact reference and generated report that causes this error?

@DavidYanAnDe
Author

DavidYanAnDe commented Aug 14, 2023

> I haven't encountered this error. Can you post the exact reference and generated report that causes this error?

I found that the error was caused by garbled text in one of the reports I generated, and I have isolated that report. I have also encountered the following error during inference when the CE metric is calculated. Have you run into it? I would appreciate any advice.
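In case it helps others hitting the same ValueError, a minimal sketch of the kind of cleanup that avoids it (assuming the reports are plain Python strings; `sanitize_report` is a hypothetical helper, not part of the repo):

```python
import re

def sanitize_report(text: str) -> str:
    """Remove content that tends to break the line-based METEOR protocol:
    non-ASCII/garbled characters, control characters and embedded newlines."""
    text = text.encode("ascii", errors="ignore").decode("ascii")  # drop non-ASCII/mojibake characters
    text = re.sub(r"[\x00-\x1f\x7f]+", " ", text)                 # drop control characters (incl. newlines/tabs)
    return re.sub(r"\s+", " ", text).strip()                      # collapse whitespace

print(sanitize_report("The cardiomediastinal silhouette is normal.\x00â€™\n"))
# -> The cardiomediastinal silhouette is normal.
```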

```
(rgrg_new) ydw@lthpc:~/code/rgrg$ cd /home/ydw/code/rgrg ; /usr/bin/env /home/ydw/miniconda3/envs/rgrg_new/bin/python /home/ydw/.vscode-server/extensions/ms-python.python-2023.14.0/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher 43347 -- /home/ydw/code/rgrg/src/full_model/test_set_evaluation.py
[WARNING]: Parameter 'function'=<function get_tokenized_dataset.<locals>.tokenize_function at 0x7f18dfceac10> of the transform datasets.arrow_dataset.Dataset._map_single couldn't be hashed properly, a random hash was used instead. Make sure your transforms and parameters are serializable with pickle or dill for the dataset fingerprinting and caching to work. If you reuse this transform, the caching mechanism will consider it to be different from the previous calls and recompute everything. This warning is only showed once. Subsequent hashing failures won't be showed.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 31271/31271 [01:06<00:00, 468.22ex/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1440/1440 [00:03<00:00, 478.60ex/s]
[INFO]: Test loader: computing scores for object detector, region selection and region abnormal module...
6198it [29:59, 3.46it/s]Premature end of JPEG file
7818it [37:47, 3.45it/s]
[INFO]: Test loader: computing scores for object detector, region selection and region abnormal module... DONE.
[INFO]: Test loader 2: computing scores for object detector, region selection and region abnormal module...
296it [01:25, 3.41it/s]Premature end of JPEG file
360it [01:44, 3.46it/s]
[INFO]: Test loader 2: computing scores for object detector, region selection and region abnormal module... DONE.
[INFO]: Test loader: generating sentences/reports...
6198it [3:58:26, 2.24s/it]Premature end of JPEG file
7818it [5:00:48, 2.31s/it]
[INFO]: Test loader: generating sentences/reports... DONE.
[INFO]: Test loader 2: generating sentences/reports...
296it [10:11, 1.83s/it]Premature end of JPEG file
360it [12:19, 2.05s/it]
[INFO]: Test loader 2: generating sentences/reports... DONE.
[INFO]: Computing language_model_scores...
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
{'testlen': 1946809, 'reflen': 1794729, 'guess': [1946809, 1914098, 1881387, 1848676], 'correct': [726420, 317612, 161599, 86599]}
ratio: 1.0847370271500594
Downloading: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 570/570 [00:00<00:00, 1.68MB/s]
Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 420M/420M [00:35<00:00, 12.3MB/s]
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.decoder.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.weight']

  • This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
  • This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
    Downloading: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 226k/226k [00:00<00:00, 599kB/s]
    Downloading: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 28.0/28.0 [00:00<00:00, 95.4kB/s]

Tokenizing report impressions. All reports are cut off at 512 tokens.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32711/32711 [00:30<00:00, 1084.37it/s]

Begin report impression labeling. The progress bar counts the # of batches completed:
The batch size is 18
0%| | 0/1818 [00:00<?, ?it/s]huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using tokenizers before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
5%|████████▏ | 99/1818 [04:09<1:12:14, 2.52s/it]
Traceback (most recent call last):
  File "/home/ydw/code/rgrg/src/full_model/test_set_evaluation.py", line 891, in <module>
    main()
  File "/home/ydw/code/rgrg/src/full_model/test_set_evaluation.py", line 887, in main
    evaluate_model_on_test_set(model, test_loader, test_2_loader, tokenizer)
  File "/home/ydw/code/rgrg/src/full_model/test_set_evaluation.py", line 730, in evaluate_model_on_test_set
    language_model_scores = evaluate_language_model_on_test_set(model, test_loader, test_2_loader, tokenizer)
  File "/home/ydw/code/rgrg/src/full_model/test_set_evaluation.py", line 424, in evaluate_language_model_on_test_set
    language_model_scores = compute_language_model_scores(gen_and_ref_sentences, gen_and_ref_reports)
  File "/home/ydw/code/rgrg/src/full_model/evaluate_full_model/evaluate_language_model.py", line 505, in compute_language_model_scores
    compute_report_level_scores()
  File "/home/ydw/code/rgrg/src/full_model/evaluate_full_model/evaluate_language_model.py", line 437, in compute_report_level_scores
    compute_clinical_efficacy_scores(language_model_scores, gen_reports, ref_reports)
  File "/home/ydw/code/rgrg/src/full_model/evaluate_full_model/evaluate_language_model.py", line 316, in compute_clinical_efficacy_scores
    preds_gen_reports, preds_ref_reports = get_chexbert_labels_for_gen_and_ref_reports()
  File "/home/ydw/code/rgrg/src/full_model/evaluate_full_model/evaluate_language_model.py", line 194, in get_chexbert_labels_for_gen_and_ref_reports
    preds_gen_reports = label(chexbert, csv_gen_reports_file_path)
  File "/home/ydw/code/rgrg/src/CheXbert/src/label.py", line 71, in label
    batch = batch.to(device)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
```

@ttanida
Owner

ttanida commented Aug 14, 2023

> I found that the error was caused by garbled text in one of the reports I generated, and I have isolated that report.

Is it correct to assume that you didn't use the pre-trained weights I provided, but trained your own model? My model didn't generate any report that caused a ValueError.

> I have also encountered the following error during inference when the CE metric is calculated. Have you run into it? I would appreciate any advice.

I haven't encountered this error either. I asked GPT-4 and it recommended the following:


The error you're encountering, RuntimeError: CUDA error: device-side assert triggered, indicates that there's an error in the CUDA code that's being executed on the GPU. CUDA errors are usually hard to debug because they might be triggered asynchronously, meaning that the location of the error might be different from where it's reported.

Here are some steps you can take to debug this error:

  1. Set the CUDA_LAUNCH_BLOCKING Environment Variable:
    Before running your code, set the `CUDA_LAUNCH_BLOCKING` environment variable to 1. This will make CUDA operations synchronous, which should give you a more accurate error location.

    `export CUDA_LAUNCH_BLOCKING=1`

    Then run your code again. With this setting, the stack trace should point you more directly to the problematic line of code.

  2. Check Tensor Sizes and Types:
    Often, the error is caused by trying to perform operations on tensors that don't have compatible sizes or types. Ensure that all the tensors being used in the operations leading up to the error have expected sizes and types.

  3. Review Model and Input Device Compatibility:
    Ensure that both the model and the input are on the same device (i.e., either both on the GPU or both on the CPU).

    `print(chexbert.device)`
    `print(batch.device)`
  4. Check the Batch Content:
    There might be something wrong with the data in the batch that's causing the CUDA error. It can help to print or inspect the batch just before the line `batch = batch.to(device)` (see the diagnostic sketch after this list).

  5. CUDA Memory:
    Although it's not directly indicated by the error message, out-of-memory issues can sometimes manifest as cryptic errors. Monitor the GPU memory usage when running the code. You can use tools like nvidia-smi for this.

  6. Update PyTorch and CUDA:
    Sometimes, errors are due to bugs in the PyTorch or CUDA libraries themselves. Ensure you're using the latest stable version of PyTorch and CUDA. Also, consider checking related forums or GitHub issues to see if others have encountered (and possibly solved) similar issues.

  7. Isolation Testing:
    Isolate the part of the code causing the error. If you can reproduce the error in a simpler setup or smaller script, it becomes much easier to debug.

  8. Check External Libraries or Modules:
    You're using an external module (CheXbert). Ensure that this module is compatible with the CUDA and PyTorch versions you're using.

  9. PyTorch Forums and GitHub Issues:
    If you're still unable to determine the issue, consider seeking help from the PyTorch forums or looking into relevant GitHub issues. The community might provide insights or solutions.

By systematically following these steps, you should be able to narrow down the root cause of the error and hopefully find a solution.
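
To make steps 1, 3 and 4 concrete, here is a small diagnostic sketch that could be dropped in right before the failing `batch = batch.to(device)` line. It is only a sketch: `check_batch_before_gpu` is a hypothetical helper, and the exact batch structure in CheXbert's `label.py` may differ (if `batch` is a dict of tensors, apply the check to its `input_ids` entry).

```python
# Run with CUDA_LAUNCH_BLOCKING=1 (e.g. CUDA_LAUNCH_BLOCKING=1 python test_set_evaluation.py)
# so the reported stack trace points at the kernel that actually asserted.
import torch

def check_batch_before_gpu(batch: torch.Tensor, model: torch.nn.Module) -> None:
    """Sanity-check a tokenized batch before moving it to the GPU.

    Token ids outside the embedding table (id < 0 or id >= vocab size) are a
    common cause of 'CUDA error: device-side assert triggered'.
    """
    vocab_size = model.config.vocab_size if hasattr(model, "config") else None  # assumes a HF BERT-style model
    print("model device:", next(model.parameters()).device)
    print("batch dtype/shape:", batch.dtype, tuple(batch.shape))
    print("min/max token id:", int(batch.min()), int(batch.max()), "vocab size:", vocab_size)
    assert int(batch.min()) >= 0, "negative token id in batch"
    if vocab_size is not None:
        assert int(batch.max()) < vocab_size, "token id >= vocab size would trip the embedding lookup's assert"
```

Calling `check_batch_before_gpu(batch, chexbert)` on each batch should either pass quietly or name the first batch whose token ids fall outside the model's embedding table, which narrows the search considerably.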
