
Comparison in MIMIC-CXR dataset #16

Open
Markin-Wang opened this issue Sep 13, 2023 · 3 comments

Comments

@Markin-Wang

Hi, thanks for your work.

I have a question about the comparison to previous works in MIMIC-CXR dataset.

Previous methods in report generation utilized the official MIMIC-CXR data split to report the report generation results.

Nonetheless, your work uses the Chest ImaGenome v1.0.0 data split which is different from the MIMIC-CXR data split.

Therefore, the RGRG report generation results seem not directly comparable to previous works?

I would be grateful if you could provide more information on this, and sorry if I have misunderstood the testing procedure.

@ttanida
Owner

ttanida commented Sep 16, 2023

Hi,

Thank you for your question.

You're right in noting that we utilized the Chest ImaGenome v1.0.0 split instead of the MIMIC-CXR split. However, since both splits come from the same underlying dataset, they should inherently have a similar data distribution, ensuring the comparability of our results with previous studies.

Best,
Tim

@Markin-Wang
Author

Markin-Wang commented Sep 16, 2023

> Hi,
>
> Thank you for your question.
>
> You're right in noting that we utilized the Chest ImaGenome v1.0.0 split instead of the MIMIC-CXR split. However, since both splits come from the same underlying dataset, they should inherently have a similar data distribution, ensuring the comparability of our results with previous studies.
>
> Best,
> Tim

Hi Tim,

Thank you for your reply.
However, I respectfully disagree with your claim, as the dataset split in MIMIC-CXR does not appear to be random, and the distribution of the test set seems somewhat different from that of the training and validation sets. For example, the paper releasing the dataset states that "The test set contains all studies for patients who had at least one report labelled in our manual review." In addition, as shown in Table 3 of that paper, only ~69% of studies in the training/validation sets have a findings section, while this figure is 98.3% in the test set. Moreover, the average report length on the MIMIC-CXR test split is 66.4 tokens, versus 53 and 53.05 in the training and validation sets, as shown in the paper.
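The kind of discrepancy described above is easy to check empirically. A minimal sketch (not the authors' code; the records below are made-up toy data, and a real check would instead load the MIMIC-CXR and Chest ImaGenome split files) that computes the two statistics in question per split:

```python
# Illustrative sketch: per-split fraction of studies with a FINDINGS
# section, and mean report length in whitespace tokens. The toy records
# are invented; real use would read the actual split metadata/reports.

def split_stats(reports):
    """Return (fraction with findings, mean token length) for one split."""
    with_findings = sum(1 for r in reports if r["findings"])
    lengths = [len(r["report"].split()) for r in reports]
    return with_findings / len(reports), sum(lengths) / len(lengths)

train_split = [
    {"findings": True, "report": "Heart size is normal. Lungs are clear."},
    {"findings": False, "report": "No acute cardiopulmonary process."},
]
test_split = [
    {"findings": True,
     "report": "Heart size is mildly enlarged. No focal consolidation, "
               "pleural effusion or pneumothorax is seen."},
]

for name, split in [("train", train_split), ("test", test_split)]:
    frac, mean_len = split_stats(split)
    print(f"{name}: findings={frac:.2f}, mean_len={mean_len:.1f}")
```

Running this over the real official split versus the Chest ImaGenome split would quantify how far the two distributions actually diverge.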

@fuying-wang

fuying-wang commented Oct 17, 2023

Thanks for the awesome work!

I have also noticed that previous splits contain lateral view images, which may also make the data distribution slightly different.
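One way to control for this when comparing splits is to drop lateral views before computing any statistics. A minimal sketch, assuming each image record carries a "ViewPosition" field (as in the mimic-cxr-2.0.0-metadata.csv file; the records below are illustrative):

```python
# Sketch: restrict a split to frontal-view images so that comparisons
# between splits are not skewed by lateral views. "PA"/"AP"/"LATERAL"
# are standard ViewPosition values in the MIMIC-CXR metadata.
FRONTAL_VIEWS = {"PA", "AP"}

def keep_frontal(records):
    """Return only records whose view position is frontal (PA or AP)."""
    return [r for r in records if r.get("ViewPosition") in FRONTAL_VIEWS]

records = [
    {"dicom_id": "img_a", "ViewPosition": "PA"},
    {"dicom_id": "img_b", "ViewPosition": "LATERAL"},
    {"dicom_id": "img_c", "ViewPosition": "AP"},
]
frontal = keep_frontal(records)
print([r["dicom_id"] for r in frontal])
```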
