Train/Validate/Test Clarification #97

Open
lcaronson opened this issue Aug 30, 2021 · 6 comments
Comments

@lcaronson

Hello,

I just have a couple of clarifying questions to ask.

I am a little unclear on how you came up with the final 0.9544 Dice coefficient value in the published MIScnn paper. Is there some kind of additional function that can be used to compare the test data to predictions? Or is that value returned during the cross-validation phase?

If the cross-validation phase is also doing the testing of the data, then how do we define the ratio of train/validate/test data? For example, my understanding is that, of the roughly 300 studies in the KiTS19 dataset, you used an 80/90/40 split? I am just trying to figure out how you set these parameters in the code.

As a final question for you, if I have a dataset of 60 studies, would a decent train/validate/test ratio be 30/15/15?

@muellerdo
Member

Hey @lcaronson,

Thanks for your interest in using MIScnn!

> I am a little unclear on how you came up with the final 0.9544 Dice coefficient value in the published MIScnn paper. Is there some kind of additional function that can be used to compare the test data to predictions? Or is that value returned during the cross-validation phase?

The DSC of 0.9544 for the kidney segmentation was automatically computed with our cross-validation function (https://github.com/frankkramer-lab/MIScnn/blob/master/miscnn/evaluation/cross_validation.py). With default parameters (i.e. without any callbacks), no validation monitoring is performed, so the held-out fold of each cross-validation iteration can be used as the testing set.

However, you can always run a prediction call yourself and then compute the associated DSCs.
You can find an example for this approach in our CellTracking example with 2D microscopy images: https://github.com/frankkramer-lab/MIScnn/blob/master/examples/CellTracking.ipynb
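For reference, here is a minimal sketch in plain NumPy (not MIScnn's evaluation API) of how the DSC between a predicted mask and its ground truth can be computed after such a prediction call; the variable names are hypothetical:

```python
import numpy as np

def dice_coefficient(pred, truth, smooth=1e-7):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred).astype(bool)
    truth = np.asarray(truth).astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return (2.0 * intersection + smooth) / (pred.sum() + truth.sum() + smooth)

# Hypothetical usage after a prediction call:
# pred_mask = prediction > 0.5        # binarize the model output
# print(dice_coefficient(pred_mask, ground_truth_mask))
```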

> If the cross-validation phase is also doing the testing of the data, then how do we define the ratio of train/validate/test data? For example, my understanding is that, of the roughly 300 studies in the KiTS19 dataset, you used an 80/90/40 split? I am just trying to figure out how you set these parameters in the code.

For KiTS19, we didn't use any validation set and computed our scores purely on the testing sets from the 3-fold cross-validation. We also only used a subset of 120 samples -> 3x (80 train & 40 test).
We did this to demonstrate a default approach without any more advanced validation monitoring techniques.
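For illustration, a small sketch with scikit-learn (not MIScnn's internal splitting code) of how a 3-fold cross-validation over 120 samples yields 80 training and 40 testing samples per fold; the sample IDs are hypothetical:

```python
from sklearn.model_selection import KFold

sample_ids = [f"case_{i:05d}" for i in range(120)]   # hypothetical KiTS19 subset
kfold = KFold(n_splits=3, shuffle=True, random_state=42)

for fold, (train_idx, test_idx) in enumerate(kfold.split(sample_ids)):
    print(f"Fold {fold}: {len(train_idx)} train / {len(test_idx)} test")
    # -> Fold 0: 80 train / 40 test (and likewise for folds 1 and 2)
```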

However, in our more recent COVID-19 segmentation study based on limited data, we used a cross-validation (train/val) plus hold-out testing strategy.
https://www.sciencedirect.com/science/article/pii/S2352914821001660?via%3Dihub

In this study, we performed a 5-fold cross-validation on only 20 samples, which resulted in 5 models (each fold returning one model).
Then, we computed predictions on a completely separate hold-out set of 100 samples (from another source). For each sample, we computed 5 predictions (one from each fold model) and then averaged these 5 predictions into a single one (= ensemble learning). Afterwards, we computed the DSC on the ensembled/final predictions.
In the paper, we also ran some additional experiments to show that this ensemble learning strategy is highly effective and that MIScnn is capable of producing robust models with it, even from as few as 20 samples.
Here is the complete COVID-19 study code: https://github.com/frankkramer-lab/covid19.MIScnn
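To make the averaging step concrete, here is a hedged sketch in plain NumPy (not the actual study code linked above); `fold_models` and their `predict` method are hypothetical placeholders for the five trained fold models:

```python
import numpy as np

def ensemble_predict(fold_models, image, threshold=0.5):
    """Average the predictions of all fold models pixel-wise (mean),
    then binarize the result into a single final segmentation mask."""
    # Each (hypothetical) fold model returns a soft/probability mask
    # with the same spatial shape as `image`.
    soft_preds = np.stack([model.predict(image) for model in fold_models], axis=0)
    mean_pred = soft_preds.mean(axis=0)          # pixel-wise mean over the 5 models
    return (mean_pred >= threshold).astype(np.uint8)

# final_mask = ensemble_predict(fold_models, holdout_image)
# The DSC is then computed on this ensembled prediction vs. the ground truth.
```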

> As a final question for you, if I have a dataset of 60 studies, would a decent train/validate/test ratio be 30/15/15?

Sadly, there is no clear answer to this question. Personally, I would highly recommend an 80/20 split into train/test, then running a 3-fold or 5-fold cross-validation on the 80% training data and utilizing ensemble learning techniques for testing. This is a state-of-the-art approach and will yield strong performance.
Otherwise, I'm personally a fan of a 65/15/20 split for train/val/test. It depends heavily on how much data you have. 60 samples is quite low for neural networks (even if it is a very good dataset from a medical perspective, given the complexity of generating annotated medical imaging datasets!), which is why I'm a big fan of cross-validation.
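To make this recommendation concrete, here is a hedged sketch with scikit-learn (hypothetical sample IDs, not MIScnn code) of an 80/20 train/test split followed by a 5-fold cross-validation on the training portion:

```python
from sklearn.model_selection import KFold, train_test_split

sample_ids = [f"study_{i:03d}" for i in range(60)]   # hypothetical 60 studies

# 80/20 split into a training pool and a hold-out test set (48/12 samples here)
train_ids, test_ids = train_test_split(sample_ids, test_size=0.2, random_state=0)

# 5-fold cross-validation on the training pool; each fold yields one model,
# and the resulting 5 models are later ensembled on `test_ids`.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (tr_idx, val_idx) in enumerate(kfold.split(train_ids)):
    fold_train = [train_ids[i] for i in tr_idx]
    fold_val = [train_ids[i] for i in val_idx]
    print(f"Fold {fold}: {len(fold_train)} train / {len(fold_val)} val")
```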

Hope that I was able to give you some insights/feedback! :)

Cheers,
Dominik

@muellerdo muellerdo self-assigned this Sep 8, 2021
@muellerdo muellerdo added the question (Further information is requested) label Sep 8, 2021
@emmanuel-nwogu
Copy link

> then averaged these 5 predictions into a single one

Hi, how did you average these predictions into one? Did you just average the metrics computed from the 5 predictions for each sample?

@muellerdo
Copy link
Member

Hey @emmanuel-nwogu,

Correct. In this study, we just averaged the predictions pixel-wise via the mean.

Cheers,
Dominik

@emmanuel-nwogu
Copy link

Thanks for the reply. From my understanding, you average the predicted binary masks to generate a final prediction mask. Is there a common name for this in the literature?

@muellerdo
Copy link
Member

Happy to help! :)

Absolutely correct!

Sadly, to my knowledge, there is no community-accepted name for the functions that combine predictions originating from ensemble learning.
Most of the time when you read about ensemble learning in biomedical image classification or segmentation, the authors apply averaging via the mean and simply call it averaging.

Last year, we published an experimental analysis of ensemble learning in medical image classification, in which I referred to the combination methods for merging multiple predictions as pooling functions, and to the averaging as mean pooling (which can be either unweighted or weighted).
Check out here: https://ieeexplore.ieee.org/document/9794729/
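As a small illustration of such pooling functions (a hedged sketch, not the code from the linked paper), unweighted vs. weighted mean pooling over per-model prediction arrays could look like this:

```python
import numpy as np

def mean_pooling(predictions, weights=None):
    """Combine ensemble-member predictions via an (optionally weighted) mean.

    `predictions`: iterable of arrays with identical shape (one per model).
    `weights`: optional per-model weights, e.g. validation DSCs.
    """
    preds = np.stack(list(predictions), axis=0)          # shape: (n_models, ...)
    if weights is None:
        return preds.mean(axis=0)                        # unweighted mean pooling
    return np.average(preds, axis=0, weights=weights)    # weighted mean pooling

# Hypothetical usage: weight each fold model by its validation DSC
# combined = mean_pooling(model_probs, weights=[0.91, 0.93, 0.90, 0.94, 0.92])
```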

Hope this helps explain general ensemble learning in biomedical image analysis a little bit.

Best Regards,
Dominik

@emmanuel-nwogu
Copy link

Thanks, I'll check it out. :)
