Final week task list #27

sofstef · 2022-03-13T17:20:21Z

Here's the list of tasks left to do by Friday (a few could be left for later). I suggest everyone writes which ones they're taking on and then I'll add a name next to the task and you can tick it when done. Use the comments to extend the list as needed and I can keep editing this issue.

Model training

Model evaluation

FAIR tasks:

Repository organising

Rename data_loading into datasets and apply changes everywhere
Move all helper functions in data_loading into utils.py in the datasets folder

Report writing

1. Get everyone access to the Overleaf doc @sofstef
2. Make report outline with all relevant sections; some suggestions: - @graceebc9 to take
- - Intro w lit review - @ThomasDodd97
- - Datasets -@graceebc9
- - Methodology -HC / SS
- - Results w discussion - HC / SS
- - Conclusion & any further steps - @graceebc9
3. Decide who writes what

graceebc9 · 2022-03-13T18:58:10Z

I'll note down some next steps for the reports here as I go

Figures for the appendix/ data sources:
@ThomasDodd97 please could you add a legend to the landcover plot you created with the mosaic and save that image to the github under report / figures?

UPDATE - ive pulled a figure from the classification report from the landcover with the classes so no need to create this :)

graceebc9 · 2022-03-13T21:17:23Z

@Hamish-Cam - please could you generate one of the modis fire map plots without the blue dotted box? I've deleted the folder with modis data so cant re run your script.

Hamish-Cam · 2022-03-14T12:16:59Z

Model Training

num_workers effects how quickly the CPU loads the data into the GPU => its value only effects the speed/efficiency of the GPU: https://chtalhaanwar.medium.com/pytorch-num-workers-a-tip-for-speedy-training-ed127d825db7. Given that the threading error seen is in the rasterio library and so is by no means (at least easily) fixable, I suggest we continue to take a speed hit, given that it means we can run bug free.
Martin agrees that for now burn_prop=1 for all cases. Since this isn't necessarily a long term solution, I won't alter the code so that we have the flexibility to change it if we wish.
Similarly, we are interested in the ability of our model to predict fires, not lack of fires. Therefore, I believe we should be using the same balanced sampler for val/test as for training (Martin agrees). As such I have pushed changes to remove the grid_sampler option (which was causing issues anyway) and instead use the constrained sampler for val/test when balance_samples = True.

Hamish-Cam · 2022-03-14T12:19:41Z

Martin has also suggested trying to overfit our model to one/two samples (by repeating training on these) to see if it can train to predict these fires.

He has also suggested we try a 'class balance cross entropy loss function' which would penalise the non-prediction of a fire more than the prediction of one when there is no fire. This may help with our model just predicting no fires.

sofstef · 2022-03-14T12:23:47Z

@Hamish-Cam re choice of loss: we are currently using jaccard loss which is suited for tackling our type of problem with very few pixels having fire. I have also added an option to use focal tversky loss, which is essentially a generalisation of Jaccard and allows for choosing weight parameters to penalise false negatives more + has an additional parameter which can be used to force the network to focus on pixels where it's struggling to make predictions. Will try the tversky loss out now and push the code so you can use it in colab!

Hamish-Cam · 2022-03-14T12:29:55Z

@Hamish-Cam re choice of loss: we are currently using jaccard loss which is suited for tackling our type of problem with very few pixels having fire. I have also added an option to use focal tversky loss, which is essentially a generalisation of Jaccard and allows for choosing weight parameters to penalise false negatives more + has an additional parameter which can be used to force the network to focus on pixels where it's struggling to make predictions. Will try the tversky loss out now and push the code so you can use it in colab!

Brill, sounds like this is pretty well covered then. Trying to overfit might still be a good test to run. Thanks!

graceebc9 · 2022-03-14T15:02:01Z

Proposed for who writes what:

Intro / lit review: Thomas
Datasets - grace
Methodology - Hamish or Sofija
Results - Hamish or Sofija
Conclusion & further - Grace

graceebc9 · 2022-03-16T20:06:38Z

had an issue with training on sentinel - note that 'his seems related to the following bug reports. Basically, the UNet that comes with SMP requires images with patch_size divisible by 32. Can you try switching from 250 to 256 and see if that solves your issue? -- switching to 256 solved the issue.

sofstef assigned sofstef, ThomasDodd97, Hamish-Cam and graceebc9 Mar 13, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Final week task list #27

Final week task list #27

sofstef commented Mar 13, 2022 •

edited by graceebc9

graceebc9 commented Mar 13, 2022 •

edited

graceebc9 commented Mar 13, 2022

Hamish-Cam commented Mar 14, 2022

Hamish-Cam commented Mar 14, 2022

sofstef commented Mar 14, 2022

Hamish-Cam commented Mar 14, 2022

graceebc9 commented Mar 14, 2022

graceebc9 commented Mar 16, 2022

Final week task list #27

Final week task list #27

Comments

sofstef commented Mar 13, 2022 • edited by graceebc9

Model training

Model evaluation

FAIR tasks:

Repository organising

Report writing

graceebc9 commented Mar 13, 2022 • edited

graceebc9 commented Mar 13, 2022

Hamish-Cam commented Mar 14, 2022

Hamish-Cam commented Mar 14, 2022

sofstef commented Mar 14, 2022

Hamish-Cam commented Mar 14, 2022

graceebc9 commented Mar 14, 2022

graceebc9 commented Mar 16, 2022

sofstef commented Mar 13, 2022 •

edited by graceebc9

graceebc9 commented Mar 13, 2022 •

edited