Sand Mining Watch

Deep learning methodology to detect sand mines [work in progress].

As the major ingredient of concrete and asphalt, sand is vital to economic growth and will play a key role in the transition to a low-carbon society. However, excessive and unregulated sand mining in the Global South has high socio-economic and environmental costs and amplifies the effects of climate change. Sand mines are characterized by informality and high temporal variability, and data on their location and extent tend to be sparse. We provide a custom sand-mine detection tool built by fine-tuning foundation models for earth observation. These models leverage self-supervised learning, a cost-effective and powerful approach in sparse-data regimes. The tool allows for real-time monitoring of sand mining activity and can enable more effective policy and regulation.

Datasets

Through a partnership with Veditum India Foundation, we have acquired data (latitude, longitude, timestamp) on sand mining activities in 21 river basins across India. These data currently cover 39 distinct mining sites; we expect to expand this to over 100 sites over the course of our study. At each site, we extract image patches (ranging in size from 2.5 sq.km to 582 sq.km) around visually recognizable sand mining footprints from freely available Sentinel-2 multi-spectral and Sentinel-1 synthetic aperture radar imagery. A majority of Indian rivers are characterized by high average flood discharges and large temporal variability, leading to large intra-annual variation in sand deposition rates and mining footprints. We treat these changes as strong natural label augmentations (figure above, inset 1). This allows us to obtain multiple labels (of arbitrary size) for each location that together represent the seasonal lifecycle of sand mines. While sub-meter resolution imagery (figure above, inset 2) captures more precise information on mining activity, we believe that 10m imagery will prove to be an effective feature set, since it captures the broad patterns of importance (i.e. scarring, pitting and flooding) at high temporal and spectral resolution.
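As a rough illustration of the patch sizes mentioned above, the sketch below computes a square bounding box of a given area around a (latitude, longitude) point and the number of 10m pixels along one side. It is a minimal sketch under stated assumptions, not the project's extraction code: the function names are hypothetical, the patches are assumed to be square, and a spherical-Earth approximation is used.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def square_patch_bounds(lat, lon, area_sq_km):
    """Return (west, south, east, north) in degrees for a square patch
    of the given area centered on (lat, lon).

    Hypothetical helper: assumes a square patch and a spherical Earth.
    """
    if not -90.0 < lat < 90.0:
        raise ValueError("latitude must lie strictly between -90 and 90")
    if area_sq_km <= 0:
        raise ValueError("area must be positive")
    half_side_km = math.sqrt(area_sq_km) / 2.0
    # One degree of latitude spans a nearly constant distance, while a
    # degree of longitude shrinks by cos(latitude).
    dlat = math.degrees(half_side_km / EARTH_RADIUS_KM)
    dlon = math.degrees(
        half_side_km / (EARTH_RADIUS_KM * math.cos(math.radians(lat)))
    )
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


def pixels_per_side(area_sq_km, resolution_m=10.0):
    """Number of pixels along one side of a square patch at the given
    ground resolution (10 m by default)."""
    side_m = math.sqrt(area_sq_km) * 1000.0
    return math.ceil(side_m / resolution_m)


# Smallest and largest patch areas quoted in the text (sq.km).
smallest_side_px = pixels_per_side(2.5)
largest_side_px = pixels_per_side(582)
```

Under these assumptions, patch areas spanning two orders of magnitude translate into side lengths from under two hundred to a few thousand pixels at 10m resolution.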

System Design

The system diagram is shown above. It consists of a data generation stage (upper half) and a data modeling stage (lower half).

Install conda environment

conda env create -f environment.yml

Structure of the repository

label/ contains the labeling pipeline:

observation_selector.ipynb exports Sentinel-1/2 data from Google Earth Engine to Google Cloud Platform (GCP) Storage
create_labelbox_dataset.ipynb populates a Labelbox dataset with pointers (URLs) to the data in GCP
export_annotations.ipynb exports annotations from Labelbox to GCP as GeoJSON files
aoi_generator.ipynb populates GCP with the coordinates of buffered river boundaries
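For readers unfamiliar with the GeoJSON format used for exported annotations, the sketch below builds a single polygon annotation as a GeoJSON FeatureCollection. It is only an illustration of the format: the helper names, the property keys (label, observation_date) and the coordinates are placeholders invented for this example, not values or a schema taken from the project.

```python
import json


def polygon_feature(ring, properties):
    """Build a GeoJSON Feature holding one polygon.

    `ring` is a sequence of (lon, lat) pairs; GeoJSON orders coordinates
    as longitude first, then latitude.
    """
    coords = [list(point) for point in ring]
    if coords and coords[0] != coords[-1]:
        # GeoJSON linear rings must be closed: first == last position.
        coords.append(list(coords[0]))
    if len(coords) < 4:
        raise ValueError("a polygon ring needs at least 3 distinct points")
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [coords]},
        "properties": dict(properties),
    }


def feature_collection(features):
    """Wrap features in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}


# Placeholder footprint: coordinates and properties are illustrative only.
footprint = polygon_feature(
    [(82.10, 25.30), (82.12, 25.30), (82.12, 25.32), (82.10, 25.32)],
    {"label": "sandmine", "observation_date": "2023-01-15"},
)
geojson_text = json.dumps(feature_collection([footprint]), indent=2)
```

Storing one observation date per feature is one possible way to keep several seasonal labels for the same location apart; the actual export may organize this differently.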

train_eval/train_eval.ipynb trains and evaluates models.

inference/inference.ipynb runs predictions on datasets that have no annotations.

project_config.py holds configuration that is valid for the entire project.

Objects of the class SupervisedTrainingConfig (defined in experiment_configs/schemas.py) hold configuration for a single training run.
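The split between project-wide settings and per-run configuration objects can be sketched with a frozen dataclass. The class below is a hypothetical stand-in written for illustration: its name and fields (model_name, learning_rate, batch_size, max_epochs) are assumptions and do not reproduce the actual SupervisedTrainingConfig in experiment_configs/schemas.py.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class ExampleTrainingConfig:
    """Hypothetical per-run configuration; all fields are assumptions."""

    model_name: str
    learning_rate: float
    batch_size: int
    max_epochs: int

    def __post_init__(self):
        # Fail early on values that cannot describe a valid training run.
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.batch_size < 1 or self.max_epochs < 1:
            raise ValueError("batch_size and max_epochs must be >= 1")


# One object describes one training run.
baseline = ExampleTrainingConfig(
    model_name="example-model",
    learning_rate=1e-4,
    batch_size=16,
    max_epochs=20,
)

# A variant run changes a single field; the baseline stays untouched
# because the dataclass is frozen.
variant = replace(baseline, learning_rate=3e-4)

# A plain dict is convenient for logging the run's settings.
variant_settings = asdict(variant)
```

Keeping each run's settings in one immutable object makes it straightforward to record exactly which configuration produced a given model.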


