Jacques is a python package to detect useless images within a directory using artificial intelligence.

IRDG2OI/jacques

ACKNOWLEDGEMENT

This project is being developed as part of the G2OI project, cofinanced by the European Union, the Reunion region, and the French Republic.

Jacques: a cleaner for your Seatizen sessions

Jacques is a Python package that uses artificial intelligence to detect useless images in a directory. The model that detects useless images was trained on photos acquired with the Seatizen protocol. Have a look at the Seatizen acquisition protocol here:

DOI Python package

Installation

Prerequisites

To install Jacques in your working environment, you should already have installed a version of PyTorch (and torchvision) suited to your machine's resources. The link below helps you find the right configuration for installing PyTorch in your working environment. Make sure to install the latest available version.

https://pytorch.org/
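
Before installing Jacques, it can help to confirm that the prerequisites are present. The snippet below is a minimal sketch, not part of Jacques: the helper name check_prerequisites is hypothetical, and it only reports whether each package can be found and which version is installed, without importing it.

```python
import importlib.util
from importlib import metadata


def check_prerequisites(packages=("torch", "torchvision")):
    """Return {package: installed version, or None if the package is missing}.

    Hypothetical helper for illustration; it is not provided by Jacques.
    """
    report = {}
    for name in packages:
        if importlib.util.find_spec(name) is None:
            # The package cannot be imported from this environment.
            report[name] = None
        else:
            try:
                report[name] = metadata.version(name)
            except metadata.PackageNotFoundError:
                # Importable, but no distribution metadata was found.
                report[name] = "unknown"
    return report


if __name__ == "__main__":
    for name, version in check_prerequisites().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If either package is reported as not installed, install PyTorch from the link above before running pip install.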

Jacques installation

Jacques can be installed by executing the following lines in your terminal:

git clone https://gitlab.ifremer.fr/jd326f6/jacques.git
cd jacques
pip install .

💡 If you are working in a virtual environment, don't forget to pip install ipykernel so that the environment is visible in your favourite IDE.

Datarmor users: Jacques is already installed for you in the jacques_env environment. You first need to append a line to the end of a conda text file, as follows:

echo '/home/datawork-iot-nos/Seatizen/conda-env/jacques_cpu' >> ~/.conda/environments.txt

Then connect to the JupyterLab IDE and select the jacques_env environment to run your notebooks. If you don't see jacques_env, you might not be part of the Seatizen group and should ask one of its members for access. For your first use of Jacques, you will need to download a ResNet checkpoint manually from the terminal (Datarmor users only):

wget https://download.pytorch.org/models/resnet50-11ad3fa6.pth
mkdir -p ~/.cache/torch/hub/checkpoints
mv resnet50-11ad3fa6.pth ~/.cache/torch/hub/checkpoints/resnet50-11ad3fa6.pth

Quickstart

👨‍🎓 All the tutorials (notebooks) are available here: Jacques examples

The checkpoint used to load the classification model is available here: DOI. When you run the code, the latest checkpoint release is downloaded automatically if it has not been downloaded already.

Classify images in one directory

👨‍🎓 Tutorial: classify one directory

To classify a folder of images, run the code below in a Python script or a notebook:

from jacques.inference import predictor
results = predictor.classify_useless_images(folder_path='/path/to/your/image/folder', download=False, ckpt_path='/path/to/your/checkpoint/')

Jacques automatically selects the image files in your folder and uses deep learning to predict whether each image is useful. It returns a pandas dataframe with 3 columns: dir (directory), image (file name) and class (useless or useful). Here is an example of the results returned by classify_useless_images():

   dir                          image                                                      class
0  /path/to/your/image/folder/  session_2018_03_04_kite_Pointe_Esny_G0032421.JPG           useful
1  /path/to/your/image/folder/  session_2018_03_12_kite_Le_Morne_G0029296.JPG              useful
2  /path/to/your/image/folder/  session_2022_10_20_aldabra_plancha_body_v1A_00_1_399.jpeg  useless
3  /path/to/your/image/folder/  session_2021_01_13_Hermitage_AllRounder_image_001196.jpg   useful
4  /path/to/your/image/folder/  session_2019_09_20_kite_Le_Morne_avec_Manu_G0070048.JPG    useful
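
Because the results are a regular pandas dataframe, they can be summarised with standard pandas calls. The sketch below uses a small hand-built dataframe as a stand-in for the real output (the file names are made up); only the column names dir, image and class are taken from the example above.

```python
import pandas as pd

# Hypothetical stand-in for the dataframe returned by classify_useless_images().
results = pd.DataFrame({
    "dir": ["/path/to/your/image/folder/"] * 3,
    "image": ["a.jpg", "b.jpg", "c.jpg"],
    "class": ["useful", "useless", "useful"],
})

# Number of images per label.
counts = results["class"].value_counts()

# Share of images labelled useless (between 0 and 1).
useless_ratio = (results["class"] == "useless").mean()

print(counts)
print(f"Share of useless images: {useless_ratio:.0%}")
```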

Classify images in several directories

👨‍🎓 Tutorial: classify multiple directories

To classify images contained in several directories, make a list containing the paths to your directories and run the following code:

from jacques.inference import predictor
import pandas as pd

list_of_dir = ['path/to/dir/1/', 'path/to/dir/2/', 'path/to/dir/3/']

results_of_all_dir = pd.DataFrame(columns = ['dir', 'image', 'class'])

for directory in list_of_dir:
    results = predictor.classify_useless_images(folder_path=directory, download=False, ckpt_path='/checkpoint/path.ckpt')
    results_of_all_dir = pd.concat([results_of_all_dir, results], axis=0, ignore_index=True)
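
As a variation on the loop above, the per-directory results can be collected in a list and concatenated once at the end. This avoids growing a dataframe step by step, and recent pandas versions may emit a warning when concatenating with an empty dataframe. The helper below is a sketch: classify_directories is a hypothetical name, and the classifier is passed in as a callable so the function does not depend on Jacques itself.

```python
import pandas as pd


def classify_directories(directories, classify):
    """Run `classify` on each directory and stack the results into one dataframe.

    `classify` is any callable that takes a directory path and returns a
    dataframe with the columns 'dir', 'image' and 'class'.
    Hypothetical helper for illustration; it is not provided by Jacques.
    """
    frames = [classify(directory) for directory in directories]
    if not frames:
        # Nothing to classify: return an empty dataframe with the expected columns.
        return pd.DataFrame(columns=["dir", "image", "class"])
    return pd.concat(frames, axis=0, ignore_index=True)


# Usage with Jacques (commented out so the sketch runs without the package):
# from jacques.inference import predictor
# results_of_all_dir = classify_directories(
#     list_of_dir,
#     lambda d: predictor.classify_useless_images(
#         folder_path=d, download=False, ckpt_path='/checkpoint/path.ckpt'),
# )
```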

Classify multiple Seatizen sessions all at once

👨‍🎓 Tutorial: classify multiple Seatizen sessions

For Seatizen sessions that follow the famous and unique directory tree (shown below), you can classify the images of these sessions directly.

Seatizen tree (accepted in 02/2023) : 
session_YYYY_MM_DD_location_device_nb
│
└───DCIM
│   │
│   └───IMAGES
│   └───VIDEOS
│   
└───GPS
└───SENSORS
└───METADATA
└───PROCESSED_DATA
│   │
│   └───BATHY
│   └───FRAMES
│       │   session_YYYY_MM_DD_location_device_nb1.jpeg
│       │   session_YYYY_MM_DD_location_device_nb2.jpeg
│       │   ...
│   └───IA
│   └───PHOTOGRAMMETRY
│
session_YYYY_MM_DD_location_device_nb
│ ...

Use the following code to classify a Seatizen tree:

from jacques.inference import predictor
import os
import pandas as pd

list_of_sessions = ['/path/to/session_YYYY_MM_DD_location_device_nb', '/path/to/session_YYYY_MM_DD_location_device_nb']

results_of_all_sessions = pd.DataFrame(columns = ['dir', 'image', 'class'])
for session in list_of_sessions:
    # No leading slash on the second argument: os.path.join would otherwise discard `session`.
    results = predictor.classify_useless_images(folder_path=os.path.join(session, 'PROCESSED_DATA/FRAMES/'), download=False, ckpt_path='/checkpoint/path')
    results_of_all_sessions = pd.concat([results_of_all_sessions, results], axis=0, ignore_index=True)
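
When many sessions sit under a common root folder, the list of sessions can be built automatically rather than typed by hand. The sketch below is an illustration under stated assumptions: the helpers frames_dir and find_sessions are hypothetical, and the regular expression only checks the session_YYYY_MM_DD_ prefix from the tree above.

```python
from pathlib import Path
import re

# Assumed pattern: "session_YYYY_MM_DD_" followed by free text (location, device, number).
SESSION_PATTERN = re.compile(r"^session_\d{4}_\d{2}_\d{2}_.+$")


def frames_dir(session):
    """Return the PROCESSED_DATA/FRAMES folder of a session (hypothetical helper)."""
    return Path(session) / "PROCESSED_DATA" / "FRAMES"


def find_sessions(root):
    """List session folders under `root` that contain PROCESSED_DATA/FRAMES."""
    return sorted(
        p for p in Path(root).iterdir()
        if p.is_dir() and SESSION_PATTERN.match(p.name) and frames_dir(p).is_dir()
    )
```

The returned paths can be used as list_of_sessions in the loop above, for example list_of_sessions = [str(p) for p in find_sessions('/path/to/root')].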

Move images

Once images have been classified as useless or useful, you can copy or move (cut) them to another directory. If you are working with the Seatizen tree, follow the method in the tutorial provided to move images while keeping the session name as a subdirectory.

from jacques.inference import output

output.move_images(results,
           dest_path = '/destination/path/directory/to/move/images/',
           who_moves = 'useless',
           copy_or_cut = 'cut'
           )
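
Since cutting files changes the source folder, it may be worth listing what would be affected before calling move_images(). The sketch below is not how Jacques moves files: preview_move is a hypothetical helper that only reads the results dataframe and touches no file. Its who_moves parameter mirrors the argument shown above.

```python
import os
import pandas as pd


def preview_move(results, who_moves="useless"):
    """Return the full paths of the images labelled `who_moves`, without touching any file.

    Hypothetical helper for illustration; it is not provided by Jacques.
    """
    selected = results[results["class"] == who_moves]
    return [os.path.join(d, name) for d, name in zip(selected["dir"], selected["image"])]


# Small made-up dataframe with the same columns as the Jacques results.
example = pd.DataFrame({
    "dir": ["/a/", "/a/"],
    "image": ["x.jpg", "y.jpg"],
    "class": ["useless", "useful"],
})
print(preview_move(example))
```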

Display results (optional)

You can display some or all of the results by passing the results dataframe returned by classify_useless_images() to display_predictions().

from jacques.inference import output

output.display_predictions(results, image_nb=5)

Export results (optional)

Results are returned as a pandas dataframe. For instance, to export the results in CSV format, just add:

results.to_csv('path/to/export/results.csv', index = False, header = True)

Have Fun!