This folder contains scripts that enable an operator or analyst to run data- and model-related SkyScan functions from a script rather than a series of Jupyter notebooks.
What do these scripts do? They largely implement machine learning-related functionality so that a SkyScan user can build trained models that do plane detection and classification. As in many ML applications, much of the code handles data preparation rather than model training or inference.
First, enter the running Docker container.
sudo docker exec -it ml-model_jupyter_1 /bin/bash
cd scripts
Next, install required libraries.
python install.py
To get help, type:
python main.py --help
The script exposes several functions, each selected by a command-line flag. Each function reads its required input values from the config.ini configuration file.
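As a rough illustration of how a function might read its inputs from the configuration file, here is a minimal sketch using Python's standard `configparser`. The section name `file_names`, the example values, and the helper `load_required` are assumptions made for this example; check config.ini and main.py for the actual layout.

```python
import configparser

# Hypothetical config.ini contents; the section name and values are assumptions.
EXAMPLE_CONFIG = """
[file_names]
dataset_name = test
image_directory = foo
"""


def load_required(config_text, section, keys):
    """Return the requested keys from one section, failing early if any are missing."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.has_section(section):
        raise KeyError(f"missing section [{section}]")
    missing = [key for key in keys if not parser.has_option(section, key)]
    if missing:
        raise KeyError(f"missing keys in [{section}]: {missing}")
    return {key: parser.get(section, key) for key in keys}


values = load_required(EXAMPLE_CONFIG, "file_names", ["dataset_name", "image_directory"])
print(values)
```

Failing early on a missing key gives a clearer error than a failure partway through a long-running step.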
To prepare the data for analysis, first provide a 'dataset_name' value and an 'image_directory' file location value in the configuration file. Then run:
python main.py --prep
This command creates a Voxel51 dataset based on the plane images you provide and enriches the plane data with information from publicly accessible FAA data.
Because the character string identifying the plane model can vary widely even for the same plane model, the following command attempts to create a standardized model identifier for each plane.
python main.py --normalize
Edit the plane_model_dict.json file to add additional normalizations.
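The idea behind normalization can be sketched as a dictionary lookup on a cleaned-up model string. The mapping below is illustrative only: the real plane_model_dict.json may use a different structure and different identifiers, and `normalize_model` is a hypothetical helper, not the function used by main.py.

```python
import json

# Hypothetical mapping from raw model strings to a standardized identifier.
# The real plane_model_dict.json may be structured differently.
EXAMPLE_MAPPING_JSON = '{"737-800": "B738", "737 800": "B738", "A320-214": "A320"}'


def normalize_model(raw, mapping):
    """Collapse whitespace, upper-case the string, then look it up in the mapping."""
    key = " ".join(raw.strip().upper().split())
    # Unknown strings fall through unchanged (after cleanup) so nothing is lost.
    return mapping.get(key, key)


# Upper-case the keys once so lookups are case-insensitive.
mapping = {k.upper(): v for k, v in json.loads(EXAMPLE_MAPPING_JSON).items()}

print(normalize_model("  737-800 ", mapping))
print(normalize_model("a320-214", mapping))
print(normalize_model("Cessna  172", mapping))
```

Under this assumed structure, adding a normalization means adding one more raw-string entry that points at an existing standardized identifier.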
To upload training or evaluation images to Labelbox for manual labeling, use the appropriate command below.
python main.py --upload_train
python main.py --upload_eval
Users must provide in the config their Labelbox API key, the Labelbox dataset name, the Labelbox project name, and the name of the local dataset to be uploaded. The user will first need to create a Labelbox account, project, and dataset.
If an upload is disrupted or paused, resume it with one of these commands, which otherwise behave like the upload commands above:
python main.py --resume_upload_train
python main.py --resume_upload_eval
The same configuration arguments as above are used.
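Conceptually, resuming an upload means skipping what has already been sent. The sketch below shows that idea with plain lists; it does not call Labelbox, and `remaining_uploads` is a hypothetical helper rather than the logic main.py actually uses.

```python
def remaining_uploads(all_files, already_uploaded):
    """Return the files still to upload, preserving the original order."""
    done = set(already_uploaded)
    return [name for name in all_files if name not in done]


all_files = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
already_uploaded = ["a.jpg", "c.jpg"]
todo = remaining_uploads(all_files, already_uploaded)
print(todo)
```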
After using Labelbox to do hand annotation, merge the annotations with the Voxel51 dataset. First, download the labels from Labelbox in JSON format. Then run:
python main.py --download
The configuration file must contain values for the local Voxel51 dataset name and also the path of the JSON exported from Labelbox.
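The merge step amounts to matching exported records to local samples by a shared key. The sketch below uses plain dictionaries; the `file_name` and `label` fields are assumptions for illustration, and the real Labelbox export schema and the Voxel51 sample API will differ.

```python
import json

# Hypothetical export: a list of records keyed by file name. The real Labelbox
# export schema may differ, so adapt the field names to your JSON.
EXAMPLE_EXPORT = '[{"file_name": "plane_001.jpg", "label": "plane"}]'


def merge_labels(samples, exported_json):
    """Attach the exported label to each sample with a matching file name."""
    by_name = {rec["file_name"]: rec["label"] for rec in json.loads(exported_json)}
    merged = []
    for sample in samples:
        updated = dict(sample)  # copy so the input list is left untouched
        updated["label"] = by_name.get(sample["file_name"])  # None if unlabeled
        merged.append(updated)
    return merged


samples = [{"file_name": "plane_001.jpg"}, {"file_name": "plane_002.jpg"}]
merged = merge_labels(samples, EXAMPLE_EXPORT)
print(merged)
```

Samples with no matching record keep a `None` label here, which makes unlabeled images easy to find afterwards.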
Train a deep learning model to detect planes.
python main.py --train
The configuration file must contain the dataset_name, the model's training_name, the base_model, and the num_train_steps.
Export the trained deep learning model. The TF Saved Model format will be used.
python main.py --export_model
The configuration file must contain the dataset_name, the model's training_name, and the model's base_model.
This will run the plane detection model against all of the images in the Voxel51 dataset. A new label will be created with the results.
python main.py --predict
The configuration file must contain the dataset_name, the model's training_name, and the prediction_field.
This is similar to --predict, except a windowing approach will be used. This approach chops up the larger image into smaller tiles to improve detection of small objects.
python main.py --predict_tiled
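The windowing approach can be pictured as computing a grid of overlapping tile boxes, running detection on each tile, and shifting the detections back by the tile's offset. Below is a minimal sketch of the box computation only. The tile size and overlap values are assumptions, and `tile_boxes` is a hypothetical helper, not the function used by main.py.

```python
def tile_boxes(width, height, tile, overlap):
    """Return (left, top, right, bottom) boxes that cover a width x height image."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be >= 0 and smaller than the tile size")
    stride = tile - overlap

    def starts(length):
        # A single tile is enough when the image is no larger than the tile.
        if length <= tile:
            return [0]
        positions = list(range(0, length - tile, stride))
        positions.append(length - tile)  # final tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


boxes = tile_boxes(1000, 600, tile=512, overlap=64)
for box in boxes:
    print(box)
```

The overlap gives an object that straddles a tile boundary a chance to appear whole in at least one tile.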
This command uses the results from running the airplane detection model and pulls in the aircraft model information from the FAA to create a dataset of detected aircraft labeled with specific model information.
python main.py --build_multi_class_dataset
*Note: there are a couple of different approaches for constructing the dataset. They vary in how unique the aircraft bodies are between the Train and Test portions of the dataset. See customvox51.py for the available functions, and update main.py if you want to use a different one.*
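One way to keep aircraft bodies unique across the two portions is to assign every image of a given aircraft to the same split. The sketch below does this with a deterministic hash of an aircraft identifier. It is an illustration of the idea, not one of the functions in customvox51.py, and the `aircraft_id` values are hypothetical.

```python
import hashlib


def split_by_aircraft(samples, test_fraction=0.2):
    """Split (image_path, aircraft_id) pairs so each aircraft lands in one split only."""
    train, test = [], []
    for path, aircraft_id in samples:
        # Hash the identifier so the assignment is stable across runs.
        digest = hashlib.sha256(aircraft_id.encode("utf-8")).digest()
        bucket = digest[0] / 256.0  # value in [0, 1)
        target = test if bucket < test_fraction else train
        target.append((path, aircraft_id))
    return train, test


samples = [
    ("img1.jpg", "aircraft-a"),
    ("img2.jpg", "aircraft-a"),
    ("img3.jpg", "aircraft-b"),
    ("img4.jpg", "aircraft-c"),
    ("img5.jpg", "aircraft-c"),
]
train, test = split_by_aircraft(samples)
print(len(train), len(test))
```

Splitting by image instead would let the same aircraft appear in both portions, which tends to inflate test scores.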
If you would like to use the YOLO notebooks to train a multi-class object detection model, you can export a YOLO version of the dataset:
python main.py --export_yolo_dataset
This will train a model that can detect different models of aircraft. After the model has finished training, you can export the model and then run one of the prediction commands to have it run against all of the images.
python main.py --train_multi_class
To see how well the multi-class aircraft model detector works, use this command to run it against the Test set of images that were reserved when the dataset was created. Graphs and a confusion matrix will be generated in the /dataset-export folder.
python main.py --evaluate
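For readers unfamiliar with confusion matrices, here is a small self-contained sketch of how one is tallied from true and predicted labels. The labels are made up for illustration, and this is not the evaluation code used by main.py.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Return (labels, matrix) where matrix[i][j] counts true label i predicted as label j."""
    labels = sorted(set(true_labels) | set(predicted_labels))
    counts = Counter(zip(true_labels, predicted_labels))
    matrix = [[counts[(t, p)] for p in labels] for t in labels]
    return labels, matrix


# Illustrative labels only.
true_labels = ["A320", "A320", "B738", "B738"]
predicted_labels = ["A320", "B738", "B738", "B738"]
labels, matrix = confusion_matrix(true_labels, predicted_labels)
print(labels)
for row in matrix:
    print(row)
```

Rows are true classes and columns are predicted classes, so off-diagonal entries show which aircraft models the detector confuses with each other.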
To help the user gain a sense of potential command sequences that could be useful, we provide one example below.
First, enter the Docker container.
sudo docker exec -it ml-model_jupyter_1 /bin/bash
Next, install required dependencies.
pip install -r requirements.txt
After entering required values in the configuration file (e.g. 'dataset_name' = 'test' and 'image_directory' = 'foo'), run the command:
python main.py --prep
Then perform normalization on the plane model data.
python main.py --normalize
After creating a Labelbox account, API key, project name, and dataset name and then entering required values in the configuration file (e.g. api_key = 'password123', "labelbox" dataset_name = 'labelbox_data', project_name = 'labelbox_project', "filenames" dataset_name = 'test'), run a command to upload the normalized data to Labelbox.
python main.py --upload_train
After performing labeling in Labelbox, export the results as a JSON file. Enter the required values in the configuration file (e.g. "file_names" "dataset_name" = 'test', exported_json_path = 'foo/bar.json'), then merge the labels into the existing Voxel51 dataset with the following command:
python main.py --download
After entering required values in the configuration file (e.g. 'dataset_name' = 'test', training_name = 'test_model', base_model = 'efficientdet-d0', and num_train_steps = 40000), run this command to train a model:
python main.py --train
pytest