Kitti2TFrecords

Multiprocessing script for conversion of KITTI dataset to Tensorflow records.

The script extracts point cloud, image, calibration, and label data and encodes them as tfrecords. With the default settings, one tfrecord holds 128 samples and is around 320 MB. You can set the number of samples per tfrecord and the number of training samples to encode. The configuration parameters of the script are:

Kitti2TFrecords/kitti2tfrecords.py, lines 10 to 18 in 7979b1e:

    _SOURCE_FOLDER = "/data/datasets/KITTI"
    _DESTINATION_FOLDER = "/data/datasets/KITTI_tfrecords-cropped-lw"
    _TRAINING_SUBFOLDERS = ["velodyne", "label_2", "image_2", "calib"]
    _TESTING_SUBFOLDERS = ["velodyne", "image_2", "calib"]
    _FRAMES_PER_TFRECORD = 128
    _SAMPLES_FOR_TRAINING = 3713
    _CONVERT_TESTING_SAMPLES = True
    _OBJ_TYPE_MAP = {"Car": 1, "Pedestrian": 2, "Cyclist": 4}
    _VISIBLE_GPUS = [0]
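To illustrate how _FRAMES_PER_TFRECORD and _SAMPLES_FOR_TRAINING interact, here is a minimal sketch that splits sample indices into per-file chunks. The helper name chunk_indices is hypothetical and not taken from the script, and the script itself may handle the final partial chunk differently.

```python
def chunk_indices(num_samples, frames_per_tfrecord):
    """Yield one range of sample indices per output tfrecord file."""
    for start in range(0, num_samples, frames_per_tfrecord):
        # The last chunk is shorter when num_samples is not a multiple
        # of frames_per_tfrecord.
        yield range(start, min(start + frames_per_tfrecord, num_samples))


# Default values from the configuration above.
chunks = list(chunk_indices(3713, 128))
print(len(chunks))      # 30 files
print(len(chunks[-1]))  # the last file holds 1 sample
```

With the defaults, 29 files are full (29 * 128 = 3712 samples) and one more file holds the remaining sample. Independent chunks like these are also a natural unit of work to hand to separate processes.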

The 3D bounding box elements are encoded in this order:

[x, y, z, length, width, height, rotation]

The order can be easily modified in the function get_bbox3d().
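As a rough illustration of this ordering, the sketch below parses one line of a KITTI label_2 file and arranges the values as [x, y, z, length, width, height, rotation]. It is not the script's get_bbox3d(): the function name parse_label_line is hypothetical, the label values are illustrative, and the box stays in the camera frame used by the KITTI labels, whereas the script may transform it into point cloud coordinates.

```python
# Same mapping as _OBJ_TYPE_MAP in the configuration.
OBJ_TYPE_MAP = {"Car": 1, "Pedestrian": 2, "Cyclist": 4}


def parse_label_line(line):
    """Return (class_id, bbox3d) for one KITTI label line, or None if skipped.

    KITTI label columns: type, truncated, occluded, alpha, 2D bbox (4 values),
    dimensions (height, width, length), location (x, y, z), rotation_y.
    """
    fields = line.split()
    obj_type = fields[0]
    if obj_type not in OBJ_TYPE_MAP:
        return None  # e.g. "DontCare", "Van", "Tram"
    height, width, length = (float(v) for v in fields[8:11])
    x, y, z = (float(v) for v in fields[11:14])
    rotation = float(fields[14])
    # Change the order of this list to change the encoded order.
    return OBJ_TYPE_MAP[obj_type], [x, y, z, length, width, height, rotation]


# Illustrative line in KITTI label format (values are made up).
example = "Car 0.00 0 -1.50 100.0 120.0 200.0 220.0 1.5 1.6 3.9 2.0 1.7 20.0 -1.6"
print(parse_label_line(example))
```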

The converted dataset can be loaded with tf.data.TFRecordDataset.
An example/test function, convert_from_tfrecords(), is also available for parsing the encoded tfrecord data.
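A minimal loading sketch with tf.data might look like the following. The feature keys "pointcloud" and "bbox3d", their dtypes, and the flat float layout are assumptions made for illustration only. The actual keys are defined by the encoding code in kitti2tfrecords.py and tfrecordsutils.py, so check convert_from_tfrecords() for the real feature description.

```python
import tensorflow as tf

# Hypothetical feature spec: replace the keys and dtypes with the ones
# actually written by the conversion script.
FEATURE_SPEC = {
    "pointcloud": tf.io.VarLenFeature(tf.float32),
    "bbox3d": tf.io.VarLenFeature(tf.float32),
}


def parse_example(serialized):
    """Decode one serialized tf.train.Example into dense tensors."""
    parsed = tf.io.parse_single_example(serialized, FEATURE_SPEC)
    # Assumes 4 floats per point (x, y, z, intensity) and 7 floats per box.
    points = tf.reshape(tf.sparse.to_dense(parsed["pointcloud"]), [-1, 4])
    boxes = tf.reshape(tf.sparse.to_dense(parsed["bbox3d"]), [-1, 7])
    return points, boxes


def load_dataset(file_pattern):
    """Build a dataset of (points, boxes) pairs from matching tfrecord files."""
    files = tf.data.Dataset.list_files(file_pattern, shuffle=False)
    dataset = tf.data.TFRecordDataset(files)
    return dataset.map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
```

For example, load_dataset("/data/datasets/KITTI_tfrecords-cropped-lw/*") would read every file in the destination folder, assuming that folder contains only tfrecord files.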

Usage

Download the dataset from https://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d

Velodyne point clouds (29 GB): Used as input laser information
Training labels of the object data set (5 MB): Used as input label
Camera calibration matrices of the object data set (16 MB): For synchronizing images and point clouds, cropping point clouds, projecting from camera to point cloud coordinates, and visualizing the predictions
Left color images of the object data set (12 GB): For cropping point clouds, projecting from camera to point cloud coordinates, and visualizing the predictions

Unzip the files into a folder. Set the source and destination directories (_SOURCE_FOLDER and _DESTINATION_FOLDER) in kitti2tfrecords.py, then run the script to convert the dataset into TF records.

NOTE: The script can also crop the point clouds. The Velodyne scans cover 360 degrees, while the RGB cameras have a much narrower field of view, and KITTI only provides labels for objects that are visible in the images. Points that project outside the image are therefore usually removed. With cropped data, one tfrecord is around 125 MB.
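The cropping idea can be sketched as follows: project each point into the left color image with the calibration matrices and keep only the points that land inside the image and in front of the camera. This is an illustrative version under stated assumptions, not the script's implementation. The function name crop_to_image is hypothetical, and the matrices are assumed to be already parsed from the calib file as Tr_velo_to_cam (3x4), R0_rect (3x3), and P2 (3x4).

```python
import numpy as np


def crop_to_image(points, tr_velo_to_cam, r0_rect, p2, width, height):
    """Keep only the points whose projection falls inside the image.

    points: (N, 3+) array with x, y, z in Velodyne coordinates; extra
    columns such as intensity are carried along unchanged.
    """
    n = points.shape[0]
    xyz1 = np.hstack([points[:, :3], np.ones((n, 1))])  # (N, 4) homogeneous
    cam = r0_rect @ (tr_velo_to_cam @ xyz1.T)           # (3, N) rectified camera frame
    cam1 = np.vstack([cam, np.ones((1, n))])            # (4, N) homogeneous
    uvw = p2 @ cam1                                     # (3, N) image plane
    depth = uvw[2]
    in_front = depth > 0
    # Avoid dividing by zero or negative depth; those points are masked out anyway.
    safe_depth = np.where(in_front, depth, 1.0)
    u = uvw[0] / safe_depth
    v = uvw[1] / safe_depth
    mask = in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return points[mask]
```

The depth check matters because points behind the camera can still project to valid pixel coordinates after the perspective division.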
