Speed prediction uses a 3D CNN model for video feature extraction and applies regression to predict the speed corresponding to the input frames.
Video feature extraction repository
The 3D model was originally designed for video classification; we modified it to perform speed regression and to work with the speed dataset.
3D CNN results based on blackbox data that we collected ourselves
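One common way to turn a video classification network into a regressor is to replace its final class-score layer with a single-output linear layer and train with a regression loss. The sketch below illustrates that idea with a tiny stand-in network. It is not the repository's actual model: the class name `TinySpeedNet3D`, the layer sizes, the clip shape, and the choice of L1 loss are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class TinySpeedNet3D(nn.Module):
    """Hypothetical stand-in for a 3D CNN backbone with a regression head."""

    def __init__(self):
        super().__init__()
        # Minimal 3D feature extractor (a real backbone would be a 3D ResNet).
        self.features = nn.Sequential(
            nn.Conv3d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool3d(1),  # pool over time, height, width
        )
        # A classifier would use nn.Linear(8, num_classes) here;
        # for regression we output a single value (speed in m/s).
        self.fc = nn.Linear(8, 1)

    def forward(self, clip):
        x = self.features(clip).flatten(1)  # (batch, 8)
        return self.fc(x).squeeze(1)        # (batch,)


model = TinySpeedNet3D()
# Assumed clip layout: (batch, channels, frames, height, width)
clips = torch.randn(2, 3, 8, 32, 32)
speeds = torch.tensor([10.0, 12.5])  # made-up target speeds in m/s

pred = model(clips)
loss = nn.L1Loss()(pred, speeds)  # L1 loss matches the MAE metric
loss.backward()
```

Training with L1 loss optimizes the same quantity that is reported as MAE, which is one reason it is a natural choice here; MSE loss would be an equally valid alternative.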
- This code uses PyTorch for training and testing. You may need to run "conda install pytorch==1.2.0".
- To load the Comma2k19 dataset, you need the openpilot tools (https://github.com/commaai/openpilot-tools). Compared to cv2.VideoCapture(), these tools reduce data loading time and CPU computation cost.
CUDA_VISIBLE_DEVICES=5 python3 main_speed_node3.py
- For speed prediction on Comma2k19 using 3D ResNet video feature extraction, the resnet18 model currently gives the best performance. After 11 epochs, resnet18 achieved a mean absolute error (MAE) of 1.5 m/s. The training and testing procedure is shown in the following table:
- So far, no other algorithms have been applied to Comma2k19, so for comparison we need to look at other datasets. The paper "Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks" reports best MAE values of 1.6 m/s on the Udacity dataset and 0.7 m/s on the comma.ai dataset.
This is the leaderboard from "Learning to Steer by Mimicking Features from Heterogeneous Auxiliary Networks":
- These results suggest that the 1.5 m/s MAE of resnet18 on Comma2k19 is reasonably competitive, although the numbers come from different datasets and are not directly comparable.
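For reference, the MAE figures quoted above are the average absolute difference between predicted and ground-truth speeds. A minimal sketch of that computation follows; the function name and the sample values are made up for illustration and are not results from this repository.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over all samples (same unit as inputs)."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


# Made-up example speeds in m/s
predicted_speeds = [10.0, 12.0, 15.5]
actual_speeds = [11.0, 12.0, 14.0]

mae = mean_absolute_error(predicted_speeds, actual_speeds)
print(f"MAE: {mae:.3f} m/s")
```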