ir-lab/LanguagePolicies

Language-Conditioned Imitation Learning for Robot Manipulation Tasks

This repository is the official implementation of Language-Conditioned Imitation Learning for Robot Manipulation Tasks, which was accepted to NeurIPS 2020 as a spotlight presentation.

Model figure

When using this code and/or model, we would appreciate the following citation:

@inproceedings{NEURIPS2020_9909794d,
 author = {Stepputtis, Simon and Campbell, Joseph and Phielipp, Mariano and Lee, Stefan and Baral, Chitta and Ben Amor, Heni},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {H. Larochelle and M. Ranzato and R. Hadsell and M.F. Balcan and H. Lin},
 pages = {13139--13150},
 publisher = {Curran Associates, Inc.},
 title = {Language-Conditioned Imitation Learning for Robot Manipulation Tasks},
 url = {https://proceedings.neurips.cc/paper_files/paper/2020/file/9909794d52985cbc5d95c26e31125d1a-Paper.pdf},
 volume = {33},
 year = {2020}
}

Index

  1. Environment Setup
  2. Quick Start
  3. Results
  4. Changelog

Environment Setup

Local Setup

Our code is tested on Ubuntu 22.04 with Python 3.10 (note: Ubuntu 20.04 requires Python 3.8). Running our code on macOS or Windows is currently not supported, but may work. To set up our code, first install the required system packages:

sudo apt install libcurl4-openssl-dev libssl-dev libeigen3-dev python3-dev mesa-utils libgl1-mesa-glx

To install Python requirements (this is for Ubuntu 22.04 and Python 3.10):

conda env create -f environment.yml 

This will set up a basic environment named "lp". The Python version matters because a from-source KDL build uses the system Python, which is 3.10 on Ubuntu 22.04. While we do not provide configurations for them, you should be able to run this code on other Linux versions. If you are not using the setup described above, you may need to re-compile the protobuf files, which can be done via the compile.sh script in utils/proto. For the additional setup, please activate the environment:

conda activate lp
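To confirm that the activated environment uses the expected interpreter, a quick, optional check (the exact patch version will vary):

import sys
# Expect a 3.10.x interpreter, matching the system Python that a from-source KDL build links against.
print(sys.version)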

The following modules need to be installed manually:

  • CoppeliaSim: Downloading and installing the Player version is sufficient, as long as you do not want to change the simulation environment itself. Our code was tested with version 4.1 (on Ubuntu 22.04, download the Ubuntu 20.04 version, which appears to work).

After downloading and extracting CoppeliaSim, set the following environment variables (please adjust the path accordingly):

export COPPELIASIM_ROOT=/<path>/<to>/CoppeliaSim_Edu_V4_1_0_Ubuntu20_04
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$COPPELIASIM_ROOT
export QT_QPA_PLATFORM_PLUGIN_PATH=$COPPELIASIM_ROOT
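As a quick, optional sanity check (a minimal sketch, not part of the repository), you can confirm that these variables are visible to the Python process that will later launch PyRep:

import os

for var in ("COPPELIASIM_ROOT", "LD_LIBRARY_PATH", "QT_QPA_PLATFORM_PLUGIN_PATH"):
    # Print each CoppeliaSim-related variable, or a placeholder if it is missing.
    print(f"{var} = {os.environ.get(var, '<not set>')}")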

Then, install the remaining dependencies:

  • PyRep: Please clone their repository and check out commit 96c0b034ee21ab5e6ba0942c4d57993a8670379a. Then, the package can be installed with: pip install . (note the "." at the end). You can test if PyRep is set up correctly with a small script found in utils/test_pyrep.py

  • Orocos KDL: The Python wrapper has to match the solver version installed on your system. We strongly suggest installing both components from the git repository. Our code was tested with commit 86c7893234aeccec3b9bf24cf20de9380d64bdf3. Please follow their installation instructions for orocos_kdl and python_orocos_kdl. After running make in the Python package, you should have a file called PyKDL.so; copy it to your Python interpreter's site-packages. You can check whether KDL is ready to be used by running the code in utils/test_KDL.py (a minimal combined sanity check for PyRep and KDL is also sketched after this list).
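The repository's own test scripts (utils/test_pyrep.py and utils/test_KDL.py) remain the reference; the following is only a minimal sketch to confirm that both bindings import and run, assuming CoppeliaSim and the environment variables above are set up:

from pyrep import PyRep
import PyKDL

# PyRep: launch an empty CoppeliaSim instance headlessly, advance one
# simulation step, and shut it down again.
pr = PyRep()
pr.launch(headless=True)
pr.start()
pr.step()
pr.stop()
pr.shutdown()

# PyKDL: construct a simple frame to confirm the compiled bindings load.
frame = PyKDL.Frame(PyKDL.Rotation.RPY(0.0, 0.0, 0.0), PyKDL.Vector(0.0, 0.0, 0.1))
print("PyKDL frame:", frame)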

To run the model and simulation, you need to download the dataset, pre-trained model, and other required files. The required files can be downloaded from here. The downloaded file contains a pre-trained model, the processed training dataset (and other supporting files), and the test-data used for evaluation. The downloaded file should be placed next to the root folder of this repository. The folder LanguagePolicies and the extracted GDrive should reside in the same directory. Additionally we provide our full raw data as an optinoal download here (Note: 6GB download and ~100GB extracted)
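For reference, the expected layout (assuming the archive extracts to a folder named GDrive) looks like this:

<parent directory>/
    LanguagePolicies/    (this repository)
    GDrive/              (extracted download: pre-trained model, processed dataset, test data)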

Quick Start

A detailed description of the training and evaluation process can be found on our Details: Training and Evaluation page. If you are interested in collecting data, please refer to our Details: Data Collection page.

Training

To train the model with default parameters, run the following command in this repository's root directory.

python main.py

The trained model will be located in Data/Model, and TensorBoard logs will be in Data/TBoardLog. Overall, training will take around 35 hours, depending on your hardware. A GPU is not required; our model has been trained on a node with two Intel Xeon E5-2699A v4 CPUs @ 2.40GHz. Please note that using a GPU is not necessarily beneficial due to the intricacies of our model. However, nothing prevents the use of GPUs, and the performance will depend on your specific hardware.

Training parameters can be found in lines 25-37 of the main.py file and are customizable according to your preferences. The parameters currently set in the file are the ones used to train our model and produce the results reported in the paper. It is important to note that a few random initializations, i.e. training runs, may fail to converge. While this is uncommon, it can be detected early in the training process if the attention loss does not converge at all. If this happens, restart the training with a different seed; since the seeds are chosen randomly, simply restarting the training process should suffice.

Observing the Training Process

You can observe the training progress via TensorBoard. You should be able to start it with the following command:

tensorboard --logdir ./Data/TBoardLog

In TensorBoard, you can observe various training metrics, including the losses for all auxiliary tasks. Specifically, the training progress of the Attention module can be visualized through the provided bar plot, which offers an intuitive representation. The green bar represents the ground-truth target, while the blue bars indicate the likelihood of the predicted target object. This visualization should begin displaying the correct tendency after approximately 15 epochs of training.

Another noteworthy metric is the Trajectory plot, illustrating the trajectory across the robot's seven degrees of freedom (DoF) (six DoF for the robot and one for the gripper). The ground-truth motions are depicted by green lines, while the blue lines represent the robot's motions when utilizing the policy.

The final plot presents the predicted phase, indicating the progression of the task. The green-dashed line marks the point at which the ground-truth trajectory completed the task (the remaining trajectory is a result of padding), whereas the blue-dashed line represents the policy's prediction of task completion.

By default, the training is configured for 200 epochs. However, it is worth noting that usable policies are typically attained around the 150-epoch mark.

Evaluation

Our model can be live-evaluated in CoppeliaSim. The model is served to the simulation via gRPC (ROS is no longer required; see the Changelog). First, the pre-trained model will be loaded from the GDrive directory and provided as a service with

python service.py

After the service has been started, the model can be evaluated in the simulator with

python val_model_vrep.py

This will create a file val_result.json after ten evaluation runs (results in our paper are from 100 runs; this value can be changed). Results can be printed in the terminal by running:

python viz_val_vrep.py
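If you want to inspect val_result.json directly, a minimal sketch (the exact schema is produced by val_model_vrep.py and is not assumed here):

import json

# Load the evaluation results written by val_model_vrep.py and show their
# top-level structure without assuming a particular schema.
with open("val_result.json") as f:
    results = json.load(f)

if isinstance(results, dict):
    print("Top-level keys:", sorted(results.keys()))
else:
    print("Number of entries:", len(results))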

Results

We summarize the results of testing our model on a set of 100 unseen, new environments. Our model's overall task success describes the percentage of cases in which the cup was first lifted and then successfully poured into the correct bowl. This sequence of steps was successfully executed in 84% of the new environments. Picking alone achieves a 98% success rate, while pouring achieves 85%. The detection rate indicates the success rate of the semantic model in identifying the correct objects. Content-In-Bowl outlines the percentage of material that was delivered to the correct bowl during the pouring action. Finally, we report the mean absolute error of the robot's joint configuration. These results indicate that the model appropriately generalizes the trained behavior to changes in object position, verbal command, or perceptual input. In addition, we compared the model's performance to a simple RNN approach and a recent state-of-the-art baseline ("Pay Attention! - Robustifying a Deep Visuomotor Policy Through Task-Focused Visual Attention", Abolghasemi et al.):

Model          Picking   Pouring   Sequential   Detection   Content-In-Bowl   MAE (Joints, Radians)
Simple RNN     58%       0%        0%           52%         7%                0.30
PayAttention!  23%       8%        0%           66%         41%               0.13
Ours           98%       85%       84%          94%         94%               0.05

Further results can be found on our Additional Results page.

An execution of our model in a specific environment is shown below. First, the language command Raise the green cup and an image of the current environment are given to the model. This allows the robot to identify the target object in the current environment, as well as the desired action. After the cup has been picked up, a second command Fill all of it into the small red bowl is issued and processed in the same environment. In addition to identifying the target bowl and action (the what and where), the robot also identifies a quantity modifier, which describes how the robot should execute the task. In this case, all of the cup's content is poured into the target bowl.

More examples can be found on the Additional Examples page.

Contributing

If you would like to contribute or have any suggestions, feel free to open an issue on this GitHub repository or contact the first author of this work!

All contributions welcome! All content in this repository is licensed under the MIT license.

Changelog

The following additions were made:

  • May 2023
    • Updated the code to work with recent versions of various dependencies
      • Updated KDL version
      • Updated PyRep version
      • Updated to Python 3.10
      • Updated to Ubuntu 22.04
      • Updated scene file to work with newer versions of CoppeliaSim
    • Added a link to the full dataset used for training
    • Removed the Docker version of this repo as it is very outdated
    • Removed the dependency on ROS and replaced it with gRPC as it is much more lightweight
    • Provided further details for training the model
  • November 2020
    • Initial release
