In *Affordance Learning for End-to-End Visuomotor Robot Control*, we introduced a modular deep neural network architecture that detects a container on a table and inserts a ball into it.
We showed that our system performs its task successfully in a zero-shot sim-to-real transfer setting. Each part of the system was trained entirely on synthetic data or in simulation. The system was invariant to, e.g., distractor objects and textures.
We have divided our work into the following components:
- AffordanceVAED extracts affordance information from an observation image, and represents it as a latent space vector. Figure 1 shows the structure of the model.
- BlenderDomainRandomizer generates a domain randomized dataset for VAED.
- TrajectoryVAE represents trajectories in a low-dimensional latent space, and generates a trajectory based on a given latent vector.
- affordance_gym generates training data for TrajectoryVAE, and combines VAED and TrajectoryVAE together to perform desired trajectories based on an observation.
We used Blender, an open-source 3D computer graphics software, to generate a domain randomized dataset for affordance detection. Figure below shows examples that are generated by our method. The environment was designed for a robotic learning task, where the task of a robot arm was to insert a ball into a cup that is located on a table.
This project was built with Blender 2.79 (Blender Download).
To install other dependencies, run `pip install -r requirements.txt`.
`env.blend` contains all the environment meshes (walls, table, lights, clutter, and cup).
The texture and bump map node trees are already built in `env.blend`.
To examine the Blender environment (`env.blend`), run `blender env.blend`.
To generate affordance samples, run `blender -b env.blend --python src/main.py`.
To see parameter options, run `blender -b env.blend --python src/main.py -- --tips`.
Include option variables after `--`.
We used Blender's Python API to generate a domain randomized dataset for affordance detection. The Blender Render engine was used for rendering. Each sample includes a rendered RGB-D image and a corresponding affordance image of the mug. Clutter objects are located on the table, and their affordances are ignored.
The following features were randomized:
- Position of the objects on the table, and the number of lights and their positions (table_setting.py)
- Textures and materials of the environment objects, and bump map of the walls (texture_random.py)
- The number of clutter objects on the table and their sizes (random_objects.py)
- Shape of the mug (design.py)
- Position and orientation of the camera (camera_position.py)
Positions, scales, and rotations of the objects were uniformly sampled. In total, 66 clutter objects were used. A simple cylinder model was built in Blender to represent a mug. The cylinder was divided into an inner and an outer part: the outer part has a wrap-grasp affordance and the inner part has a contain affordance. The mug's shape was randomly generated by smoothly varying its diameter along its height.
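The uniform sampling and the smoothly varying mug profile can be sketched as follows. This is a minimal illustration, not the repository's code: the function names, ranges, and the choice of five interpolated control points are our own assumptions.

```python
import numpy as np

def random_table_position(rng, x_range=(-0.5, 0.5), y_range=(-0.5, 0.5)):
    """Uniformly sample an (x, y) position on the table surface.

    The ranges are placeholders; the real limits come from the table mesh.
    """
    return float(rng.uniform(*x_range)), float(rng.uniform(*y_range))

def random_mug_profile(rng, n_rings=16, base_radius=1.0, max_dev=0.3):
    """Sample a radius that varies smoothly along the mug's height.

    Radii at a few control heights are drawn uniformly and linearly
    interpolated, so the diameter changes smoothly from bottom to top.
    """
    control_h = np.linspace(0.0, 1.0, 5)
    control_r = base_radius + rng.uniform(-max_dev, max_dev, size=5)
    heights = np.linspace(0.0, 1.0, n_rings)
    radii = np.interp(heights, control_h, control_r)
    return heights, radii
```

In Blender, the interpolated radii would then be applied to the cylinder's vertex rings through the Python API.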
Blender's Node Editor provides a structural method to combine multiple texture transformation operations together. We utilized it to randomize the textures and the bump maps of the scene.
Figures below show the node structure of the texture randomizer:
Figure 2: Texture randomizer. Two of Blender's texture nodes, Checker and Distorted Noise, were merged. The initial patterns of Checker and Distorted Noise were modified with size and distortion values. The two colors of each texture node were given by the ColorRamp nodes, whose Fac values determine the produced colors. Scaling, Translate, Rotate, and Darken operations modified the merged texture. The transformation and color values of the nodes were modified through Blender's Python API for each sample. In addition, the Diffuse reflection, Translucency, and Emit values of the objects were randomized.
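The per-sample values that drive this node tree could be drawn along the following lines. This is only a sketch: the parameter names, ranges, and dictionary layout are illustrative assumptions, not the repository's actual interface.

```python
import random

def sample_texture_params(rng=random):
    """Draw one set of values for a texture-randomizer node tree."""
    return {
        # Initial pattern parameters of the two texture nodes.
        "checker_size": rng.uniform(0.1, 10.0),
        "noise_distortion": rng.uniform(0.0, 5.0),
        # One RGB color pair per ColorRamp; Fac picks between them.
        "ramp_colors": [tuple(rng.random() for _ in range(3)) for _ in range(2)],
        "ramp_fac": rng.uniform(0.0, 1.0),
        # Mapping-node transforms applied to the merged texture.
        "scale": tuple(rng.uniform(0.5, 2.0) for _ in range(3)),
        "rotation": tuple(rng.uniform(0.0, 6.283) for _ in range(3)),
        # Material properties randomized per object.
        "diffuse": rng.uniform(0.0, 1.0),
        "translucency": rng.uniform(0.0, 1.0),
        "emit": rng.uniform(0.0, 0.5),
    }
```

Each sampled dictionary would then be written into the corresponding node inputs through Blender's Python API before rendering.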
Figure below shows the bump map node tree structure, which was built similarly to the texture randomizer:
To obtain affordance-labeled images, the textures of the objects were switched to correspond to their affordances. The affordance textures were generated with Blender's Node Editor. A class id was determined for each object: for the clutter objects, the walls, the floor, and the table, the class id was 0; for the outer part of the mug it was 1, and for the inner part it was 2. Figure below shows the structure of the node tree.
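Turning such a flat-color affordance render into an integer label map can be sketched as below. The specific colors are assumptions for illustration; the actual values are set by the Node Editor materials in `env.blend`.

```python
import numpy as np

# Assumed flat colors used in the affordance render (illustrative only).
CLASS_COLORS = {
    0: (0, 0, 0),      # background: clutter, walls, floor, table
    1: (255, 0, 0),    # wrap-grasp: outer part of the mug
    2: (0, 255, 0),    # contain: inner part of the mug
}

def color_to_class_ids(image):
    """Map an (H, W, 3) uint8 affordance render to an (H, W) class-id array."""
    labels = np.zeros(image.shape[:2], dtype=np.uint8)
    for class_id, color in CLASS_COLORS.items():
        mask = np.all(image == np.asarray(color, dtype=np.uint8), axis=-1)
        labels[mask] = class_id
    return labels
```

The resulting label maps are what a segmentation-style affordance detector such as VAED would be trained on.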