Semcon · sf-pear · Mar 14, 2024 · Mar 18, 2024
diff --git a/examples/fiftyone-visualization/demo.ipynb b/examples/fiftyone-visualization/demo.ipynb
@@ -0,0 +1,354 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Visualizing ZOD in FiftyOne\n",
+    "\n",
+    "ZOD is a large multi-modal autonomous driving dataset collected over a 2-year period in 14 European countries, designed to support various aspects of autonomous driving research. The accompanying `zod_to_fiftyone.py` script converts ZOD into FiftyOne format, so you can use FiftyOne's tools for dataset exploration and analysis.\n",
+    "\n",
+    "This notebook provides instructions on how to use the [Zenseact Open Dataset (ZOD)](https://zod.zenseact.com) with [FiftyOne](https://voxel51.com), an open-source tool for computer vision dataset exploration and analysis.\n",
+    "\n",
+    "![](docs/zod-fiftyone.gif)\n",
+    "\n",
+    "## Requirements\n",
+    "\n",
+    "Please install the packages in `requirements.txt` before getting started:\n",
+    "```\n",
+    "zod==0.3.6 \n",
+    "fiftyone==0.23.2\n",
+    "open3d==0.17.0\n",
+    "pyyaml==6.0.1 \n",
+    "tqdm==4.66.1\n",
+    "```\n",
+    "\n",
+    "Once the required packages are installed, we will check for any existing datasets."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import fiftyone as fo\n",
+    "\n",
+    "fo.list_datasets()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If this list is empty, start by importing ZOD into FiftyOne by creating a dataset.\n",
+    "\n",
+    "## Create a new dataset\n",
+    "\n",
+    "With the requirements already installed, we have to:\n",
+    "\n",
+    "### 1. Update [`config.yaml`](src/config.yaml)\n",
+    "\n",
+    "Specify the necessary configurations for your dataset. Here's a breakdown of the configuration options:\n",
+    "\n",
+    "- `dataset_root`: Point this to the root directory of the ZOD dataset. If you haven't downloaded it yet, see the download options [here](https://zod.zenseact.com/download/).\n",
+    "- `pcd_files_dir`: Directory where converted PCD files will be stored. If it does not already exist, it will be created.\n",
+    "- `dataset_version`: Specify whether you're using the \"mini\" or \"full\" version of the dataset.\n",
+    "- `dataset_split`: Choose the dataset split (\"train\", \"val\", or \"all\"). For the \"mini\" version, please use \"all\".\n",
+    "- `dataset_name`: The name you want to give to your dataset.\n",
+    "- `dataset_persistent`: Set to True if you want to save the dataset locally for later use (recommended).\n",
+    "- `test_run`: Set to True for a test run on the first 10 samples.\n",
+    "- `mapbox_token`: Optional. Provide a [Mapbox](https://www.mapbox.com) API token for map functionality in FiftyOne (you can create one for free [here](https://account.mapbox.com/access-tokens/create)).\n",
+    "\n",
+    "### 2. Run [`zod_to_fiftyone.py`](zod_to_fiftyone.py)\n",
+    "\n",
+    "Please note that due to a FiftyOne limitation, we need to create `.pcd` files to visualize point clouds in the app. This conversion process will take a while due to the size of the dataset."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Load the dataset\n",
+    "\n",
+    "Once you have created a dataset, choose one from the list and load it by name."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import fiftyone as fo\n",
+    "\n",
+    "fo.list_datasets()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dataset = fo.load_dataset('zod_val')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dataset"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Launch the app\n",
+    "\n",
+    "You can use the app inside a notebook or launch it in your browser. Learn more [here](https://docs.voxel51.com/user_guide/app.html)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "session = fo.launch_app(dataset)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Add new field to each sample\n",
+    "\n",
+    "This will take a few minutes to run depending on the size of the dataset."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "for sample in dataset.iter_samples(progress=True):\n",
+    "    sample[\"split\"] = \"Validation\"\n",
+    "    sample.save()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dataset.media_type"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If the media type above is `group`, the dataset is a grouped dataset. You can read more about grouped datasets in the FiftyOne documentation [here](https://docs.voxel51.com/user_guide/groups.html)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dataset.stats(include_media=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Get random sample\n",
+    "\n",
+    "This is useful to see the structure of your samples."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "sample = dataset.take(1).first()\n",
+    "sample"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# lists the names of the fields in the sample\n",
+    "sample.field_names"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Get coordinates for random samples"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dataset.take(5).values(\"location.point\", _raw=True)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Filtering\n",
+    "\n",
+    "You can easily filter the samples in your dataset in the app GUI, but you can also do it programmatically. Here are some examples.\n",
+    "\n",
+    "### by label"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from fiftyone import ViewField as F"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "just_animals = dataset.filter_labels(\n",
+    "    \"detections\",\n",
+    "    F(\"label\").contains_str(\"Animal\"),\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You can also count how many samples exist in the filtered view we just created. You can read more about views in FiftyOne [here](https://docs.voxel51.com/user_guide/using_views.html)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# samples in the filtered view with a non-None `detections` field (count `detections.detections` for individual labels)\n",
+    "just_animals.count(\"detections\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### by file path"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dataset.match(F(\"filepath\").contains_str(\"india\"))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Saving a view"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# will save a view with all samples that contain an animal\n",
+    "dataset.save_view(\"detections_animals\", just_animals)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Export view as KITTI dataset\n",
+    "\n",
+    "You can export data in many common formats (YOLO, KITTI, COCO, etc.) or build [custom exporters](https://docs.voxel51.com/user_guide/export_datasets.html#custom-formats). Here is an example of exporting a view as a KITTI object detection dataset."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "\n",
+    "export_dir = \"./kitti\"\n",
+    "label_field = \"detections\"\n",
+    "\n",
+    "os.makedirs(export_dir, exist_ok=True)\n",
+    "\n",
+    "just_animals.export(\n",
+    "    export_dir=export_dir,\n",
+    "    dataset_type=fo.types.KITTIDetectionDataset,\n",
+    "    label_field=label_field,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This is just a preview of how FiftyOne can be used to interact with ZOD programmatically. For more information, please refer to the FiftyOne [documentation](https://docs.voxel51.com)."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "zod",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.13"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
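
The notebook above notes that point clouds have to be converted to `.pcd` files before FiftyOne can display them. How `zod_to_fiftyone.py` performs that conversion is not shown in this diff, so the following is only a minimal sketch of what the target file format looks like, assuming plain xyz points and the ASCII variant of PCD v0.7. The function name `write_ascii_pcd` and the sample points are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Tuple

Point = Tuple[float, float, float]


def write_ascii_pcd(path: Path, points: Iterable[Point]) -> int:
    """Write xyz points as an ASCII PCD v0.7 file and return the point count."""
    rows = [f"{x:.6f} {y:.6f} {z:.6f}" for x, y, z in points]
    header = [
        "# .PCD v0.7 - Point Cloud Data file format",
        "VERSION 0.7",
        "FIELDS x y z",
        "SIZE 4 4 4",  # each field is a 4-byte value
        "TYPE F F F",  # each field is a float
        "COUNT 1 1 1",
        f"WIDTH {len(rows)}",
        "HEIGHT 1",  # unorganized cloud: a single row of points
        "VIEWPOINT 0 0 0 1 0 0 0",
        f"POINTS {len(rows)}",
        "DATA ascii",
    ]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(header + rows) + "\n")
    return len(rows)


# Hypothetical sample data, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    out_file = Path(tmp) / "pcd_files" / "sample.pcd"
    n_points = write_ascii_pcd(out_file, [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)])
    pcd_lines = out_file.read_text().splitlines()

print(n_points, pcd_lines[9])
```

For a dataset of this size a binary encoding would be considerably smaller and faster to write; the ASCII form is used here only because it is easy to inspect.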
diff --git a/examples/fiftyone-visualization/docs/zod-fiftyone.gif b/examples/fiftyone-visualization/docs/zod-fiftyone.gif
diff --git a/examples/fiftyone-visualization/requirements.txt b/examples/fiftyone-visualization/requirements.txt
@@ -0,0 +1,5 @@
+zod==0.3.6 
+fiftyone==0.23.2
+open3d==0.17.0
+pyyaml==6.0.1 
+tqdm==4.66.1
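
Because the requirements are pinned to exact versions, it can help to confirm that the active environment matches them before running the notebook. The sketch below uses only the standard library; the helper names `parse_pins` and `find_mismatches` are hypothetical and are not part of this PR.

```python
from importlib import metadata


def parse_pins(requirements_text: str) -> dict[str, str]:
    """Map package name -> pinned version for every `name==version` line."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.strip()  # tolerate trailing whitespace
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins: dict[str, str]) -> dict[str, str]:
    """Return {package: description} for packages that are missing or differ from the pin."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            mismatches[name] = f"not installed (wanted {wanted})"
            continue
        if installed != wanted:
            mismatches[name] = f"installed {installed}, wanted {wanted}"
    return mismatches


# Contents of requirements.txt from this PR.
REQUIREMENTS = """\
zod==0.3.6
fiftyone==0.23.2
open3d==0.17.0
pyyaml==6.0.1
tqdm==4.66.1
"""

pins = parse_pins(REQUIREMENTS)
print(find_mismatches(pins))  # an empty dict means every pin is satisfied
```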
diff --git a/examples/fiftyone-visualization/src/config.yaml b/examples/fiftyone-visualization/src/config.yaml
@@ -0,0 +1,8 @@
+dataset_root: "/mnt/zod" # point this to the root directory of the zod dataset you are using
+pcd_files_dir: "/mnt/zod/pcd_files" # directory where converted pcd files will be stored
+dataset_version: "full"  # "mini" or "full"
+dataset_split: "val" # "train", "val", or "all"
+dataset_name: "zod_val" # you can change this to anything you would like
+dataset_persistent: True # if true, dataset will be saved locally for later use
+test_run: True # if true, it will run for only the first 10 samples
+mapbox_token: "" # optional, provide a Mapbox API token for map functionality in FiftyOne.
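
The options above have a few constraints that are easy to get wrong, such as the "mini" version requiring the "all" split. The following is a minimal sketch of a sanity check that could run before the conversion starts. It assumes the config has already been loaded into a dict (for example with `yaml.safe_load`), and `validate_config` is a hypothetical helper, not something the conversion script is known to contain.

```python
from typing import Any

# Allowed values, taken from the comments in config.yaml.
VERSIONS = {"mini", "full"}
SPLITS = {"train", "val", "all"}
REQUIRED_KEYS = (
    "dataset_root",
    "pcd_files_dir",
    "dataset_version",
    "dataset_split",
    "dataset_name",
    "dataset_persistent",
    "test_run",
)  # mapbox_token is optional


def validate_config(cfg: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in cfg]
    version = cfg.get("dataset_version")
    split = cfg.get("dataset_split")
    if version is not None and version not in VERSIONS:
        problems.append(f"dataset_version must be one of {sorted(VERSIONS)}")
    if split is not None and split not in SPLITS:
        problems.append(f"dataset_split must be one of {sorted(SPLITS)}")
    if version == "mini" and split in SPLITS and split != "all":
        problems.append('the "mini" version should use dataset_split "all"')
    for key in ("dataset_persistent", "test_run"):
        if key in cfg and not isinstance(cfg[key], bool):
            problems.append(f"{key} must be a boolean")
    return problems


# The values from the config.yaml in this PR.
example_config = {
    "dataset_root": "/mnt/zod",
    "pcd_files_dir": "/mnt/zod/pcd_files",
    "dataset_version": "full",
    "dataset_split": "val",
    "dataset_name": "zod_val",
    "dataset_persistent": True,
    "test_run": True,
    "mapbox_token": "",
}

print(validate_config(example_config))
```

Returning a list of problems instead of raising on the first one lets the user fix every issue in a single pass before starting the lengthy conversion.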