A JSON-based format for working with machine learning data, with a focus on data interoperability.


About

Droplet is a JSON-based format for serializing and operating on machine learning annotations. The primary goal of the format is to provide a structured way to specify an annotation to facilitate data interoperability and code reuse.

Though developed by Zensors for dataTap, Droplet is an open format and free for anyone to use.

Why Droplet?

Right now, every dataset and machine learning project uses its own data format. This drastically slows the advancement of the field, since more and more time is spent converting data between formats rather than on the algorithmic improvements that drive real progress. Moreover, in today's fragmented landscape of machine learning data formats, it is impossible to bring together data from multiple sources without first porting it all to a common representation. Further, machine learning tools need to either be aware of all of these formats, or be rewritten from the ground up every time a new dataset emerges.

Droplet is a standard format that solves this problem. By normalizing to a single format, we can:

  • Pull in data from multiple sources – since every dataset is in the same format, you can mix and match annotations from different data sources with no effort (see the sketch after this list).

  • Have a standardized set of tools – code written for one project will work for every project.

  • Spend more time focusing on algorithms – no more time needs to be spent on writing yet another set of one-off conversion scripts.
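
As a sketch of the first point, the snippet below converts a single COCO-style image record and its annotations into a droplet-shaped Python dict, so it can be mixed freely with droplets from other sources. It is illustrative only: the droplet field names are taken from the example in the next section, the COCO side assumes the usual bbox = [x, y, width, height] pixel convention, and coco_to_droplet is a hypothetical helper, not part of any official bindings.

def coco_to_droplet(image, annotations, category_names):
    """Hypothetical helper: build a droplet-shaped dict from COCO-style data.

    image          - a COCO image record with "coco_url", "width", "height"
    annotations    - the COCO annotation records for that image
    category_names - mapping from category_id to a class name
    """
    width, height = image["width"], image["height"]
    classes = {}
    for ann in annotations:
        name = category_names[ann["category_id"]]
        x, y, bw, bh = ann["bbox"]  # COCO boxes are [x, y, width, height] in pixels
        instance = {
            "boundingBox": {
                # Normalized [[x1, y1], [x2, y2]] corners, matching the example below.
                "rectangle": [
                    [x / width, y / height],
                    [(x + bw) / width, (y + bh) / height],
                ]
            }
        }
        classes.setdefault(name, {"instances": []})["instances"].append(instance)
    return {
        "kind": "ImageAnnotation",
        "image": {"paths": [image["coco_url"]]},
        "classes": classes,
    }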

Example

Droplet preview image
{
  "kind": "ImageAnnotation",
  "image": {
    "paths": [
      "http://images.cocodataset.org/train2017/000000541901.jpg",
      "http://farm6.staticflickr.com/5310/5619616662_c1e5b34bd3_z.jpg"
    ]
  },
  "classes": {
    "car": {
      "instances": [
        {
          "boundingBox": {
            "rectangle": [[0.1778,0.0741],[0.4367,0.1315]]
          },
          "segmentation": {
            "mask": [
              [
                [0.1801,0.1047],
                [0.1881,0.0868],
                [0.2129,0.0875],
                [0.2478,0.0741],
                [0.3313,0.0823],
                [0.3751,0.1062],
                [0.4129,0.1129],
                [0.4307,0.1241],
                [0.4367,0.1315],
                [0.3989,0.1286],
                [0.1778,0.1069],
                [0.1856,0.1023]
              ]
            ]
          }
        },
        {
          "boundingBox": {
            "rectangle": [[0.8004,0.1322],[0.9991,0.1779]]
          },
          "segmentation": {
            "mask": [
              [
                [0.8187,0.1648],
                [0.9991,0.1779],
                [0.9967,0.1569],
                [0.9613,0.1386],
                [0.8662,0.1322],
                [0.8382,0.1423],
                [0.8040,0.1395],
                [0.8004,0.1569]
              ]
            ]
          }
        },
        {
          "boundingBox": {
            "rectangle": [[0.4938,0.0892],[0.7335,0.1539]]
          },
          "segmentation": {
            "mask": [
              [
                [0.5872,0.1420],
                [0.4938,0.1324],
                [0.4995,0.1087],
                [0.5168,0.0892],
                [0.5571,0.0926],
                [0.5979,0.0959]
              ],
              [
                [0.6120,0.0985],
                [0.6391,0.1011],
                [0.6463,0.1063],
                [0.6671,0.1226],
                [0.6873,0.1268],
                [0.6844,0.1493],
                [0.6053,0.1437]
              ],
              [
                [0.6957,0.1300],
                [0.7140,0.1333],
                [0.7291,0.1374],
                [0.7335,0.1400],
                [0.7311,0.1539],
                [0.6942,0.1509]
              ]
            ]
          }
        },
        {
          "boundingBox": {
            "rectangle": [[0.0000,0.0598],[0.0648,0.0936]]
          },
          "segmentation": {
            "mask": [
              [
                [0.0000,0.0598],
                [0.0235,0.0748],
                [0.0611,0.0832],
                [0.0648,0.0926],
                [0.0623,0.0936],
                [0.0348,0.0926],
                [0.0035,0.0917],
                [0.0000,0.0607]
              ]
            ]
          }
        }
      ]
    },
    "person": {
      "instances": [
        {
          "boundingBox": {
            "rectangle": [[0.1411,0.3198],[0.7087,0.9617]]
          },
          "segmentation": {
            "mask": [
              [
                [0.4444,0.3198],
                [0.4835,0.3221],
                [0.5195,0.3311],
                [0.5405,0.3401],
                [0.5495,0.3649],
                [0.5495,0.4009],
                [0.5465,0.4234],
                [0.5676,0.4392],
                [0.5826,0.4527],
                [0.5405,0.4595],
                [0.5285,0.4640],
                [0.5195,0.4887],
                [0.5495,0.5315],
                [0.5766,0.5946],
                [0.6486,0.6734],
                [0.6787,0.6779],
                [0.7087,0.6847],
                [0.6937,0.7005],
                [0.6937,0.7433],
                [0.6637,0.7613],
                [0.6066,0.7838],
                [0.5676,0.7883],
                [0.5556,0.7815],
                [0.4414,0.6554],
                [0.3393,0.6599],
                [0.3784,0.7050],
                [0.4024,0.7523],
                [0.4324,0.8063],
                [0.4835,0.8649],
                [0.4955,0.9077],
                [0.5345,0.9234],
                [0.5285,0.9505],
                [0.5135,0.9617],
                [0.4745,0.9617],
                [0.4054,0.9414],
                [0.3484,0.8739],
                [0.2763,0.7387],
                [0.1832,0.6802],
                [0.1411,0.6441],
                [0.1592,0.5495],
                [0.2222,0.4572],
                [0.3153,0.4009],
                [0.3514,0.3851],
                [0.3874,0.3716]
              ]
            ]
          }
        },
        {
          "boundingBox": {
            "rectangle": [[0.4468,0.0199],[0.4716,0.0353]]
          },
          "segmentation": {
            "mask": [
              [
                [0.4468,0.0305],
                [0.4501,0.0265],
                [0.4526,0.0219],
                [0.4549,0.0199],
                [0.4601,0.0199],
                [0.4641,0.0216],
                [0.4660,0.0240],
                [0.4687,0.0282],
                [0.4712,0.0318],
                [0.4716,0.0341],
                [0.4658,0.0353],
                [0.4528,0.0344],
                [0.4489,0.0344],
                [0.4476,0.0327],
                [0.4506,0.0318],
                [0.4487,0.0307]
              ]
            ]
          }
        }
      ]
    }
  }
}

Note: this image and annotation are reproduced from the COCO dataset.
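
To make the structure concrete, here is a minimal sketch that reads the droplet above with nothing but Python's standard json module (no official bindings) and prints each instance's class and bounding box. The field names are taken directly from the example; note that the coordinates in it are normalized to the [0, 1] range, which is an observation about this particular example rather than a statement of the full specification.

import json

# Load a droplet like the one above from disk (the path is illustrative).
with open("example.droplet.json") as f:
    droplet = json.load(f)

assert droplet["kind"] == "ImageAnnotation"

for class_name, class_annotation in droplet["classes"].items():
    for index, instance in enumerate(class_annotation["instances"]):
        # Bounding boxes in the example are [[x1, y1], [x2, y2]] corner pairs.
        (x1, y1), (x2, y2) = instance["boundingBox"]["rectangle"]
        print(f"{class_name} #{index}: ({x1:.4f}, {y1:.4f}) -> ({x2:.4f}, {y2:.4f})")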

Specification

The Droplet specification covers both the format itself and some features of its usage. It is split into the following categories:

Using Droplet

In order to use Droplet, you'll probably want to use bindings for whatever language you're working in. Right now, bindings for the following languages exist:

Contributing

The current version of Droplet only has specific facilities for detection-style annotations on individual images, though other varieties of annotations (e.g., textual descriptions, relationships) and annotations on other subjects (e.g., video) will be added in the future.

Have an idea for how to improve Droplet? We're looking for community contributions to help the format succeed! Just open an issue with your idea!