Py-Contributors/dataset-convertor

👉 Convert object detection dataset format 👈

Dataset types

PASCAL VOC: a PASCAL VOC dataset has one XML annotation file per image.

YOLO: a YOLO dataset has one TXT label file per image.

COCO: a COCO dataset stores the annotations for all images in a single JSON file (typically one file per split), not one file per image.

Currently supported formats

Currently, the following formats are supported:

| From | To | Implemented |
|---|---|---|
| PASCAL VOC | YOLO (TXT files) | Yes |
| YOLO | PASCAL VOC (XML files) | Yes |

Upcoming supported formats

| From | To | Issue/PR (if any) |
|---|---|---|
| PASCAL VOC | COCO (JSON files) | No |
| PASCAL VOC | TFRecord (TFRecord files) | No |
| COCO | PASCAL VOC (XML files) | No |
| COCO | YOLO (TXT files) | No |
| COCO | TFRecord (TFRecord files) | No |
| YOLO | COCO (JSON files) | No |
| YOLO | TFRecord (TFRecord files) | No |

Installation

Installation from source code

git clone https://github.com/codePerfectPlus/dataset-convertor/
cd dataset-convertor
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Installation from PyPI

pip install dataset-convertor

Usage

Convert annotations from one format to another.

Expected dataset layout:

- data/pascal_voc/JPEGImages/*.jpg
- data/pascal_voc/Annotations/*.xml

- data/yolo5/JPEGImages/*.jpg
- data/yolo5/labels/*.txt
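
Before converting, it can help to confirm that every image has a matching annotation file. The snippet below is a minimal sketch, assuming the layout shown above; `find_unpaired_images` is a hypothetical helper and is not part of dataset-convertor.

```python
from pathlib import Path


def find_unpaired_images(root, image_dir="JPEGImages",
                         label_dir="Annotations", label_ext=".xml"):
    """Return the names of .jpg images that have no annotation file.

    Hypothetical helper (not part of dataset-convertor). Images and
    annotations are matched by file stem, e.g. 0001.jpg <-> 0001.xml.
    """
    root = Path(root)
    # Collect the stems of all annotation files, e.g. {"0001", "0002"}.
    label_stems = {p.stem for p in (root / label_dir).glob(f"*{label_ext}")}
    # Keep only the images whose stem has no matching annotation.
    return sorted(
        p.name
        for p in (root / image_dir).glob("*.jpg")
        if p.stem not in label_stems
    )
```

For the YOLO layout above, the same check would be called as `find_unpaired_images("data/yolo5", label_dir="labels", label_ext=".txt")`.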

PASCAL VOC (XML) to YOLO (TXT)

from convert import Convertor

con = Convertor(input_folder='/home/user/data/pascal_voc', output_folder='/home/user/data/yolo5')
con.voc2yolo()
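
To illustrate what this conversion involves, here is a minimal sketch of the usual VOC-to-YOLO coordinate mapping: absolute corner coordinates become a box center and size, normalized by the image dimensions. It is not the library's actual implementation; `voc_box_to_yolo`, `voc_xml_to_yolo_lines`, and the `class_names` list are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Map absolute corner coordinates to normalized center/size values."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def voc_xml_to_yolo_lines(xml_text, class_names):
    """Turn one VOC XML document into a list of YOLO label lines.

    class_names is an assumed, user-supplied list; a class id is the
    index of the object's name in that list.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = int(size.findtext("width"))
    img_h = int(size.findtext("height"))

    lines = []
    for obj in root.iter("object"):
        class_id = class_names.index(obj.findtext("name"))
        box = obj.find("bndbox")
        coords = [float(box.findtext(tag))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        x, y, w, h = voc_box_to_yolo(*coords, img_w, img_h)
        lines.append(f"{class_id} {x:.6f} {y:.6f} {w:.6f} {h:.6f}")
    return lines
```

Each returned line has the form `class_id x_center y_center width height`, with all four box values in the range 0 to 1.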

YOLO (TXT) to PASCAL VOC (XML)

from convert import Convertor
con = Convertor(input_folder='/home/user/data/yolo5', output_folder='/home/user/data/pascal_voc')
con.yolo2voc()
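
The reverse direction needs the image size, because YOLO label files store only normalized values. The following is a minimal sketch of that mapping, not the library's actual code; `yolo_box_to_voc` is a hypothetical helper, and the image width and height are assumed to be known (for example, read from the image file).

```python
def yolo_box_to_voc(x_center, y_center, width, height, img_w, img_h):
    """Map normalized center/size values back to absolute pixel corners."""
    xmin = (x_center - width / 2.0) * img_w
    ymin = (y_center - height / 2.0) * img_h
    xmax = (x_center + width / 2.0) * img_w
    ymax = (y_center + height / 2.0) * img_h
    # VOC annotations store integer pixel coordinates, so round the result.
    return tuple(int(round(v)) for v in (xmin, ymin, xmax, ymax))
```

Because of the rounding step, a box converted to YOLO and back may differ from the original by a pixel.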

Contributing

Create an issue or PR if a format is missing. Open-source contributions are welcome. Check the contributing guide for details.

Reference

License

Authors