Add possibility to merge several classes to dataset scripts #156

Open
wants to merge 2 commits into
base: master

Conversation

Contributor

@joaqo joaqo commented Feb 15, 2018

Adds the ability to merge classes with the --merge-classes option. This is very useful in combination with --only-classes to create datasets for detecting a certain type of object: for example, a dataset for detecting four-wheeled vehicles that merges car, bus and truck from the COCO dataset.

Could also be useful without the --only-classes option to train the network to behave as a sort of more discriminative RPN.
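
A minimal sketch of the filter-then-merge behavior described above, assuming annotations are plain (image, class name) pairs. The function name `filter_and_merge` and the data layout are hypothetical and are not taken from this PR's code:

```python
def filter_and_merge(annotations, only_classes=None, merge_classes=False):
    """Return (labeled_annotations, num_classes).

    `annotations` is a list of (image_id, class_name) pairs. A real dataset
    reader would also carry bounding boxes; they are omitted here.
    """
    if only_classes:
        # Whitelist: drop every object whose class was not requested.
        annotations = [a for a in annotations if a[1] in only_classes]

    if merge_classes:
        # Every remaining object collapses into a single class with label 0.
        labeled = [(image_id, 0) for image_id, _ in annotations]
        return labeled, 1

    # Without merging, each distinct class name keeps its own index.
    names = sorted({name for _, name in annotations})
    index = {name: i for i, name in enumerate(names)}
    labeled = [(image_id, index[name]) for image_id, name in annotations]
    return labeled, len(names)


annotations = [
    ('a.jpg', 'car'),
    ('a.jpg', 'person'),
    ('b.jpg', 'bus'),
    ('c.jpg', 'truck'),
]
merged, num_classes = filter_and_merge(
    annotations, only_classes={'car', 'bus', 'truck'}, merge_classes=True
)
```

With both options set, the `person` object is dropped and the three vehicle objects all end up with label 0.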

Contributor

@vierja vierja left a comment

I think this should be implemented in all the remaining readers (openimages, csv, etc).

Maybe replace self.classes with a dict whose values are all 0 when merging? Or just use a new parent method?
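
One way the two suggestions could fit together, as a rough sketch only. `BaseReaderSketch`, `_build_class_map`, `label_for` and `total_classes` are hypothetical names chosen for illustration and do not reflect the project's actual reader classes:

```python
class BaseReaderSketch:
    """Hypothetical parent class shared by all dataset readers."""

    def __init__(self, class_names, merge_classes=False):
        self.merge_classes = merge_classes
        # Dict from class name to label index instead of a plain list.
        self.classes = self._build_class_map(class_names)

    def _build_class_map(self, class_names):
        if self.merge_classes:
            # When merging, every class name maps to the same label, 0.
            return {name: 0 for name in class_names}
        return {name: idx for idx, name in enumerate(class_names)}

    def label_for(self, class_name):
        return self.classes[class_name]

    @property
    def total_classes(self):
        # Number of distinct labels, which is 1 when classes are merged.
        return len(set(self.classes.values()))


class CsvReaderSketch(BaseReaderSketch):
    """A concrete reader inherits the merge logic without repeating it."""


reader = CsvReaderSketch(['car', 'bus', 'truck'], merge_classes=True)
```

Keeping the mapping in the parent class means each reader (COCO, Open Images, CSV, and so on) would get the merge behavior for free.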

Now, when there are merged classes, the class.json file, which is used for
labeling predicted videos/pics, will be empty. This fixes a problem in
which all predictions generated from a dataset that had merged classes
would be labeled with the label of the first class.

E.g.: if you generated a dataset based on COCO that filtered out all
classes except car, bus, and truck, and that also merged these classes
into a single class, the predictions generated by predict.py or the web
server would be labeled as car, when in fact they could've been a truck
or a bus too.

The idea behind completely removing the label instead of picking a new
one is that a model trained to predict a single type of object would
convey this information globally instead of on a per-object basis, for
example in the name of the .jpg or .mpg file it creates, or something
similar.

Still, there are probably use cases in which it would be useful to let
the user pick the label. This could be added in the future after
choosing a good console argument name for this parameter, and seeing if
we could somehow merge it with the --merge argument while also
maintaining the ability to have the label be empty.
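
The mislabeling described above can be illustrated with a small sketch. The lookup function `label_for_prediction` and the order of the class names are assumptions made for the example, not the project's actual prediction code:

```python
def label_for_prediction(class_index, classes):
    """Look up the display name for a predicted class index."""
    return classes[class_index]


# After merging, every prediction carries class index 0.
merged_prediction_index = 0

# Before the fix: the classes file still lists the original names, so
# index 0 resolves to the first one, even if the object is a bus or truck.
old_label = label_for_prediction(merged_prediction_index,
                                 ['car', 'bus', 'truck'])

# After the fix: the classes file holds a single empty name, so the
# prediction is drawn without a misleading label.
new_label = label_for_prediction(merged_prediction_index, [''])
```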
@@ -67,16 +68,18 @@ def transform(dataset_reader, data_dir, output_dir, splits, only_classes,
# All splits must have a consistent set of classes.
classes = None

merge_classes = merge_classes in ('True', 'true', 'TRUE')
Contributor

Use click types instead.
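
A possible reading of this suggestion, sketched with a boolean flag so that click does the parsing. `transform_sketch` is a hypothetical stand-in for the real command, and choosing `is_flag=True` is an assumption; `type=click.BOOL` would instead keep an explicit value such as `--merge-classes true`:

```python
import click


@click.command()
@click.option('--only-classes', multiple=True, help='Whitelist of classes.')
@click.option('--merge-classes', is_flag=True,
              help='Merge all classes into a single class.')
def transform_sketch(only_classes, merge_classes):
    """Hypothetical stand-in for the real transform command."""
    # merge_classes already arrives as a bool, so no comparison against
    # ('True', 'true', 'TRUE') is needed inside the function.
    click.echo('merge_classes={!r} only_classes={!r}'.format(
        merge_classes, list(only_classes)))
```

Invoked as `--merge-classes --only-classes car --only-classes bus`, the function receives `merge_classes=True` and `only_classes=('car', 'bus')`.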

@@ -30,16 +30,17 @@ def get_output_subfolder(only_classes, only_images, limit_examples,
@click.option('--data-dir', help='Where to locate the original data.')
@click.option('--output-dir', help='Where to save the transformed data.')
@click.option('splits', '--split', required=True, multiple=True, help='Which splits to transform.') # noqa
@click.option('--only-classes', help='Whitelist of classes.')
@click.option('--only-classes', multiple=True, help='Whitelist of classes.')
@click.option('--merge-classes', help='Merge all classes into a single class')
Contributor

Can't we name it --single-class or --merge-all, to make it more explicit that only a single class is created?

json.dump(self._reader.classes, tf.gfile.GFile(classes_file, 'w'))
if self._reader.merge_classes:
# Don't assign a name to the class if its a merge of several others
json.dump([''], tf.gfile.GFile(classes_file, 'w'))
Contributor

Before merging the pull request, add an option to set the class name.
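
A rough sketch of what such an option could feed into. The `merged_name` parameter and both helper functions are hypothetical, and plain `open` is used here in place of the `tf.gfile.GFile` call from the diff so the example has no TensorFlow dependency:

```python
import json


def classes_to_write(classes, merge_classes=False, merged_name=''):
    """Return the list of class names that should go into the classes file."""
    if merge_classes:
        # A single class: user-chosen name, or empty to keep it unlabeled.
        return [merged_name]
    return list(classes)


def dump_classes(path, classes, merge_classes=False, merged_name=''):
    """Write the class names to `path` as a JSON list."""
    with open(path, 'w') as f:
        json.dump(classes_to_write(classes, merge_classes, merged_name), f)
```

Defaulting `merged_name` to an empty string keeps the behavior of the current diff, while a value such as `'vehicle'` would give the merged class a label.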

3 participants