Training images with zero annotations #43

Open · NiklasWilson opened this issue Nov 25, 2019 · 43 comments

Labels: enhancement (New feature or request), wontfix (This will not be worked on)

Comments

@NiklasWilson

What would be the best way to modify the code to support images with zero annotations?
When you export the annotations as CSV from VoTT, it makes a new folder with all the images you checked, but it leaves images without annotations absent from the CSV, which means the training script will simply ignore them.

I was thinking "Convert_to_YOLO_format.py" would be the best place to add in the images without annotations: make it check the folder for all the images that exist, then check whether they are referenced in the CSV, and if not, add a row without bounding boxes.

Do you think this would be the best way, or will the training script fail if an image with no bounding boxes is provided in the dataset?
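
For illustration, a minimal sketch of that check (the "image" column name and the file locations are assumptions about the VoTT CSV export, not the project's actual code):

    import os
    import pandas as pd

    # Find exported images that VoTT left out of the CSV and append
    # them as rows without bounding boxes.
    df = pd.read_csv("Annotations-export.csv")
    annotated = set(df["image"])
    exported = {f for f in os.listdir("vott-csv-export")
                if f.lower().endswith((".jpg", ".jpeg", ".png"))}
    missing = sorted(exported - annotated)
    # Rows carrying only a file name; box and label columns stay empty.
    df = pd.concat([df, pd.DataFrame({"image": missing})], ignore_index=True)
    df.to_csv("Annotations-export.csv", index=False)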

@NiklasWilson changed the title "Traing images with zero annotations" → "Training images with zero annotations" Nov 25, 2019
@AntonMu (Owner) commented Nov 25, 2019

Hi Niklas - I think we would need to make sure that they are explicitly labeled as not containing the object. Because what will happen more often than not is that there are 1000s of images in the image folder, and after labeling a few hundred, someone might stop labeling. Then, the way you describe it, the program would automatically assume that all the remaining images don't contain the object.

What about making an extra folder somewhere called "No_Object" in Training_Images, and then simply adding all files in there to data_train.txt the way you described?

@NiklasWilson (Author)

Adding a "No_object" folder would be good however by default VoTT puts the annotated csv as well the images that have been looked at into the csv_export folder. (By default it excludes images that have yet to be checked for tags)

It could be a lot of extra labor to manually move images to a "no_object" folder as I wont know they have no object till after I look at them in VoTT. I am thinking a flag could be added to the script that defaults to false like --check_for_no_annotations or something similiar/shorter.

Adding a no object folder as well that is checked by default would be cool though.

@NiklasWilson (Author)

OK, the folder path will actually work for both of us. I can just pass in the folder where the annotations are. PR coming up with the change; by default it will now check for a no_object folder.

@AntonMu (Owner) commented Nov 25, 2019 via email

@NiklasWilson (Author)

No problem. Check that out; obviously make sure it doesn't cause you any issues :)

It looks like it includes the last PR? I'm not sure how to update a fork to resolve this; I've actually never done a PR across a fork before.

@forceassaulter

@NiklasWilson How do you do image training with zero annotations? I just want the machine to classify whether the object is present in the image or not; a bounding box is unnecessary.

@AntonMu (Owner) commented Dec 5, 2019

Hi @forceassaulter - there is some confusion here. @NiklasWilson is talking about improving the object detector by adding additional images without annotations. What you are looking for is image classification, which is a different algorithm. Check out Inception v3, for instance.
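
For instance, a minimal classification sketch with Keras' pretrained InceptionV3 (ImageNet weights; "example.jpg" is a placeholder, and for a custom object you would fine-tune on your own labels):

    import numpy as np
    from tensorflow.keras.applications.inception_v3 import (
        InceptionV3, preprocess_input, decode_predictions)
    from tensorflow.keras.preprocessing import image

    model = InceptionV3(weights="imagenet")
    img = image.load_img("example.jpg", target_size=(299, 299))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    preds = model.predict(x)
    print(decode_predictions(preds, top=3)[0])  # image-level labels, no boxes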

@forceassaulter commented Dec 5, 2019

Any good example for that?

@NiklasWilson (Author) commented Dec 5, 2019

@forceassaulter: @AntonMu is correct; there is another algorithm used for classification instead of detection. However, if you're willing to draw annotations around stuff, there is no reason why you can't use this code to do both: in the detection script, instead of drawing boxes on images, just use whether anything was detected to classify the image. That said, I believe there is a huge accuracy/efficiency loss if you use this algorithm for pure image classification.

My additional code to accept images with no drawn boxes/annotations is not the solution you want, though. It works alongside images with annotations.

@forceassaulter

@NiklasWilson Other than image classification, I also need an algorithm to draw bounding boxes on detected objects, so I was thinking of checking whether a box was drawn on the image to determine whether the object is in the image.

@AntonMu (Owner) commented Dec 6, 2019 via email

@NiklasWilson (Author)

Yes. The drawing of the boxes is actually optional. After it's trained, when you run the detection code it gives you back a list/array of detections with the class, box coordinates, and percentage of certainty. By default it draws every box, but you could use this data to "classify" the image and draw boxes around some or all of the detections as you see fit.
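
For illustration, a sketch of that idea, assuming the detector hands back a list of (label, box, score) tuples (a hypothetical shape; the project's actual return format may differ):

    # Collapse per-box detections into a binary image-level label.
    def classify(detections, target="cat", min_score=0.5):
        # detections: [(label, (x1, y1, x2, y2), score), ...]
        return any(label == target and score >= min_score
                   for label, _box, score in detections)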

@NiklasWilson (Author)

I personally have modified the detection code to run on live camera footage, with frame skipping that draws the same box across multiple changing frames, i.e. the footage stays real-time while the detection only updates every few frames.

@AntonMu added the "enhancement (New feature or request)" label Jan 15, 2020
@shahzaibraza37

@NiklasWilson How did you add negative images to the data_train.txt file? I have been looking for a solution for a long time but could not find any. If I just put image names without a class in data_train.txt, it throws an error.

@NiklasWilson (Author)

@shahzaibraza37 I check a folder for any image files not mentioned in the annotations file that gets generated. Any that are missing I add to the annotations file. (I think - I just did a quick review of my code change for that.)

Here is the actual commit; it only modifies two files, so it shouldn't be too hard to find.
It adds a '--no_object_folder' param to 'Convert_to_YOLO_format.py'
and it adds a 'zeroAnnotationPath' to 'Utils/Convert_Format.py'.

The actual code change is in 'Utils/Convert_Format.py', which gets called from 'Convert_to_YOLO_format.py'.

If you can't figure it out, let me know; maybe I can try to walk through it more.
Assuming this link is even public:
NiklasWilson@4d314d2

@shahzaibraza37

Hi @NiklasWilson,

So, I have changed the script as you mentioned, and everything has worked for my data_train.txt file.
The problem is the same! When I try to train my model it shows the same error as before:

FileNotFoundError: [Errno 2] No such file or directory: 'Y:/955-030/MA_Hafiz/Machine_learning/Training_set/Train_14_04/TrainYourOwnYOLO/Data/Source_Images/no_object/P8 (67).png \n

All the images are there in the no_object folder. Do you have any solution for it?
The error is appearing because there is no class and bounding box for these negative images. There should be some simple way of using these negative images for training the model.

@NiklasWilson (Author)

Remove spaces from all your file names.
It's literally saying that it is not seeing that file, and in my experience that was related to spaces or slashes. Different parts of the code handle the file path differently; some seem fine with it and others seem to break on it. (If you copied my code directly, it was only tested with files named a certain way.) I just mass-renamed every file.

@bertelschmitt commented May 8, 2020

How about this: let's agree on a tag for an object that will be ignored by the model. That tag could be NONE. It would be used like this: the depicted concrete pad gives me problems; the model thinks it's a cat named MrKuro. To stop the model from doing so, I would add a NONE tag to VoTT, tag that concrete pad a few times with NONE, and feed it into my training set along with all the other pictures. Once trained, the model would simply ignore, not report, and not box-in any detected NONE objects.
This way, the NONE tag could be added to a variety of undesirable objects, no special folders would need to be created, and no workflow would be interrupted. The solution could be as simple as sticking a

if predicted_class != "NONE":

into the for i, c in reversed(list(enumerate(out_classes))): loop in yolo.py

[image: falsekuro]
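
For context, roughly where such a check would sit in keras-yolo3's detect_image loop (a sketch; the exact surrounding code varies by version):

    for i, c in reversed(list(enumerate(out_classes))):
        predicted_class = self.class_names[c]
        if predicted_class == "NONE":
            continue  # swallow NONE detections: no box drawn, nothing reported
        box = out_boxes[i]
        score = out_scores[i]
        # ... draw the box and record the detection as usual ...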

@shahzaibraza37

Hi @bertelschmitt @NiklasWilson,

I did not solve the problem of negative images because of the error:
FileNotFoundError: [Errno 2] No such file or directory: 'Y:/955-030/MA_Hafiz/Machine_learning/Training_set/Train_14_04/TrainYourOwnYOLO/Data/Source_Images/no_object/P8 (67).png \n

This was even though there was not a single space in the names of the no_object training images, as suggested by @NiklasWilson.
What I did instead: I annotated the false results in the image as some other class, and my results are getting better and better.

@bertelschmitt The type of solution you are suggesting will surely improve the results a lot, because then there will be no extra/unnecessary classes.

@NiklasWilson (Author) commented May 8, 2020

@bertelschmitt
My setup doesn't use a separate folder. (I pass the folder with the annotation images as my no_object folder.) It checks the annotation file and only adds the entries that are absent... the option for the folder to be different was more of a formality.

I strongly discourage using no_object tags. It will result in decreased accuracy when your no_objects are not present. Using images with no_object tags is not the same as using images with no annotations: you will be training it to detect a singular object as a no_object, which will become increasingly inaccurate the more unique things you tag with it. It will also waste a lot of processing detecting these.
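
For illustration, the difference in data_train.txt terms (paths and the 99 class index are placeholders, matching the examples further down this thread): a NONE-tag region is still a labeled object the model trains on, while a zero-annotation line is a pure negative sample.

    /TrainYourOwnYOLO/Training_Images/vott-csv-export/background.jpg 114,279,295,464,99
    /TrainYourOwnYOLO/Training_Images/vott-csv-export/background.jpg

(The first line is the NONE-tag style; the second is the zero-annotation style.)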

@shahzaibraza37
This P8 (67).png is your issue. The code that handles file paths does not like the special characters and breaks on them; even though the error prints the full path, the thing that tried to read the path didn't handle it correctly. (File names are part of the path.)

@shahzaibraza37

@NiklasWilson I meant to say that I changed the file names to e.g. q.png, but it still gave the same error as I described earlier.

@NiklasWilson (Author) commented May 8, 2020

Look at the annotations file. I had this exact same issue originally. It's related to how the file paths are read from the annotations file. I don't remember the solution; you will need to find where that error is thrown and add a bunch of print statements anywhere a file path / filename is used, to track down the exact issue.

Also make sure that file actually exists in both the annotations file and the folder.

@NiklasWilson (Author)

Double-check that the exact path the error spits out is correct; manually follow the folders and confirm that it really matches a real file.

@bertelschmitt

@NiklasWilson, not trying to argue, just trying to understand. What I propose changes nothing in training and detection; it simply adds a very trivial post-processing step: YOLO will dutifully try to detect all objects in its model, except that when it finds one with a name chosen by myself (I chose "NONE"), that detection simply will be thrown out and not reported up the chain. I currently have 13 objects in my model, and I wouldn't think twice about adding a few more. So why should the addition of a 14th object called "NONE" wreak havoc with precision and lead to a pile of wasteful processing cycles? Please explain to me in greater detail where I'm wrong.

In any case, my solution involves one small change in detect_image in yolo.py (this is where the bounding boxes are added):

if predicted_class == "NONE":
    continue

That’s it.

It definitely is not a general solution, but it is one that could be helpful in certain narrow cases. After my model misfired depending on the position of the sun and the shade projected by the house, I stuck the two lines into my code, marked the offending parts of the image (in my case, a slab of concrete) as "NONE," and peace ensued. The computer did not work harder, or less hard, than before, but most of all, it works for me.

@shahzaibraza37

@bertelschmitt I did the same thing in my case. It works perfectly. Now I am trying to also include the negative images, to see the effect of the negative samples on the results.

@NiklasWilson Hey. I tried to train the model with zero annotations again, and this time the error looks like this:

File "E:\Train\TrainYourOwnYOLO\2_Training\src\keras_yolo3\yolo3\utils.py", line 68, in get_random_data
    line = line[1].split(" ")
IndexError: list index out of range

Do you have any recommendations?

@NiklasWilson (Author)

@bertelschmitt OK, I rethought what you're doing. The main issue would be if the no_object itself changes. I have 6 cameras on my house, and a singular no_object would not work for all 6, as that in itself would create false negatives. But I guess if your no_object is a singular background, it could work just fine; in that case both methods would have the same outcome. The specific idea behind the more general zero annotations is that it increases the accuracy of the already-learned objects by effectively unlearning data/patterns that were never meant to be learned.

@shahzaibraza37 line[1] does not exist, so either "line" is null/undefined or it doesn't have 2 items in it. The .split(" ") should work on any string, even if spaces are absent, but it has to be at least an empty string.
Look at whatever sets line.
Word of advice (not trying to be insulting): adding print statements throughout can greatly help in debugging. For instance, print where line is pulled from, then print it before and after the split statement - anywhere it might help give an idea of what is happening.
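
For illustration, the shape of a parser that tolerates zero-annotation lines (a sketch, not the project's actual utils.py; note the .strip(), which would also remove the trailing ' \n' visible in the earlier FileNotFoundError):

    import numpy as np

    def parse_annotation_line(annotation_line):
        # "path x1,y1,x2,y2,class x1,y1,x2,y2,class ..." - boxes are optional
        parts = annotation_line.strip().split()
        image_path = parts[0]  # note: breaks if the path itself contains spaces
        boxes = [list(map(int, tok.split(","))) for tok in parts[1:]]
        # A negative sample has no tokens after the path -> empty (0, 5) array
        return image_path, np.array(boxes).reshape(-1, 5)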

@shahzaibraza37

@NiklasWilson Thanks for the advice. I have solved this issue by changing some lines in 2_Training\src\keras_yolo3\yolo3\utils.py. Now I can use negative images in my training data. Thanks for the help :)

@NiklasWilson (Author)

@shahzaibraza37 Awesome, glad it worked out. So to clarify, you got it working with images that have zero annotations? Have you noticed an improvement, or are you still working on actually getting to test it out?

@shahzaibraza37

@NiklasWilson I am still testing it. From the initial results, it looks like it will not have a huge impact, but I will share my thoughts on this method after doing thorough tests.

@NiklasWilson (Author)

@shahzaibraza37 OK. I think it really depends on the content. I like to test mine on video footage such as YouTube videos: one, I like watching it in action, and two, this way I can see how it performs on data that I never considered presenting to it. I used the zero annotations to correct things it routinely thought matched. For instance, when I was just playing around with it I made it detect weapons, and it routinely thought dining chairs looked like handguns; after feeding it zero-annotation images of dining rooms, that was no longer an issue.

@shahzaibraza37

@NiklasWilson I will try to add more negative images to my training data. For now, I am using only 30 to 40 negative samples. As I said, I should do more testing on this method. :)

@bertelschmitt

Please define "negative image".
Where and how does one reference that negative image in training?

@shahzaibraza37

@bertelschmitt You have to read the thread properly; everything is explained in this thread.
In short: make a no_object folder at TrainYourOwnYOLO\Data\Source_Images and put the images/objects which you do not want to detect in the no_object folder - i.e., in your case, the concrete slab which is falsely detected during inference. Then just run everything as usual.

@NiklasWilson About the results: it improved my results greatly. I was not expecting the results to be that good after the addition of negative samples. Thanks again and cheers!

@bertelschmitt commented May 19, 2020

@shahzaibraza37: Trust me, I must have read this thread at least 20 times, and I am still scratching my head. Maybe I’ve been born dense, and I have grown denser in my old age. So, let me try to break that down a bit. Please correct any erroneous assumptions.

Goal:

I assume we are attempting to assist the YOLO algorithm by providing information of what NOT to detect and/or classify. Correct? We do so by feeding the algorithm images with that “negative” information. Correct?

The negative images:

The thread refers to these images interchangeably as “negative sample images,” “negative images,” or as “images with zero annotations.”

My confusion:

I am trying to understand what these images are supposed to be.

Normally, when we talk about “negative images,” a photographer would think about something like this:

[image: negative]

I don’t think this is what we are trying to do here. Are we rather referring to that?

[image: background]

Are we attempting to feed the algorithm a “background image” with information to be ignored, so that the algo can focus on better detecting objects we want to find and classify?

[image: cat]

Are we trying to provide those background images to enable the algo to better isolate objects we are looking for?

[image: cat_selected]

After all, when we tagged the zillions of training images, the bounding boxes always contained parts of the background image, because we never tediously select the target clip like above. Are we trying to help the algo to get rid of the background clutter?
[image: isolated]

If that is the case, are there indeed facilities in the code that accept and process these unannotated/negative/background pictures? If so, how do we supply the algo with that information?

When training the model, the central file is data_train.txt. For each picture to be used for training, data_train.txt has a line that looks somewhat like this:

/TrainYourOwnYOLO/Data/Source_Images/Training_Images/vott-csv-export/catpic.jpg 422,0,553,349,0

422,0,553,349 would be the coordinates of the actual “training image” within that picture, followed by an index into data_classes.txt. If the first entry in data_classes.txt would be “Felix,” the 0 would tell the algo that the clip at 422,0,553,349 depicts Felix.

So far, so good. Now what?

What are we to put into data_train.txt when we refer to that mystical negative image?

Is that “negative image” again an image to be clipped from a larger image? Like that concrete slab?

[image: background_tagged]

In that case, data_train.txt would get a line looking like this:

/TrainYourOwnYOLO/Data/Source_Images/Training_Images/vott-csv-export/background.jpg 114,279,295,464,99 

… where 99 would point to a “Negative Image” tag.

However, this can't be; after all, we are talking about "images with zero annotations."

So are we instead feeding the algo the whole image of my backyard (sans cats)?

[image: background]

If we could do this, it would be great, and it would be immensely better than my “NONE” hack.

If so, what is the line to put into data_train.txt? Is it this?

/TrainYourOwnYOLO/Data/Source_Images/Training_Images/vott-csv-export/catpic.jpg 

Would the algo know what to do with this? Will it understand it as “if you see this, never mind, focus on what’s not in that image?” This would be, as they say these days, awesome.

If that is the case, then we could address the gory details. Like, would it matter if details in that background image change? Like if the white sandals are dropped elsewhere? If the dish moves? If the blue door closes? If the sun throws hard shadows?

I know that's many questions. I guess the core question is: What goes into data_train.txt?

It is more likely that I completely misunderstood the matter. Please tell me then where I went wrong. Just don't tell me I'm dense. I already know that.

@NiklasWilson (Author) commented May 19, 2020

@bertelschmitt I am by no means even close to an expert on this stuff, and I am also not sure of the correct terminology. From some quick Google searching, "negative image sample" seems like the common term. I used the term "zero annotation" because we are literally feeding it images that have no annotated bounding boxes around anything, telling it that nothing in this image relates to any of our classes/objects.

In your case the ideal "negative image" would be the full image with none of the cats in it.
[image: 82310707-b4f6ba00-99ff-11ea-962e-5bf5851c31e3]
Like the one above, but I imagine you already have lots of positive images with that black blob, where it should have learned to ignore it. So in this case I would also pick a bunch of random images, crop various-sized images that include that black blob, and use them all as negative images.

Like this.
[image: 82310707-b4f6ba00-99ff-11ea-962e-5bf5851c31e3]
But I would do this in addition to images of the entire background, as those still help improve the accuracy of the cat detection.

data_train.txt should look something like this when the image "background.jpg" has no annotations:

/TrainYourOwnYOLO/Training_Images/vott-csv-export/cat95723.jpg 114,279,295,464,99
/TrainYourOwnYOLO/Training_Images/vott-csv-export/background.jpg
/TrainYourOwnYOLO/Training_Images/vott-csv-export/cat955534.jpg 114,279,295,464,99

To clarify, it should be just the file path with no coordinates.

The purpose of giving it images without any objects is twofold:

  1. It tightens the crappy bounding boxes we annotated ourselves.
  2. It learns that nothing in this image relates to any of our classes, and adjusts the learned data for all the absent classes accordingly. (I do not know the specifics of exactly how this works.)

I added code to my fork of this project that tracks every file path that is added to the data_train.txt file. Then it looks at the folder, and any files that are absent from data_train.txt are added into it as lines without coordinates.

I hope that clarifies any confusion. If you are still unsure about anything, please ask.

@bertelschmitt

@NiklasWilson: Thank you. I will try that on my next training round. I will also try to trace how and where the YOLO trainer makes use of untagged pictures; I have no shortage of those.
I am not worried about managing the pictures. In an episode of sheer madness, I have embarked on writing my own VoTT-like app that also allows:

  • Review and editing of model data (currently, it's a black box)
  • Use of YOLO to predict bounding boxes and tags
  • Tracking of objects in videos
  • Feedback to correct mis-detections through re-training
  • Addressing weak spots in the model by re-training with pictures detected with low confidence

I could add any amount of those no-tag pictures at the push of a button. I already have the rough engine that allows me to create large series of training images in a fraction of the time it takes with VoTT. Now I'm sweating over writing the user interface in Kivy...

@NiklasWilson (Author)

@bertelschmitt I run all my detections on a Raspberry Pi at 1080p@30fps. The Pi is obviously too weak to do this well, so I wrote my own video detection script that takes into account lost frames and stops detecting as needed while still drawing the original box, so the footage stays as live as possible. On slow-moving objects this works fine, as the delay is minimal (it's very noticeable on cars, as the car drives out of the drawn box).
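
For illustration, the skeleton of that kind of frame-skipping loop with OpenCV (a sketch; detect() is a hypothetical stand-in for the real model call):

    import cv2

    def detect(frame):
        # Placeholder for the model call; returns [(x1, y1, x2, y2), ...]
        return []

    cap = cv2.VideoCapture(0)  # camera index 0
    boxes, frame_no, every_n = [], 0, 5  # re-run detection every 5th frame
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        if frame_no % every_n == 0:
            boxes = detect(frame)
        for (x1, y1, x2, y2) in boxes:  # redraw the last boxes in between
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.imshow("live", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
        frame_no += 1
    cap.release()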

Is your new project going to be open source? I would be interested in potentially putting some time and effort into it. If you need a resume, here is an outdated one :)
https://niklaswilson.net

@bertelschmitt commented May 19, 2020

@NiklasWilson: I have a drawer full of Pis, but I'd never try doing this with them. I started my cat project on an Nvidia Nano, but I ditched that also, because it quickly ran out of steam and especially memory. One absolutely needs to use the Tiny YOLO flavors on these. I'm currently running constant detection on a 6-year-old PC with a 1080TI, and even that gives me only 18-20 FPS. Training runs on a likewise 1080TI-equipped Threadripper 3970, and it runs quite hot when training. I handle the lack of FPS by either cutting the incoming frame rate down to 15 FPS, or by doing a detection only on every 2nd frame while showing all of them.

Of course the project will be open source, if I'm ever no longer embarrassed to put my convoluted code in front of a tough audience. Would love to cooperate. The first step in that project was to make YOLO, while running detections, also write a log in an extended Annotations-export.csv format, adding confidence and frame number. I also record the video without boxes. Then I cut the video into stills and file them by frame number. Now I quickly have thousands of annotated pictures, which I can sort and edit, delete, and change, then use to re-train. You get the picture. That core functionality is done. I just need to put it into a GUI, and my last GUI work was with VB, some 20 years ago...

@bertelschmitt

@NiklasWilson: This is the current mock-up of what I tentatively call "Tagger." The dotted-line tag is for making your negative images. Language: Python 3.7. Image and video processing: CV2, ffmpeg. UI: Kivy (I'm 3 days into Kivy study, and 3 books are on order at Amazon.)

[image: tagger_mockup]

@robisen1

[Quotes @NiklasWilson's explanation of negative images and the data_train.txt format from above.]

@AlexeyAB, I believe the major maintainer of and contributor to darknet YOLO, specifically calls this out in the README, here: https://github.com/AlexeyAB/darknet/blob/master/README.md#how-to-improve-object-detection. The part you want to look for:
"desirable that your training dataset include images with non-labeled objects that you do not want to detect - negative samples without bounded box (empty .txt files) - use as many images of negative samples as there are images with objects"

My understanding is you're training the model to recognize an object you are interested in and comparing against a similar image without the object, to reinforce what you want to detect. There are a number of posts in the comments of the YOLO repo about this.
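
For illustration, in darknet's format that is mechanical to set up: each negative image simply gets an empty label file next to it (a sketch; "data/negatives" is a hypothetical folder):

    import os

    # Darknet pairs img.jpg with img.txt; an empty .txt marks a negative sample.
    image_dir = "data/negatives"
    for name in os.listdir(image_dir):
        stem, ext = os.path.splitext(name)
        if ext.lower() in (".jpg", ".jpeg", ".png"):
            open(os.path.join(image_dir, stem + ".txt"), "a").close()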

@bertelschmitt

I am working hard on my multi-stream, multi-GPU version, and I am beginning to see the light at the end of the tunnel. It naturally involves changes to yolo.py, mostly to the init and detect_image routines. I tried very hard to keep these changed routines compatible with current versions. Detect_video is left alone; I am handling all those streams in a rather big standalone app.

While I am at it, I would very much like to incorporate the “negative image” idea if it is still pertinent. Currently, my version of detect_image will ignore certain labels/objects if told to, but as discussed, this is not the idea here. I haven’t invested a lot of time into “negative” images, so please, help me out:

  • Would negative images (NI) need any changes to yolo.py, and if so, what are they?
  • What changes to the annotation and training workflow are needed to make NI work?
  • Do we know for sure that our YOLO implementation understands NI, and that NI improves its detection quality?

I made a haphazard attempt that involved a few entries in Annotations-export.csv that had images, but no labels, and it broke Convert_to_YOLO_format.py, so I dropped it.

Ideally, and if NI makes a difference, NI should be melded into the workflow at the VOTT level. I’d love to tag an image, or a large part of the image, with a pre-arranged tag (such as “NI”) and Convert_to_YOLO_format.py/Train_YOLO.py would do the rest.

However, I note that the NI idea has not made it into the current version of TrainYourOwnYOLO, and that’s what my work is based upon.

@robisen1 commented Aug 6, 2020

[Quotes @bertelschmitt's questions about negative images from the previous comment.]

Not sure if you already saw this, which was embedded earlier in the thread: NiklasWilson@4d314d2

Essentially, you have to make some minor changes. Note, though, that Anton's code has been refactored some, which may make comparing Niklas's changes a little harder, since his code base is older. I am also not sure whether you saw this just before your post, but you should read it: https://github.com/AlexeyAB/darknet/blob/master/README.md#how-to-improve-object-detection. There is an interesting difference from YOLO not detecting objects in a grid cell during training: in that case, when it gets a detection wrong, it corrects itself. In the case of a negative sample, you're training YOLO to essentially exclude samples, or sample features, like the negative sample. This sounds like the same thing, but it is not. The best way to look at this is to train on, say, a 100-image dataset from scratch and measure performance, then do the same with appropriate negative examples. If that works well on your data, then you should use it. I think some people have mentioned that images with lots of objects do not seem to gain from negative samples, but I have not tested this.

@jrbastien commented Mar 21, 2021

Hi,

I have incorporated NiklasWilson's code and also corrected utils.py in order to be able to use the lines with no bounding boxes in data_train.txt.

I can confirm that it eliminated the false detections from the test set I had. Worth noting, though: some objects where the confidence was about 75% went down to 50%. I guess the experience may vary depending upon the type of objects you try to recognize and how many classes you have. In my case, I'm detecting a single object.

Not sure why it was never merged. Perhaps because the code in utils.py was never included to support the zero-annotation lines. So I corrected it and made a pull request (#221).

Also, there was a message "Ignoreable error: Folder/File could not be accessed, may not exist ->" that I thought could mislead the user, so I improved it to say "Sometimes it is desirable that your training dataset include images with non-labeled objects that you do not want to detect (i.e. negative samples without bounding box). Non-annotated images can be placed in -> /home/username/Src/TrainYourOwnYOLO/Data/Source_Images/no_object. However, this is not mandatory, you can now continue with the training."

I think that it is complete and ready to use. Feel free to merge into master if you like. Big thanks to you, Anton, for this project, and to Niklas for the change.

@AntonMu added the "wontfix (This will not be worked on)" label Apr 25, 2022