
How do I evaluate accuracy of the test set? #613

Closed
offchan42 opened this issue Mar 5, 2018 · 34 comments

@offchan42
Contributor

One way is to calculate the mean average precision (mAP).
But I'm not aware of this feature being implemented in darkflow.
Do you have any suggestions on which darknet repo or person has written a script to do this?

@offchan42 offchan42 changed the title How do I evaluate the test set or validation set? How do I evaluate accuracy of the test set? Mar 5, 2018
@sandeeprepakula

https://github.com/thtrieu/darkflow/blob/master/darkflow/utils/box.py has the basic implementation needed for computing the overlap between two boxes. You can modify it as per your need and compute the mean average precision as a multiplication of the class probability and box_iou.
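For illustration, the overlap computation mentioned above can be sketched standalone on corner-format (x1, y1, x2, y2) boxes; this is a hypothetical helper written for this thread, not darkflow's actual box.py API:

```python
def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Width/height of the intersection rectangle (clamped at zero for
    # non-overlapping boxes).
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1])
             - inter)
    return inter / union if union > 0 else 0.0
```

A detection is then typically counted as correct when its box_iou against a same-class ground-truth box exceeds some threshold (commonly 0.5).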

@offchan42
Contributor Author

offchan42 commented Mar 7, 2018

Hasn't anyone done this before?
Does no one test their accuracy?
I don't know why such an important feature is not yet implemented.
If there is none, maybe I'll need to implement it myself and contribute it.

@bravma

bravma commented Mar 8, 2018

I would also be interested in this feature.

@ilayoon

ilayoon commented Mar 12, 2018

@off99555, I saw someone working on the F1 score:
#377

@andikira

Hello @off99555, I am searching for how to calculate mAP on my own test data too.
I think @sandeeprepakula has a good answer. Can you explain step by step how to calculate mAP from the box IoU? I'm very new to deep learning and object detection.

Thanks for your help

@offchan42
Contributor Author

@andikira I've not done it yet. I'm not sure either.

@andikira

andikira commented Mar 30, 2018

Hello @off99555, you have to see this repo: https://github.com/Cartucho/mAP. Look in the extra folder; you can convert .xml and .json files there. That repo works perfectly with darkflow.

@bareblackfoot

bareblackfoot commented Apr 18, 2018

@andikira @thtrieu I trained the yolo2-voc network on the PASCAL VOC 0712 trainval set and tested on the PASCAL VOC 2007 test set, and the mAP is only around 52% (after quite a lot of epochs (~200), considering that I initialized with pretrained weights). That is well below the official performance (76.8% mAP). I want to include experimental results using YOLO v2 in my paper. Does anyone know the reason?

@andikira

I have seen that issue before; please check the other issues, @bareblackfoot.

@offchan42
Contributor Author

@andikira The mAP repo works wonderfully. I contributed a script there too. Thank you!

@srhtyldz

> @andikira The mAP repo works wonderfully. I contributed a script there too. Thank you!

Could you please explain more specifically? How does it work? How did you get the results?

@offchan42
Contributor Author

@srhtyldz Go to https://github.com/Cartucho/mAP and read the quickstart section.
The idea is that you clone the repo, predict your images using darkflow, and then copy the prediction files into the mAP folder. Also copy the ground-truth files.
Then do everything as the README of the mAP repo suggests, and you should see mAP as a single-number output.

@offchan42
Contributor Author

Also, you need to convert your files from darkflow's format to their format. They provide the conversion script here: https://github.com/Cartucho/mAP/tree/master/extra
Read the README.

@srhtyldz

> Also, you need to convert your files from darkflow's format to their format. They provide the conversion script here: https://github.com/Cartucho/mAP/tree/master/extra
> Read the README.

I found out that darknet has a map option. Did you try that? Is there a difference between them, or which one is more accurate?

@offchan42
Contributor Author

@srhtyldz I think the algorithm is the same, except that mAP has some parameters you can set. Usually these are set manually by the writer of the algorithm, so the only time they report different numbers is when they used different parameters.
I suggest you read an explanation of how mAP works so that you understand its hidden parameters.
I tried them both. I went with darknet in my thesis because darkflow does not have an option to specify the output folder. That was critical for my senior project because I ran the code on FloydHub, a deep-learning cloud that only allows you to use certain folders.

@offchan42
Contributor Author

First I used darkflow with the mAP repo, then later found out that I could not specify the output path, so I switched to darknet and used its internal mAP computation.

@offchan42
Contributor Author

And most importantly, the owner of the darknet fork (AlexeyAB's version) is very helpful. He answered all the issues I posted, which helped me finish before the deadline. It was very stressful back then. I am very thankful to him; I also credited him in my thesis.

@srhtyldz

> First I used darkflow with the mAP repo, then later found out that I could not specify the output path, so I switched to darknet and used its internal mAP computation.

Which variable do you want to specify?

> First I used darkflow with the mAP repo, then later found out that I could not specify the output path, so I switched to darknet and used its internal mAP computation.

My vote is also for darknet. I think you can only adjust the threshold when calculating mAP, right?

@offchan42
Contributor Author

> Which variable do you want to specify?

@srhtyldz Darkflow, about half a year ago when I started this thread, did not allow me to specify the output folder path. The output folder path is where the prediction images are written.
If I remember correctly, the output folder is set to an out folder inside the folder of your dataset.
E.g. if you have some images inside img/, the predictions will be inside img/out/.
In FloydHub, they don't allow you to write output to the dataset folder. So if the dataset folder is img/, you can't create a new folder img/out, because you will get a permission-denied error.

Maybe it allows you to specify the output path now, but you have to check. I am not sure.

> My opinion is also for darknet. I think you can only adjust the threshold when calculating mAP, right?

If I remember correctly, mAP has one parameter that the evaluator needs to define. It is a threshold from 0 to 1 used to decide whether a detection is correct. Usually this is set to 0.5, so if the overlap ratio goes above 0.5, the detection counts as correct.
After that you also have to define the average precision. To find the average, you need to decide how many precision samples there are. It turns out you can also define the number of samples. Usually this is set to 11, so they line up at 0, 0.1, 0.2, ..., 0.9, 1.0.

I am not sure about what I'm saying, though, because I have forgotten most of it. You should study it to make sure I'm right. But I'm pretty sure about the 0.5 and the 11; I just don't remember what they are called.

PS. I went back to find my thesis slide (from when I knew what mAP means); here is how I described it back then:
[thesis slide image]
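The 0.5 and 11 mentioned above are the PASCAL VOC conventions: an IoU threshold of 0.5 decides whether a detection counts as correct, and AP is the interpolated precision averaged over 11 recall levels. A minimal sketch of that 11-point interpolation (a hypothetical helper, assuming you already have recall/precision pairs from your detector):

```python
def eleven_point_ap(recalls, precisions):
    """VOC-style 11-point AP: average the interpolated precision
    (the max precision at any recall >= r) at r = 0.0, 0.1, ..., 1.0."""
    ap = 0.0
    for i in range(11):
        r = i / 10.0
        # Interpolated precision: best precision achievable at recall >= r.
        candidates = [p for rec, p in zip(recalls, precisions) if rec >= r]
        ap += max(candidates) if candidates else 0.0
    return ap / 11.0
```

mAP is then the mean of this AP over all classes. Changing the IoU threshold (the 0.5) or the sampling scheme (the 11 points) changes the reported number, which is why different tools can report different scores for the same model.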

@offchan42
Contributor Author

[image]

@srhtyldz


Thanks a lot. I'll research mAP and IoU. I got the results but I couldn't interpret them. Can I ask one more question? Did you use the 'recall' function? Do the map and recall functions have the same functionality or not? Do you have any ideas?

@offchan42
Contributor Author

@srhtyldz mAP is calculated from precision as a function of recall, so precision and recall are already included. mAP is a popular, conventional evaluation metric for object-detection tasks, so you don't have to compute recall or precision individually. Just compute mAP; if it's high, your model performs well.
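To make "precision as a function of recall" concrete, here is a minimal sketch (a hypothetical helper): sort all detections by descending confidence, mark each as a true or false positive against the ground truth (the IoU-matching step is assumed already done), and accumulate:

```python
def precision_recall_curve(is_tp, num_gt):
    """Given detections sorted by descending confidence (is_tp[i] is True
    for a true positive), return the cumulative precision and recall lists."""
    precisions, recalls = [], []
    tp = fp = 0
    for hit in is_tp:
        if hit:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))  # fraction of detections so far that are correct
        recalls.append(tp / num_gt)        # fraction of ground-truth objects found so far
    return precisions, recalls
```

AP for one class is the area under (an interpolation of) this curve, and mAP averages the AP over classes, which is why precision and recall are baked in.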

@srhtyldz

> @srhtyldz mAP is calculated from precision as a function of recall, so precision and recall are already included. mAP is a popular, conventional evaluation metric for object-detection tasks, so you don't have to compute recall or precision individually. Just compute mAP; if it's high, your model performs well.

I got 90.4% mAP. I think it performs well.

@offchan42
Contributor Author

offchan42 commented Dec 18, 2018

@srhtyldz That's very high if you have many classes (popular models on big datasets only reach around 50-60%). But even if you have only two classes, it's still good.
Make sure to check the predicted images with your own eyes to verify that the mAP is reasonable.

@srhtyldz

> @srhtyldz That's very high if you have many classes (popular models on big datasets only reach around 50-60%). But even if you have only two classes, it's still good.
> Make sure to check the predicted images with your own eyes to verify that the mAP is reasonable.

I have only one class. Also, I saved all the images with bounding boxes.
Did you draw the mAP chart? In Alexey's repo there is an mAP chart, and while training your model you should pass '-map' to get the chart. I didn't pass this flag. Even if you pass it, how do you get the mAP chart? Did you try it? Also, did you try the 'valid' function? Do you know the differences?
I know I asked many questions, but your answers are very important to my thesis :).

@offchan42
Contributor Author

offchan42 commented Dec 20, 2018

From darknet, I just evaluated mAP in its terminal and took a snapshot of the console.
I don't know about the valid function.
For the mAP chart, it's possible if you use darkflow with the mAP repo above. There is a script which generates the animation and the charts.
The mAP repo may also work with darknet.

@srhtyldz

> From darknet, I just evaluated mAP in its terminal and took a snapshot of the console.
> I don't know about the valid function.
> For the mAP chart, it's possible if you use darkflow with the mAP repo above. There is a script which generates the animation and the charts.
> The mAP repo may also work with darknet.

The problem is that I don't know how to run the animation. I tried to do it while training but I couldn't get the mAP animation. Maybe @AlexeyAB can help with this.

@reethimalisetty

I got the mAP of my test results. Out of 56 images, 48 were correct in my testing. I have four classes. How do I make a confusion matrix from that? Can someone help me?

@musimab

musimab commented Apr 28, 2020

@srhtyldz I used ./darknet detector map cfg/aerial.data cfg/test_tiny.cfg backup\my_yolov3_tiny_final.weights but I could not get any score or graph at the end of training. How did you get the mAP score in darknet?

@colinsenner

@mustafabuyuk Pass the -map argument to ./darknet (darknet.exe). The mAP will show up on the chart after around 1000 iterations. I'm not sure if it's needed, but I also have a validation set of images.

@irfan-gif

Does anyone have an idea? YOLOv5 calculates mAP against the validation dataset during training. With which command can we calculate mAP (and class-wise mAP) against an unseen test dataset?

@sandeepmohanadasan

@off99555 The thread seems closed. Did you get the answer?
I have a set of images and YOLO annotation files (in txt format) for validation.

How do I use the -map argument when validating the images to get the mAP score? Will it be possible to derive the confusion matrix from this?

@offchan42
Contributor Author

offchan42 commented Jul 6, 2021

@sandeepmohanadasan Read my answer directly before I closed the issue. That's how you can calculate mAP.
But I would suggest further: don't use this repo anymore. I haven't worked with object detection for a few months now, so I'm not sure which repo is the best, but from a quick look, the last commit of this repo was in Feb 2020. Do you think a repo that hasn't been updated for a year is worth using? If you get stuck on something, no one will help you. Consider the community as well when choosing which repo to use.

@sandeepmohanadasan

@off99555 Thanks for the suggestion
