
issue about the thumos14_test_normalized_proposal_list.txt #13

Open
suhaisheng opened this issue Oct 29, 2017 · 18 comments

Comments

@suhaisheng

Hello, I am trying to reproduce your work. For convenience (computational cost), I only use the 213 videos that are actually used for testing by the THUMOS14 evaluation toolkit. However, I think there may be something wrong with the ground-truth annotations in the thumos14_tag_test_normalized_proposal_list.txt file. For example, please check the ground-truth annotations of the following three videos: video_test_0001292, video_test_0000270, video_test_0001496.
In your .txt file these three videos are marked as negative, with 0 ground-truth instances, yet in the THUMOS14 test annotations all of them contain several ground-truth action instances. As a result, when I run the SSN test script, the number of evaluated videos drops from 213 to 210, and my reproduced results are lower than those reported in the paper (about 1.5% difference). Looking forward to your reply, thanks!
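
For reference, this is roughly how I cross-checked those three videos against the official THUMOS14 temporal annotations. It is only a sketch: the annotation directory path is a placeholder, and I am assuming the standard per-class `<ClassName>_test.txt` files where each line reads `video_name start end`.

```python
import glob
import os
from collections import Counter

# Placeholder path to wherever the THUMOS14 test temporal annotations are unpacked.
ann_dir = 'TH14_Temporal_Annotations_Test/annotation'

gt_counts = Counter()
for ann_file in glob.glob(os.path.join(ann_dir, '*_test.txt')):
    with open(ann_file) as f:
        for line in f:
            fields = line.split()
            if fields:
                gt_counts[fields[0]] += 1  # one ground-truth instance per line

for vid in ['video_test_0001292', 'video_test_0000270', 'video_test_0001496']:
    print(vid, gt_counts.get(vid, 0), 'ground-truth instances')
```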

@yjxiong
Owner

yjxiong commented Oct 29, 2017

If I remember correctly, these 3 videos have incorrect annotations that fall outside the videos' time spans.

In terms of testing results, do you have specific numbers and settings for me to look at?

@suhaisheng
Author

OK! First, I only use the 213 videos that actually matter in the evaluation process, and run the command:
python gen_proposal_list.py thumos14 ./thumos14/Test/
because the directory structure of my dataset is a little different from yours
(something like Test/[img,flow_x,flow_y]/video_name/%frame_id.jpg). So I changed your _load_image(self, directory, idx) function to adapt it to my dataset structure and load the corresponding images; a sketch of the adapted loader is below.
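
Concretely, my change looks roughly like this. The attribute names (self.root_path, self.modality) and the 5-digit frame-name format are assumptions following the usual TSN-style loader, so the exact repository code may differ:

```python
import os
from PIL import Image

def _load_image(self, directory, idx):
    # My layout: Test/{img,flow_x,flow_y}/<video_name>/<frame_id>.jpg,
    # so frames are selected by sub-folder ('img', 'flow_x', 'flow_y')
    # instead of by a filename prefix.
    if self.modality == 'RGB':
        path = os.path.join(self.root_path, 'img', directory,
                            '{:05d}.jpg'.format(idx))
        return [Image.open(path).convert('RGB')]
    else:  # optical flow: one x image and one y image per frame
        x_path = os.path.join(self.root_path, 'flow_x', directory,
                              '{:05d}.jpg'.format(idx))
        y_path = os.path.join(self.root_path, 'flow_y', directory,
                              '{:05d}.jpg'.format(idx))
        return [Image.open(x_path).convert('L'),
                Image.open(y_path).convert('L')]
```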
Then I continue the reproduction by running the testing command:
python ssn_test.py thumos14 RGB none result_inceptionv3_thumos14_imagenet_rgb.npz --arch InceptionV3 --use_reference
My evaluation command is as follows:
python eval_detection_results.py thumos14 result_inceptionv3_thumos14_imagenet_rgb.npz
The RGB modality result is:

| IoU thresh | 0.10 | 0.20 | 0.30 | 0.40 | 0.50 | 0.60 | 0.70 | 0.80 | 0.90 | Average |
|------------|------|------|------|------|------|------|------|------|------|---------|
| mean AP | 0.3946 | 0.3476 | 0.2862 | 0.2157 | 0.1470 | 0.0896 | 0.0488 | 0.0222 | 0.0039 | 0.1729 |
The Flow modality testing and evaluation procedure is almost the same as above. The Flow modality result is:

| IoU thresh | 0.10 | 0.20 | 0.30 | 0.40 | 0.50 | 0.60 | 0.70 | 0.80 | 0.90 | Average |
|------------|------|------|------|------|------|------|------|------|------|---------|
| mean AP | 0.4541 | 0.4085 | 0.3530 | 0.2883 | 0.2184 | 0.1456 | 0.0851 | 0.0365 | 0.0076 | 0.2219 |
To get the final RGB+Flow modality result, I run the command:
python eval_detection_results.py thumos14 result_inceptionv3_thumos14_imagenet_rgb.npz result_inceptionv3_thumos14_imagenet_flow.npz
The result is as follows:

| IoU thresh | 0.10 | 0.20 | 0.30 | 0.40 | 0.50 | 0.60 | 0.70 | 0.80 | 0.90 | Average |
|------------|------|------|------|------|------|------|------|------|------|---------|
| mean AP | 0.5599 | 0.5033 | 0.4351 | 0.3479 | 0.2622 | 0.1680 | 0.0946 | 0.0437 | 0.0086 | 0.2692 |
So in the end my reproduced RGB+Flow result at 0.5 IoU is 0.2622, which is somewhat lower than the 28.00 (29.8*) reported in the paper. Could you explain why this happens, or how to avoid it? I would appreciate any advice!

@yjxiong
Owner

yjxiong commented Oct 31, 2017

The RGB and flow performance you got are both lower than the reference results on our machine. My guess is that this is due to your modified data loading routines. If you upload the generated proposal lists, I may be able to help you.

@suhaisheng
Author

@bityangke

Hi Yuanjun,
If I flip the optical flow images the same way as RGB images when training the optical flow network, will that make a big difference in the result?
I noticed that in the Caffe version, you seem to flip the optical flow images directly, like RGB images.

@yjxiong
Owner

yjxiong commented Oct 31, 2017

@bityangke
If you mean the Caffe version of TSN, we do invert the pixel values after flipping the optical flow images. Although the performance difference is not significant, it makes the system technically sound and may help in cases where this flipping matters.
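
For intuition, here is a minimal sketch of such flip-aware flow handling. It is illustrative only, not the exact TSN code, and it assumes the flow is stored as grayscale images with displacements centered at 128:

```python
from PIL import ImageOps

def flip_flow_pair(flow_x, flow_y):
    """Mirror a (flow_x, flow_y) grayscale image pair horizontally.

    Mirroring reverses horizontal motion, so the x channel must also be
    inverted (255 - value); the y channel only needs the spatial flip.
    """
    flow_x = ImageOps.invert(ImageOps.mirror(flow_x))
    flow_y = ImageOps.mirror(flow_y)
    return flow_x, flow_y
```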

@shuangshuangguo

@suhaisheng Hi, I got a worse result for the flow modality, but I don't know how to fix it. Could you please share your thumos14_flow_score.npz? I just want to verify where the problem is. Thank you very much!
Also, this is my issue, #12, could you take a look?

@suhaisheng
Author

thumos14_bn_flow_score.zip

This is my reproduced result for the Flow modality (unzip it to get the .npz file). Note that it is still about 1.7% lower than the paper.

@shuangshuangguo

@suhaisheng Thank you very much!!
But it's so weird: when I use your .npz file and proposal file to evaluate the flow result, the result still differs from yours. My flow result is:

Detection Performance on thumos14
+------------+--------+--------+--------+--------+--------+--------+--------+--------+--------+---------+
| IoU thresh | 0.10   | 0.20   | 0.30   | 0.40   | 0.50   | 0.60   | 0.70   | 0.80   | 0.90   | Average |
+------------+--------+--------+--------+--------+--------+--------+--------+--------+--------+---------+
| mean AP    | 0.4344 | 0.3977 | 0.3406 | 0.2765 | 0.2083 | 0.1470 | 0.0853 | 0.0340 | 0.0048 | 0.2143  |
+------------+--------+--------+--------+--------+--------+--------+--------+--------+--------+---------+

eval_detection_results.py only needs two files, so if we use the same files we should get the same result. Maybe I should check my code again.
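
For example, a quick sanity check along these lines could confirm whether the two score files really hold identical arrays. This is just a sketch: it only relies on .npz being a zip archive, and the file names are illustrative.

```python
import hashlib
import zipfile

def entry_digests(npz_path):
    # An .npz file is a zip archive with one .npy member per stored array,
    # so hashing each member is a cheap way to compare two score files.
    with zipfile.ZipFile(npz_path) as archive:
        return {name: hashlib.md5(archive.read(name)).hexdigest()
                for name in archive.namelist()}

mine = entry_digests('my_flow_score.npz')              # illustrative file name
theirs = entry_digests('thumos14_bn_flow_score.npz')   # the shared score file

print('identical:', mine == theirs)
for name in sorted(set(mine) | set(theirs)):
    if mine.get(name) != theirs.get(name):
        print('differs or missing:', name)
```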

@suhaisheng
Author

The result is the same as mine (you may have mistaken my other experiment's result (--arch InceptionV3) for it). What I am curious about is why the RGB modality result you got is higher than mine, and even higher than the authors' result listed in the paper.

@Tord-Zhang

@suhaisheng Have you figured out why? I found that in thumos14_tag_val_normalized_proposal_list.txt there are also many videos with no ground truth. That just makes no sense.

@jiaozizhao

Hi, @suhaisheng. Could you explain the meaning of each line in thumos14_tag_val_normalized_proposal_list.txt?

@yjxiong
Owner

yjxiong commented Feb 5, 2018

@jiaozizhao

Hi @yjxiong. Thanks. Could you also explain the output produced by ssn_test.py? I noticed there are four arrays for each video; could you explain them? If I want to visualize the results, what information should I use? Thanks.

@yjxiong
Owner

yjxiong commented Feb 7, 2018

@jiaozizhao

Hi @yjxiong. Thank you very much, and sorry for not reading the code carefully; I was in a hurry. I will read it.

@quanh1990

Hello, do you know how to generate this normalized_proposal_list.txt for other videos?
@suhaisheng

@mrlihellohorld

> Hello, do you know how to generate this normalized_proposal_list.txt for other videos?
> @suhaisheng

Have you resolved it?
