
Prediction logs #358

Open · fmigneault wants to merge 5 commits into main

Conversation

fmigneault

I would like to contribute a piece of code I added to log video predictions to a file.
This is useful for extracting raw actions (class labels and confidences) over video segments.

The idea is simple. I override the draw_clip_range method of the demo visualizer to report predicted actions and bounding boxes to a predictions.log file instead of drawing them onto the output video frames.

To preserve the original behavior of the demo visualizer, I add an option DEMO.OUTPUT_DISPLAY that by default runs the original demo code (e.g., display the video in a window or write it to a file). When set to False, the logging override is used instead: whatever would otherwise be drawn on the frames is written in text form to a log file under OUTPUT_DIR.
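
For reference, here is a rough sketch of what the override looks like. The draw_clip_range signature, the class_names attribute, and the constructor plumbing are assumptions based on the SlowFast demo visualizer; the actual patch may differ in the details.

```python
# Sketch only: assumes SlowFast's VideoVisualizer exposes `class_names` and a
# `draw_clip_range(frames, preds, bboxes=None, ..., draw_range=None)` method.
import os

import torch

from slowfast.visualization.video_visualizer import VideoVisualizer


class LoggingVideoVisualizer(VideoVisualizer):
    """Write predicted actions and boxes to a log file instead of drawing them."""

    def __init__(self, *args, output_dir=".", log_top_k=5, **kwargs):
        super().__init__(*args, **kwargs)
        self._log_path = os.path.join(output_dir, "predictions.log")
        self._log_top_k = log_top_k
        self._clip_idx = 0

    def draw_clip_range(self, frames, preds, bboxes=None, draw_range=None, **kwargs):
        start, end = draw_range if draw_range is not None else (0, len(frames) - 1)
        with open(self._log_path, "a") as log:
            log.write("{:04d} [{:08d}, {:08d}]:\n".format(self._clip_idx, start, end))
            if bboxes is not None:
                # preds: (num_boxes, num_classes) -> top-k scores/classes per box
                scores, classes = torch.topk(preds, k=self._log_top_k)
                for box, s, c in zip(bboxes.tolist(), scores.tolist(), classes.tolist()):
                    names = [self.class_names[i] for i in c]
                    log.write(
                        "    bbox: {}, is predicted to class [{:.2f}] {}, "
                        "top-k={}: {}, {}\n".format(
                            [round(v, 2) for v in box], s[0], names[0],
                            self._log_top_k, names, [round(v, 4) for v in s],
                        )
                    )
        self._clip_idx += 1
        return frames  # frames pass through untouched; nothing is drawn
```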

The resulting log will be similar to the following:

0000 [00000000, 00000063]:
    bbox: [490.73, 83.06, 664.0, 415.62], is predicted to class [0.96] stand, top-k=5: ['stand', 'listen to (a person)', 'watch (a person)', 'talk to (e.g., self, a person, a group)', 'carry/hold (an object)'], [0.9636, 0.5954, 0.5248, 0.3115, 0.2907]
    bbox: [152.0, 125.62, 306.25, 307.59], is predicted to class [0.32] talk to (e.g., self, a person, a group), top-k=5: ['talk to (e.g., self, a person, a group)', 'carry/hold (an object)', 'stand', 'watch (a person)', 'listen to (a person)'], [0.3207, 0.2872, 0.2723, 0.2469, 0.2244]
    bbox: [384.1, 90.81, 522.25, 281.81], is predicted to class [0.85] stand, top-k=5: ['stand', 'talk to (e.g., self, a person, a group)', 'listen to (a person)', 'carry/hold (an object)', 'watch (a person)'], [0.8543, 0.503, 0.503, 0.433, 0.3979]
0001 [00000000, 00000063]:
    bbox: [56.78, 62.19, 359.23, 418.54], is predicted to class [0.42] stand, top-k=5: ['stand', 'watch (a person)', 'sit', 'listen to (a person)', 'talk to (e.g., self, a person, a group)'], [0.4152, 0.3692, 0.3509, 0.2042, 0.1815]
    bbox: [300.56, 73.32, 658.27, 400.85], is predicted to class [0.72] stand, top-k=5: ['stand', 'watch (a person)', 'listen to (a person)', 'carry/hold (an object)', 'talk to (e.g., self, a person, a group)'], [0.7185, 0.4697, 0.4342, 0.4063, 0.2994]
0002 [00000064, 00000127]:
    bbox: [73.31, 63.27, 367.26, 430.84], is predicted to class [0.67] stand, top-k=5: ['stand', 'listen to (a person)', 'talk to (e.g., self, a person, a group)', 'sit', 'watch (a person)'], [0.6749, 0.3374, 0.2068, 0.2064, 0.1532]
    bbox: [325.06, 80.53, 645.83, 420.41], is predicted to class [0.77] sit, top-k=5: ['sit', 'talk to (e.g., self, a person, a group)', 'listen to (a person)', 'watch (a person)', 'stand'], [0.7728, 0.4179, 0.2485, 0.2396, 0.1557]
0003 [00000064, 00000127]:
    bbox: [299.5, 107.28, 712.75, 428.47], is predicted to class [0.55] stand, top-k=5: ['stand', 'watch (a person)', 'sit', 'walk', 'carry/hold (an object)'], [0.5498, 0.3916, 0.18, 0.154, 0.1356]
[...]

Each newly sampled "clip section" is marked with <clip/task-id> [<start-frame>, <end-frame>], followed by the predicted actions for each detected bounding box.

The above results were obtained using an AVA checkpoint and classes, with the Detectron2 predictor for bounding boxes.
The top-k mode with k=5 was used to generate these results, but the output adjusts accordingly for the thres mode or other values of k, in the same manner as the original visualizer.
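
In case someone wants to post-process these logs, a quick parser matching the sample output above could look like this (a sketch based only on the lines shown, not a formal format spec):

```python
# Sketch: parse predictions.log lines like the sample above into dict records.
import ast
import re

CLIP_RE = re.compile(r"^(?P<clip>\d+) \[(?P<start>\d+), (?P<end>\d+)\]:")
PRED_RE = re.compile(
    r"bbox: (?P<bbox>\[[^\]]*\]), is predicted to class \[(?P<score>[\d.]+)\] "
    r"(?P<label>.+?), top-k=(?P<k>\d+): (?P<labels>\[.*?\]), (?P<scores>\[.*?\])$"
)


def parse_predictions(path):
    records, clip = [], None
    with open(path) as log:
        for line in log:
            line = line.rstrip()
            header = CLIP_RE.match(line)
            if header:
                # new clip section: remember its id and frame range
                clip = {key: int(val) for key, val in header.groupdict().items()}
                continue
            pred = PRED_RE.search(line)
            if pred and clip is not None:
                records.append({
                    **clip,
                    "bbox": ast.literal_eval(pred["bbox"]),
                    "label": pred["label"],
                    "score": float(pred["score"]),
                    "topk_labels": ast.literal_eval(pred["labels"]),
                    "topk_scores": ast.literal_eval(pred["scores"]),
                })
    return records
```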

@facebook-github-bot added the CLA Signed label Jan 18, 2021
@doursand commented Mar 1, 2021

@fmigneault thanks a lot, this is precisely what I was doing on my side as well, as I also need this functionality :-)
Just one suggestion though: it might be a good idea to organize the output in a more csv-ish format so that it could easily be fed into a pandas DataFrame. Also, I think it could be interesting to have an option to produce both the demo output file AND the predictions stored in a CSV at the same time. But regardless, this is already good stuff, so thanks!

@fmigneault (Author)

@doursand

> it might be a good idea to organize the output in a more csv-ish format

I agree. The format could be adjusted to facilitate parsing. The format I proposed was good enough for my needs, but it should be straightforward to add an option to select the output format and have it log items line by line with the corresponding values of each prediction.
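
For instance, a line-per-prediction CSV variant could be as simple as the following sketch (the column set here is hypothetical, purely for illustration):

```python
# Hypothetical csv-ish logger: one row per (box, ranked class) prediction.
import csv


def log_predictions_csv(path, clip_idx, start, end, bboxes, names, scores):
    """Append rows that load directly into a pandas DataFrame."""
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for box, box_names, box_scores in zip(bboxes, names, scores):
            for rank, (name, score) in enumerate(zip(box_names, box_scores)):
                writer.writerow([clip_idx, start, end, *box, rank, name, round(score, 4)])


# e.g.: pd.read_csv("predictions.csv", names=[
#     "clip", "start", "end", "x1", "y1", "x2", "y2", "rank", "label", "score"])
```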

fmigneault added a commit to crim-ca/FrVD that referenced this pull request Aug 18, 2021