
Any tips on how to use Norfair for re-identification across cameras #312

Closed
GeorgePearse opened this issue Apr 10, 2024 · 1 comment
Labels: help wanted (Extra attention is needed)

Comments

@GeorgePearse

I work on a project with 15 cameras, each covering a neighbouring area.

We send a notification to the client the first time we see an object. Normally an object appears in only one of those cameras, but it would be nice to guard against the case where it moves across all 15 and avoid sending 15 notifications. How could we go about implementing this?

I can't really work out how to think about it. Re-identification is key, but how do we share a kind of re-identification pool across all of the streams?

GeorgePearse added the help wanted label on Apr 10, 2024
@aguscas
Collaborator

aguscas commented Apr 10, 2024

Hello! We are working on multi-camera support for Norfair in this pull request. That pull request is still waiting to be reviewed before it can be merged, and that may take a while since the rest of the team is a little busy at the moment. Either way, if you don't want to wait, you can try that PR yourself; just keep in mind that nobody other than me has tested it yet, so you might run into some problems. Don't hesitate to ask me if you need any help with it.

I made a demo in which the user first uses a UI to associate coordinates between the different videos (to create a common reference frame for all of them), and that information is then used to match the trackers. Since you mention that there is practically no overlap between the regions recorded by your cameras, you should compare only the embeddings of the objects (i.e. how they look) rather than their spatial positions.
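To make that comparison possible, each detection needs to carry an appearance embedding. Here is a minimal sketch (not taken from the PR) of how you could attach one to a Norfair Detection, assuming you are on a version whose Detection accepts an embedding argument; the get_appearance_embedding helper is hypothetical and uses a simple color histogram, which you would probably replace with a proper ReID model:

```python
import cv2
import numpy as np
from norfair import Detection

def get_appearance_embedding(frame, bbox):
    # Hypothetical helper: a normalized color histogram of the object's crop.
    # A dedicated ReID network would give much more robust embeddings.
    x1, y1, x2, y2 = [int(v) for v in bbox]
    crop = frame[y1:y2, x1:x2]
    hist = cv2.calcHist([crop], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def to_detection(frame, bbox, score):
    # Store the embedding on the Detection so it travels with the tracked object.
    points = np.array([[bbox[0], bbox[1]], [bbox[2], bbox[3]]])
    return Detection(
        points=points,
        scores=np.array([score, score]),
        embedding=get_appearance_embedding(frame, bbox),
    )
```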

For that you will need to make some adjustments to the demo, such as removing the parts where I set and use the initial_transformations variable (which defines the common reference frame). The distance function used by the MultiCameraClusterizer should also rely only on the embeddings and not on the spatial position (in the demo you can see that I defined clusterier_distance, which measures spatial position with normalized_foot_distance and, when objects are close, compares their appearance with the embedding_distance function).
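As a rough illustration (again, not the code from the PR), an embedding-only distance could look something like the sketch below. It assumes, as in the demo, that the distance callable receives two tracked objects and that each one carries the embedding of its last detection; you would pass it to the MultiCameraClusterizer in place of the spatial clusterier_distance:

```python
import numpy as np

def embedding_only_distance(tracked_object_1, tracked_object_2):
    # Cosine distance between the appearance embeddings of the two objects'
    # most recent detections (0 = identical appearance, 1 = unrelated).
    emb1 = tracked_object_1.last_detection.embedding
    emb2 = tracked_object_2.last_detection.embedding
    if emb1 is None or emb2 is None:
        return 1.0  # no appearance information, treat as maximally distant
    emb1 = np.asarray(emb1, dtype=float).ravel()
    emb2 = np.asarray(emb2, dtype=float).ravel()
    denom = np.linalg.norm(emb1) * np.linalg.norm(emb2)
    if denom == 0.0:
        return 1.0
    return 1.0 - float(np.dot(emb1, emb2) / denom)
```

Depending on how noisy the embeddings are, you may get more stable matches by averaging the embeddings over the last few detections of each object instead of using only the most recent one.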

The output of the demo aggregates all the videos into a single video, showing the bounding boxes of the tracked objects in each view, with the same id and color when they correspond to the same real object. Here is an example I made with that script using footage from the EPFL dataset; I am sharing it here because I haven't yet added a gif to the README showing the expected output.

output_3.mp4
