
How do I know which recognition area my target recognition box comes from? #1102

Open
hadestyz opened this issue Nov 6, 2023 · 12 comments

Comments

hadestyz commented Nov 6, 2023

Dear author, I added preprocessing that divides the recognition area into two parts, but I don't know how to tell, later in the pipeline, which area a target recognition box comes from.

This is my preprocessing configuration (each roi-params-src line defines two ROIs as left;top;width;height tuples):
[group-0]
src-ids=0;1;2;3;4;5
custom-input-transformation-function=CustomAsyncTransformation
process-on-roi=1
roi-params-src-0=0;0;1280;720;272;0;736;720
roi-params-src-1=0;0;1280;720;272;0;736;720
roi-params-src-2=0;0;1280;720;272;0;736;720
roi-params-src-3=0;0;1280;720;272;0;736;720
roi-params-src-4=0;0;1280;720;272;0;736;720
roi-params-src-5=0;0;1280;720;272;0;736;720

This is the probe function:

def _tiler_sink_pad_buffer_probe(buffer, user_data):
    # Retrieve batch metadata from the gst_buffer
    try:
        batch_meta = pyds.gst_buffer_get_nvds_batch_meta(buffer)
        l_frame = batch_meta.frame_meta_list
    except Exception:
        return DSL_PAD_PROBE_OK

    while l_frame is not None:
        try:
            # Casting l_frame.data to pyds.NvDsFrameMeta
            frame_meta = pyds.glist_get_nvds_frame_meta(l_frame.data)
        except StopIteration:
            break
        frame_number = frame_meta.frame_num
        source_id = frame_meta.source_id
        l_obj = frame_meta.obj_meta_list
        boxes = []
        while l_obj is not None:
            try:
                # Casting l_obj.data to pyds.NvDsObjectMeta
                obj_meta = pyds.glist_get_nvds_object_meta(l_obj.data)
            except StopIteration:
                break
            # Store the detector's original coords as [x1, x2, y1, y2]
            coords = obj_meta.detector_bbox_info.org_bbox_coords
            boxes.append([
                int(coords.left),
                int(coords.left) + int(coords.width),
                int(coords.top),
                int(coords.top) + int(coords.height),
            ])
            try:
                l_obj = l_obj.next
            except StopIteration:
                break
        try:
            # Advance to the next frame in the batch
            l_frame = l_frame.next
        except StopIteration:
            break
    return DSL_PAD_PROBE_OK

Is there a way to distinguish which recognition area the box comes from?


rjhowell44 commented Nov 6, 2023

@hadestyz unfortunately you can't... see https://forums.developer.nvidia.com/t/how-can-i-find-out-which-roi-inference-result-came-from-among-the-obj-information-of-obj-meta-list-from-the-preprocess-result/253314/14

We've been working around this limitation by using Duplicate Sources to create duplicate streams so that each stream has only one ROI. It definitely adds overhead, but it allows you to distinguish which ROI the object was detected in.

Wish I had better news for you.
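For anyone reading along, a minimal sketch of the idea using DSL's Python bindings might look like the following. The component names and file path are illustrative, and the Duplicate Source constructor is assumed to take (name, original-source-name); check the DSL API reference for the exact signatures.

# Hedged sketch: duplicate a source so that each stream carries one ROI.
from dsl import *

# The original source (illustrative name and file path).
retval = dsl_source_file_new('source-0', 'input-0.mp4', True)

# A duplicate of the same stream; assumed signature:
# dsl_source_duplicate_new(name, original-source-name).
retval = dsl_source_duplicate_new('source-0-dup', 'source-0')

# Stream-ids follow the order the components are added, so
# 'source-0' becomes stream 0 and 'source-0-dup' becomes stream 1.
retval = dsl_pipeline_component_add_many('pipeline',
    ['source-0', 'source-0-dup', None])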


hadestyz commented Nov 6, 2023

I hope it can be achieved one day. Thank you.

hadestyz closed this as completed Nov 6, 2023

hadestyz commented Nov 6, 2023

@rjhowell44 Sorry, I'd like to ask again: is the Duplicate Source you mentioned a different stream on the same pipeline, or are there two pipelines? Are there any similar examples here for reference?

hadestyz reopened this Nov 6, 2023
rjhowell44 commented:

@hadestyz the Duplicate Source creates a duplicate stream for the same Pipeline. You can duplicate the same source (stream) as many times as you like. Then you can define a different ROI for each stream.

For your example above, you have 6 sources with 2 ROIs each. You could make them 12 sources/streams with 1 ROI each. Just know that when you add both the original Sources and the duplicate Sources to the Pipeline, they will be assigned stream-ids in the order you add them.
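To make that concrete, here is a hypothetical adaptation of the config above. It assumes the six duplicates are added after the originals, so they receive stream-ids 6-11, and it splits the two ROIs across two groups with one left;top;width;height tuple per stream (unverified sketch):

[group-0]
src-ids=0;1;2;3;4;5
custom-input-transformation-function=CustomAsyncTransformation
process-on-roi=1
# first ROI only, on the original streams
roi-params-src-0=0;0;1280;720
# ... same roi-params line repeated for src 1 through 5

[group-1]
src-ids=6;7;8;9;10;11
custom-input-transformation-function=CustomAsyncTransformation
process-on-roi=1
# second ROI only, on the duplicate streams
roi-params-src-6=272;0;736;720
# ... same roi-params line repeated for src 7 through 11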

Keep in mind that you are doubling the number of frames that need processing by the preprocessor, inference engines, and tracker. So a lot of extra overhead.

We typically use a Demuxer for this case and then we don't add branches or Sinks to the duplicate streams. Just let the demuxer drop the buffers... add Branches/Sinks to just the original Streams for viewing, recording, streaming, etc.

Just FYI, as others may run into this problem, I will make time to add a diagram and description of this workaround to the Preproc docs in DSL.


rjhowell44 commented Nov 6, 2023

@hadestyz Sorry, I misspoke. There is no extra overhead for the inference engines, since the number of ROIs, which equals the batch-size, remains the same: 6 sources with 2 ROIs each or 12 sources with 1 ROI each both give a batch-size of 12.

hadestyz closed this as completed Nov 7, 2023

hadestyz commented Nov 7, 2023

@rjhowell44 Looking forward to the documentation, thank you very much.


hadestyz commented Nov 8, 2023

@rjhowell44 For example, after duplicating multiple streams, each stream corresponds to a different recognition area. How do I ensure these streams stay aligned and synchronized? How should I collect the box data from the multiple streams, merge it, process it centrally, and distribute it downstream? I don't see any interfaces in DeepStream that provide merging.

rjhowell44 commented:

@hadestyz my apologies for the slow response... very busy. I will try to put together a diagram with some comments tonight.

hadestyz commented:

@rjhowell44 It seems like you’re really busy over there and haven’t had the time to update the relevant documentation. I hope everything is going smoothly for you.


rjhowell44 commented Nov 14, 2023

@hadestyz again, sorry, very busy. I've gotten as far as creating a diagram for the docs. See below.

[Diagram: the Duplicate Source tees the Original Source's stream; both streams feed the streammuxer, and each stream gets its own preprocessor ROI before inference, tracking, and the demuxer.]

Here are some important points.

  • The Duplicate Source simply tees into the Original Source's stream.
  • Both the Original Source and the Duplicate Source are linked as input to the streammuxer (pad-0, pad-1).
  • A GStreamer Tee does not copy/duplicate each buffer; it simply pushes duplicate pointers to the same buffer onto both of the streammuxer's input pads.
  • The streammuxer will batch the two streams as if they are two sources, but they are really the same buffers.
  • The preprocessor, inference engine, and tracker will process both as separate streams.
  • Now, instead of defining two preprocessor ROIs for the original source, define one for the Original Source and one for the Duplicate Source.
  • When you're processing the frame and object metadata, you can tell which ROI an object is in by which source/stream it is in (see the sketch after this list).
  • When viewing the output, only connect a branch to the Demuxer for stream-0. You can let the Demuxer drop the duplicate stream.
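Building on the second-to-last point, a small lookup in the probe could recover the ROI from the source id. This is just a sketch: it assumes the add order gives the originals stream-ids 0-5 and the duplicates 6-11, and the ROI labels are made up.

# Assumption: originals were added first (ids 0-5), duplicates after (6-11).
NUM_ORIGINALS = 6

def roi_label_for(frame_meta):
    # frame_meta.source_id is the streammuxer stream-id.
    if frame_meta.source_id < NUM_ORIGINALS:
        return 'roi-a'  # first ROI, carried by the original stream
    return 'roi-b'      # second ROI, carried by the duplicate stream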

I hope this helps. Please follow up with additional questions if you have them.

rjhowell44 reopened this Nov 14, 2023
hadestyz commented:

@rjhowell44 Thank you for your answer and for this clear chart; it makes the process easy to understand.
If I have an original source-A and its duplicate source-B, the data in the process shown in the figure above still flows frame by frame, and there seem to be no DeepStream components or functions for combining them, so can I accumulate the two frames and process them together?
I could do some processing in the probe function, merging the data whenever I have received the same number of frames from source-A and source-B. But this requires the processing speed of source-A and source-B to be similar, to avoid building up too large a backlog of data on my own stack.
So can the streammux plugin in DeepStream ensure that source-A and source-B are passed to the inference engine as evenly as possible?

rjhowell44 commented:

@hadestyz the Pipeline's built-in streammuxer batches the frames from the original and duplicate sources and adds the metadata for each frame... meaning the pointers to the buffers are batched together so they can be processed as a batch by the inference engine and tracker. If you add your custom pad probe handler (pph) to the source pad (output) of the tracker, you will iterate through the batched metadata. It is the Demuxer that un-batches the buffers into separate streams.

This code (the custom pph) that you had above is already iterating over all frames in the batch. If all sources have the same frame rate, then everything will be synchronized.

while l_frame is not None:
    try:
        # Casting l_frame.data to pyds.NvDsFrameMeta
        frame_meta = pyds.glist_get_nvds_frame_meta(l_frame.data)
    except StopIteration:
        break
    frame_number = frame_meta.frame_num
    source_id = frame_meta.source_id
    l_obj = frame_meta.obj_meta_list
    boxes = []
    while l_obj is not None:
        try:
            # Casting l_obj.data to pyds.NvDsObjectMeta
            obj_meta = pyds.glist_get_nvds_object_meta(l_obj.data)
        except StopIteration:
            break
        # Store the detector's original coords as [x1, x2, y1, y2]
        coords = obj_meta.detector_bbox_info.org_bbox_coords
        boxes.append([
            int(coords.left),
            int(coords.left) + int(coords.width),
            int(coords.top),
            int(coords.top) + int(coords.height),
        ])
        try:
            l_obj = l_obj.next
        except StopIteration:
            break
    try:
        # Advance to the next frame in the batch
        l_frame = l_frame.next
    except StopIteration:
        break
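If you then want to merge the boxes from source-A and its duplicate within one probe call (they arrive in the same batch), something like this sketch could work. It assumes the originals-first add order, so stream k and stream k + NUM_ORIGINALS carry the same frame, and boxes_by_source_id is a hypothetical dict you would fill in the loop above (source_id mapped to that frame's boxes):

from collections import defaultdict

NUM_ORIGINALS = 6  # assumption: duplicates added after the originals

def merge_boxes_by_camera(boxes_by_source_id):
    # Fold each duplicate stream's boxes back onto its original stream,
    # giving one combined box list per physical camera per batch.
    merged = defaultdict(list)
    for source_id, boxes in boxes_by_source_id.items():
        camera_id = source_id % NUM_ORIGINALS
        merged[camera_id].extend(boxes)
    return dict(merged)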

The only thing you have to worry about is overlapping ROIs; then everything gets a little trickier.
