How do I know which recognition area my target recognition box comes from? #1102
Comments
@hadestyz unfortunately you can't... see https://forums.developer.nvidia.com/t/how-can-i-find-out-which-roi-inference-result-came-from-among-the-obj-information-of-obj-meta-list-from-the-preprocess-result/253314/14 We've been working around this limitation by using Duplicate Sources to create duplicate streams so that each stream has only one ROI. It definitely adds overhead, but it allows you to distinguish which ROI the object was detected in. Wish I had better news for you.
I hope it can be achieved one day. Thank you.
@rjhowell44 Sorry, I would like to ask again: is the Duplicate Source you mentioned different streams on the same pipeline, or are there two pipelines? Are there any similar examples here for reference?
@hadestyz the Duplicate Source creates a duplicate stream for the same Pipeline. You can duplicate the same source (stream) as many times as you like, then define a different ROI for each stream. For your example above, you have 6 sources with 2 ROIs each; you could make them 12 sources/streams with 1 ROI each. Just know that when you add both the original Sources and the duplicate Sources to the Pipeline, they will be assigned stream-ids in the order you add them.

Keep in mind that you are doubling the number of frames that need processing by the preprocessor, inference engines, and tracker, so there is a lot of extra overhead. We typically use a Demuxer for this case and then don't add branches or Sinks to the duplicate streams: just let the demuxer drop those buffers, and add Branches/Sinks only to the original streams for viewing, recording, streaming, etc.

Just FYI, as others may run into this problem, I will make time to add a diagram and description of this workaround to the Preproc docs in DSL.
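The stream-id ordering described above can be captured in a simple lookup table. This is a minimal, self-contained sketch (not DSL API code); all source and ROI names here are hypothetical:

```python
# Sketch: stream-ids are assigned in the order sources are added to the
# Pipeline, so adding the 6 originals first and the 6 duplicates second
# gives a predictable stream-id -> (source, ROI) layout.
ORIGINAL_SOURCES = [f"camera-{i}" for i in range(6)]        # stream-ids 0..5
DUPLICATE_SOURCES = [f"camera-{i}-dup" for i in range(6)]   # stream-ids 6..11

STREAM_TO_ROI = {}
for stream_id, name in enumerate(ORIGINAL_SOURCES + DUPLICATE_SOURCES):
    # Originals carry the first ROI, duplicates carry the second.
    roi = "left" if stream_id < len(ORIGINAL_SOURCES) else "right"
    STREAM_TO_ROI[stream_id] = (name, roi)

def roi_for_stream(stream_id):
    """Return the (source-name, roi-label) pair for a batched frame's source_id."""
    return STREAM_TO_ROI[stream_id]
```

In a pad probe, `frame_meta.source_id` would be the key into this table.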
@hadestyz Sorry, I misspoke. There is no extra overhead for the inference engines, since the number of ROIs, which equals the batch-size, remains the same: 6 sources with 2 ROIs or 12 sources with 1 ROI both give a batch-size of 12.
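The batch-size equivalence is simple arithmetic: the inference batch is the total ROI count, which is identical in both layouts.

```python
# Batch-size = sources x ROIs-per-source, for either layout.
original_layout = {"sources": 6, "rois_per_source": 2}
duplicate_layout = {"sources": 12, "rois_per_source": 1}

def batch_size(layout):
    return layout["sources"] * layout["rois_per_source"]

# Both layouts produce the same inference batch-size of 12.
assert batch_size(original_layout) == batch_size(duplicate_layout) == 12
```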
@rjhowell44 Looking forward to the documentation, thank you very much.
@rjhowell44 For example, after duplicating multiple streams, each stream corresponds to a different recognition area. How should I ensure that these streams are aligned and synchronized? How should I obtain data from multiple streams, unify their box data, process it centrally, and distribute it downstream? I don't see any interfaces in DeepStream that provide merging.
@hadestyz my apologies for the slow response... very busy. I will try and put together a diagram with some comments tonight.
@rjhowell44 It seems like you're really busy over there and haven't had the time to update the relevant documentation. I hope everything is going smoothly for you.
@hadestyz again, sorry, very busy. I've gotten as far as creating a diagram for the docs. See below. Here are some important points.
I hope this helps. Please follow up with additional questions if you have them.
@rjhowell44 Thank you for your answer and for this clear chart. It makes me understand the process clearly.
@hadestyz the Pipeline's built-in streammuxer batches the frames from the original and duplicate sources and adds the metadata for each frame... meaning the pointers to the buffers are batched together so they can be processed as a batch by the inference engine and tracker. It is the Demuxer that un-batches the buffers into separate streams. If you add your custom pph to the source-pad (output) of the tracker, you will iterate through the batched metadata. The code (the custom pph) you posted above iterates over all frames in the batch. If all sources run at the same frame-rate, everything will be synchronized.

```python
while l_frame is not None:
    try:
        frame_meta = pyds.glist_get_nvds_frame_meta(l_frame.data)
    except StopIteration:
        break
    frame_number = frame_meta.frame_num
    source_id = frame_meta.source_id
    l_obj = frame_meta.obj_meta_list
    boxes = []
    while l_obj is not None:
        try:
            # Casting l_obj.data to pyds.NvDsObjectMeta
            obj_meta = pyds.glist_get_nvds_object_meta(l_obj.data)
        except StopIteration:
            break
        coords = obj_meta.detector_bbox_info.org_bbox_coords
        # Box stored as [x1, x2, y1, y2]
        boxes.append([int(coords.left),
                      int(coords.left) + int(coords.width),
                      int(coords.top),
                      int(coords.top) + int(coords.height)])
        try:
            l_obj = l_obj.next
        except StopIteration:
            break
    # Advance to the next frame in the batch (missing from the original snippet).
    try:
        l_frame = l_frame.next
    except StopIteration:
        break
```

The only thing you have to worry about is overlapping ROIs; then everything gets a little trickier.
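With the duplicate-stream workaround, `frame_meta.source_id` identifies the ROI directly. Here is a minimal sketch of grouping the collected boxes per ROI across one batch; the source-id-to-ROI table and the box values are hypothetical:

```python
# Hypothetical mapping: stream-ids 0-5 are the originals (ROI A),
# 6-11 are the duplicates (ROI B).
SOURCE_TO_ROI = {i: ("roi-A" if i < 6 else "roi-B") for i in range(12)}

def group_boxes_by_roi(batch):
    """batch: list of (source_id, [box, ...]) tuples collected in the probe."""
    grouped = {"roi-A": [], "roi-B": []}
    for source_id, boxes in batch:
        grouped[SOURCE_TO_ROI[source_id]].extend(boxes)
    return grouped

# Example batch from one probe invocation (boxes as [x1, x2, y1, y2]):
batch = [(0, [[100, 200, 50, 150]]), (6, [[300, 400, 60, 160]])]
grouped = group_boxes_by_roi(batch)
```

After grouping, the boxes from both ROIs of the same camera can be merged and passed downstream together.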
Dear author, I added preprocessing to divide the recognition area into two parts, but I don't know how to distinguish, in downstream processing, which area a target recognition box comes from.
This is my preprocessing configuration:
```ini
[group-0]
src-ids=0;1;2;3;4;5
custom-input-transformation-function=CustomAsyncTransformation
process-on-roi=1
roi-params-src-0=0;0;1280;720;272;0;736;720
roi-params-src-1=0;0;1280;720;272;0;736;720
roi-params-src-2=0;0;1280;720;272;0;736;720
roi-params-src-3=0;0;1280;720;272;0;736;720
roi-params-src-4=0;0;1280;720;272;0;736;720
roi-params-src-5=0;0;1280;720;272;0;736;720
```
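Assuming the left;top;width;height ordering used by Gst-nvdspreprocess ROI lists, each `roi-params-src-N` line above defines two ROIs per source: (0, 0, 1280, 720) and (272, 0, 736, 720). A small parser sketch (not part of DeepStream itself) makes the grouping explicit:

```python
def parse_roi_params(value):
    """Split a roi-params-src-N value into (left, top, width, height) tuples."""
    nums = [int(v) for v in value.split(";")]
    if len(nums) % 4 != 0:
        raise ValueError("ROI list must contain a multiple of 4 values")
    return [tuple(nums[i:i + 4]) for i in range(0, len(nums), 4)]

rois = parse_roi_params("0;0;1280;720;272;0;736;720")
# rois -> [(0, 0, 1280, 720), (272, 0, 736, 720)]
```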
This is the probe function:

```python
def _tiler_sink_pad_buffer_probe(buffer, user_data):
    # Retrieve batch metadata from the gst_buffer
    try:
        batch_meta = pyds.gst_buffer_get_nvds_batch_meta(buffer)
        l_frame = batch_meta.frame_meta_list
    except Exception:
        return DSL_PAD_PROBE_OK
```
Is there a way to distinguish which recognition area the box comes from?