stbt.match: Merge neighbouring ROIs from previous pyramid level #558
Ironically, the old sqdiff-normed @ 0.80 takes 54ms: better than the current master, but not as fast as this PR. Because the old threshold is more generous, the ROIs were more likely to be joined already.
Force-pushed from c19fbac to ea2ab2b.
I have also made the kernel 11x11 instead of 11x1 (horizontal). Because why not, it should handle more cases. Performance of the […] Note that this filter is square (all ones). If I use an elliptical filter, […]
Force-pushed from 5164d0c to 3f3fcb2.
Rebased onto master to get test fixes from #560.
Reducing the kernel size of the closing operation at the smallest pyramid level to (5,5) seems to be an improvement but still isn't unambiguously better than master:
Greatest improvement (absolute): 133ms (37%)
Force-pushed from 3f3fcb2 to f6b02ac.
There are almost as many regressions in performance as there are improvements, so I'm giving up on this idea.
Reopening -- I want to revisit this now that we've merged #569.
Force-pushed from f6b02ac to b381d13.
If the previous pyramid level (that is, running `cv2.matchTemplate` on a smaller version of the image) identified many Regions Of Interest (ROIs) that are small and close together, then at the next pyramid level we have to invoke `cv2.matchTemplate` (now on the full-sized image) multiple times. This is still a worthwhile optimisation compared to running `cv2.matchTemplate` on the entire full-sized image; but if we merge ROIs that are very close together, it's even faster.

This effect is particularly pronounced now that we've tightened the match threshold, if you're looking for an image of a word and the frame has a full paragraph of text: the word might (tentatively) match at every word in the paragraph, but not at the spaces between the words. In that scenario I measured 47ms with merging vs. 69ms without, on my laptop.

N.B. The morphological operation "Closing" means to dilate and then erode; it has the effect of closing holes or gaps. Using a square kernel (all ones) is 4 times faster than using an elliptical kernel (250µs for the `cv2.morphologyEx` call vs. 1.1ms).
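To make the effect of closing on a ROI mask concrete, here is a minimal pure-Python sketch. It is not the code in this PR, which calls `cv2.morphologyEx`; the helper names `close_mask` and `count_runs` are hypothetical, and the 3x3 kernel and one-row mask are chosen only to keep the example small. Pixels outside the mask are ignored at the borders, which I believe matches OpenCV's default border handling for morphology.

```python
def _morph(mask, k, combine):
    """Apply `combine` (any = dilate, all = erode) over each k x k
    neighbourhood of a 2D list of 0/1 values, ignoring out-of-bounds pixels."""
    h, w = len(mask), len(mask[0])
    r = k // 2
    return [[int(combine(mask[yy][xx]
                         for yy in range(max(0, y - r), min(h, y + r + 1))
                         for xx in range(max(0, x - r), min(w, x + r + 1))))
             for x in range(w)]
            for y in range(h)]


def close_mask(mask, k):
    """Morphological closing with a square k x k kernel: dilate, then erode."""
    return _morph(_morph(mask, k, any), k, all)


def count_runs(row):
    """Count contiguous runs of set pixels; each run stands for one ROI,
    i.e. one matchTemplate call at the next pyramid level."""
    return sum(1 for i, v in enumerate(row) if v and (i == 0 or not row[i - 1]))


# Hypothetical one-row mask: three tentative matches, two of them 1 pixel apart.
row = [0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0]
closed = close_mask([row], 3)[0]
print(count_runs(row), count_runs(closed))  # prints "3 2"
```

The 1-pixel gap is filled, so the two neighbouring ROIs become one, while the 4-pixel gap is wider than the kernel and survives. A larger kernel merges ROIs that are further apart, at the cost of searching more of the full-sized image.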
Force-pushed from b381d13 to f36b895.