Adds optimised algorithm for Median Filter #34

hari-sikchi · 2017-07-07T21:18:53Z

Fast Median filter based on algorithm as explained in paper: http://ieeexplore.ieee.org/document/1163188/?reload=true

Algorithm in short:
Algorithm : Huang’s O(n) median filtering algorithm.
Input: Image X of size m × n, kernel radius r
Output: Image Y of the same size as X
Initialize kernel histogram H
for i = 1 to m do
for j = 1 to n do
for k = −r to r do
Remove X[i+k,j−r−1] from H
Add X[i+k,j+r] to H
end for
Y[i,j] ← median(H)
end for
end for

Improves the time complexity from O((ImageDims) x (kernelSize x kernelSize x log(kernelSize))) to constant(k) x O((ImageDims) x (kernelSize)

Currently, the filter is running slow for small filters than the original one and I have observed for 512x512 images, the new filter runs faster for kernels greater than 7x7 size. I think the code if written in a more optimized way can help make it run faster also for smaller kernels.

I am fairly new to Julia and would be grateful for help to make it faster.

timholy · 2017-07-07T22:09:12Z

Cool! I'd be excited to have a more efficient algorithm for large kernels.

Before I do a detailed review, maybe a couple of high-level comments. First, it looks like you've changed the generic implementation to use a histogram- and mode-based approach. This is clearly useful for median-filtering, but I doubt this will be helpful for, e.g, mapwindow(std, img, window). I'm wondering if it makes sense to introduce a trait-function

uses_histogram(::typeof(median!)) = true
uses_histogram(::Any) = false

and then branch to a different implementation whenever uses_histogram(f) evaluates to true. (It will be important to make sure that out is allocated as having the same type for either algorithm.)

Second, it sure would be nice to have this work in more than 2d, but I think this implementation is quite specific to 2d, right? If so we should also limit this to just that case (or generalize it to higher dimensions).

Third, perhaps @tejus-gupta might chime in? See JuliaImages/Images.jl#639 (comment)

hari-sikchi · 2017-07-07T22:16:33Z

Having a trait function would be definitely better. I'll make those changes and update the pull request.

Yes, for now it is specific to 2D. I'll see if I can generalise it to nD.

timholy · 2017-07-07T22:20:33Z

I'll see if I can generalise it to nD.

If you haven't read this blog post, it might help. Seems that if you just use an n-1-dimensional CartesianIndex for the trailing dimensions, it might do much of it for you?

hari-sikchi · 2017-07-09T12:32:50Z

I have updated the commit with support for N-Dimensional inputs also added the uses_histogram function to maintain the generic implementation for mapwindow.
I had some doubts about the code:-

Can I use CartesianRange to traverse backward.
Allowing this algorithm to take n-Dimensional input is making it run much slower(maybe some problem in code) for 2-D input than it did before. Should algorithms be different for different dimensions?
This algorithm uses bucket sort for sorting and this makes up a large hidden constant(256). So it runs slower on smaller kernels than simple sorting and finding the median. Again, should there be different algorithms for kernels of different size?
This algorithm uses a zig zag path for traversal.eg.
------------------------------------->
<------------------------------------
--------------------------------------->
Is there any way for traversing these type of way efficiently?
Also for ndims(input)>=4 .The original mapwindow function starts printining N=ndims(Input) in a loop. Is there a reason for that?

tejus-gupta · 2017-07-09T17:11:49Z

@hari-sikchi You should also read this paper. I think it would be best to write separate functions for cases where the color values are discrete and when they are continuous.

For discrete color levels, the paper you cited works in O(n) per pixel while this paper works in O(1) per pixel. A 2D implementation is simple but there are some trade-offs involved with n-D extensions. With constant time median filtering algorithm, the memory requirement grows exponentially with number of dimensions. For both the algorithms, we can't use histograms with continuous values images and need to store the values in a BST. This will increase the time complexity to O(nlogn) and O(logn).
Any of these versions should have far better performance than the mapwindow version.

As far as I understand, you are primarily facing difficulty in iterating in zig-zag manner in higher dimensions. You may want to benchmark if there is any benefit of doing this rather than simply re-initializing the histogram for the next row.

hari-sikchi · 2017-07-09T17:33:55Z

I agree, I am converting the code to column major for now. Let's see whether reinitializing the histogram can make it faster.

tejus-gupta · 2017-07-09T17:34:05Z

@timholy Are there functions to compute union and difference for CartesianRange? I think it would be easier to keep track of pixels to add/subtract if we could simply compute union/difference of CartesianRange as we move to next pixel.

Wouldn't it be cleaner(and easier to debug) to write a separate median_filter function?

hari-sikchi · 2017-07-09T19:43:21Z

@tejus-gupta I have made changes in the code to be column major and obtained significant time reduction. Also, zig-zag traversal increases the time complexity. In the paper, for constant time median filter per pixel we don't need to have BST or union data structures. Only requirement is more memory which increases with dimension largely as:dim[2] x dim[3] x dim[4] x .... x dim[n] .

The point of inflection is around 9 x 9 filter size for 512 x 512 input, after that the new implementation runs faster.

timholy

This is looking promising! Still work to do, I'm afraid. I've tried to be pretty explicit about some design changes that will hopefully make testing robust.

I agree with @tejus-gupta that we may need separate implementations for continuous values (Float32 and Float64) vs discrete values. We may want to restrict this signature to Union{N0f8,Gray{N0f8}} eltypes, for example. (In which case my comments about "what about values outside the 0 to 1 range?" are not relevant anymore.)

timholy · 2017-07-12T15:00:05Z

src/mapwindow.jl

 function mapwindow{T,N}(f,
                        img::AbstractArray{T,N},
                        window::Indices{N},
                        border::BorderSpecAny;
                        callmode=:copy!)
-    _mapwindow(replace_function(f), img, window, border, default_shape(f); callmode=callmode)
+    if(is_medianfilter(f))


Instead of introducing is_medianfilter, a more "Julian" approach would be to change the signature of your median_filter function below to be the following method:

function mapwindow{T,N}(f::typeof(median!),

But I actually have a slightly different suggestion, see below.

timholy · 2017-07-12T15:30:36Z

src/mapwindow.jl

+                         border::BorderSpecAny,
+                         shape=default_shape(f);
+                         callmode=:copy!)
+    inds = indices(img)


Since the original algorithm is faster for smaller window sizes, you could insert something like this:

if prod(map(length, window)) < 81 return _mapwindow!(...) end

and keep the original implementation as well.

However, this makes testing a little harder: you'd like to check, for example, that the two algorithms give the same result as one another, and that means you need to have a way of calling each one separately.

For that reason, I recommend something like the following:

# Let's change the syntax of `replace_function`. First, fallbacks in case anyone has already made a local extension. replace_function(f, window) = replace_function(f) replace_function(f) = f # Now the one you care about function replace_function(::typeof(median!), window) if prod(map(length, window)) < 81 return median_direct! end median_histogram end

Then change all callers of replace_function(f) to replace_function(f, window), define

function median_direct!(v::AbstractVector) inds = indices(v,1) Base.middle(Base.select!(v, (first(inds)+last(inds))÷2, Base.Order.ForwardOrdering())) end

and define median_histogram to be your definition. Then you can call this method

function _mapwindow{T,N}(f::typeof(median_histogram),

Finally, in tests you can do this:

med1 = ImageFiltering._mapwindow(ImageFiltering.median_direct!, args...) med2 = ImageFiltering._mapwindow(ImageFiltering.median_histogram, args...) @test med1 == med2

I have created new dispatches for different methods and separated implementations for discrete and continuous inputs. Do suggest, if there can be a better dispatch method.
Presently:

Two new functions are added
a. median_direct!
b. median_histogram!

Only for img::Union{AbstractArray{N0f8,N},AbstractArray{ColorTypes.Gray{N0f8},N}} is a choice made between median_histogram! and median_direct!.

timholy · 2017-07-12T15:35:24Z

src/mapwindow.jl

+    else
+
+        window_size=size(v,1)
+        dims = map(x->x.stop-x.start+1,window)


Better to use the "official interface" of ranges rather than accessing fields directly---use first(x) and last(x) rather than x.start and x.stop. (If someone renames the fields or redefines the implementation, they can't deprecate your direct access of the fields whereas they can make sure that first and last continue to work properly. This actually happened not so long ago with LinSpace and StepRangeLen.)

However, in this particular case I think this is equivalent to map(length, window), and that's much clearer for readers of your code.

timholy · 2017-07-12T15:39:34Z

src/mapwindow.jl

+        width=window[1].stop-window[1].start+1
+        inds = indices(v,1)
+        if mode == 0
+        # Update the histogram according to new entries


indent comment too

timholy · 2017-07-12T15:43:39Z

src/mapwindow.jl

+        if mode == 0
+        # Update the histogram according to new entries
+            for i = first(inds):last(inds)
+                id= trunc(Int64,(v[i]*255))+1


What if id turns out to be < 1 or > 256? (You'll get a boundserror.) See a similar discussion here (click to expand the comments, and check the diff to see how @tejus-gupta solved the issue by calling extrema).

Would be good to add a test where the image has one value outside the 0-to-1 range.

If the median_histogram method is restricted to N0f8, that won't be a problem, right?

timholy · 2017-07-12T15:47:16Z

src/mapwindow.jl

+        # Compute the median
+            tempsum = 0
+            m_index=-1
+            for i = 1:256


A lot of this hard-codes the fact that histogram has length 256, but note that histogram is not even allocated in this function so people changing code in one place might cause bugs in a different place.

for in in linearindices(histogram) would be more general.

timholy · 2017-07-12T15:48:32Z

src/mapwindow.jl

+            for i =1:256
+                counter+=m_histogram[i]
+            end
+        # Compute the median


Seems like this should be a standalone function, since you do the same thing again below.

timholy · 2017-07-12T15:49:58Z

src/mapwindow.jl

+                    id= trunc(Int64,(v[j]*255))+1
+                    m_histogram[id]-=1
+                    if(m_histogram[id]<0)
+                        println("stop")


Is this just for debugging? How about @assert m_histogram[id] >= 0? I think julia -O3 disables assertions, so it's possible to be both safe and have it not hurt your runtime.

Yes, pushed old commit by mistake

timholy · 2017-07-12T15:51:31Z

src/mapwindow.jl

+
+        elseif mode == 1
+        # Update the histogram according to new entries
+            for i =  width:width:dims[1]*dims[2]


Still 2d, right? How about prod(dims)?

No, this isn't 2D. The iterator 'i' iterates through all the new values added in all dimensions .

for i = first(inds):width:dims[1]*dims[2] [This iterates over a 2D slice] for j= i: dims[1]*dims[2]:last(inds) [This iterates over all such 2D slices in nD]

timholy · 2017-07-12T15:54:34Z

src/mapwindow.jl

+                prev_col=curr_col
+
+                Rwin = CartesianRange(map(+, window, I.I))                                
+                copy!(buf, Rbuf, img, Rwin)


Maybe I'm missing something, but won't this alone keep your method O(n^2) in 2d? You're copying all the values even if you only "use" some of them.

Yes, that is true. Sadly, when I am trying to break Rwin in :
Rwin_add= CartesianRange(map(+, (window_array_add...), I.I)) [ Side column to add]
Rwin_subtract = CartesianRange(map(+, (window_array_subtract...), I.I)) [Side column to remove]
is leading to more running time. I can't quite figure out why that is happening. I am working on reducing the time complexity more.

hari-sikchi · 2017-07-13T13:25:00Z

I am on a flight, will fix this as soon as I reach back.

timholy · 2017-07-28T08:41:47Z

Let me know if you're still working on this or ready for a re-review.

timholy · 2017-07-28T09:04:16Z

Sorry, I hadn't noticed your responses to my review comments since they got folded. Since the file structure has been reorganized GitHub doesn't let me add on to the existing threads, so I'll just reply by quoting:

Two new functions are added...Do suggest, if there can be a better dispatch method.

Workable in principle, but currently your median_histogram! function makes too many unchecked assumptions (image values range between 0 and 1, m_histogram has 256 "slots") to be exportable on its own. If we really want to create something that works for any numeric/grayscale type, I think there probably have to be some pretty fundamental revisions. For example, rather than using discrete histograms, we could come up with some type that e.g. maintains a cyclic buffer of sorted columns and then allows you to compute the median by figuring out which column contains the true median (ideally not "median-of-medians," but a more sophisticated algorithm that will check non-median values in individual columns, if necessary, to compute the true median). I don't know of a reference for this but it seems possible (if complex) in principle.

But the alternatives are (1) to support only N0f8, or (2) check the assumptions explicitly and throw informative errors when they are violated.
Sadly, when I am trying to break Rwin ...is leading to more running time.

Perhaps check the results of @code_warntype. I bet you have a type-instability somewhere.
If the median_histogram method is restricted to N0f8, that won't be problem right?

Correct. An even better way to phrase that line might be id = reinterpret(gray(v[i])) + 1, which exploits the "raw" integer representation underlying N0f8 directly.

hari-sikchi · 2017-08-05T15:07:43Z

I have changed the mapwindow dispatch function to work for types img::Union{AbstractArray{N0f8,N},AbstractArray{ColorTypes.Gray{N0f8},N}} and removed type incompatibilities.
Still, when I am trying to make it run in O(n) complexity by breaking the copy method, It's running type is increasing significantly. Progress is slow as I am dealing with intern examinations right now. I have attached the @code_warntype result after breaking the copy function.

The code for complete O(n) complexity[after breaking the copy function] looks somewhat like this : (https://gist.github.com/hari-sikchi/b965eec72d45c43392512b2b9d970b82)

Add Fast Median Filter

b05688d

hari-sikchi added 3 commits July 8, 2017 13:18

Change median_filter to a seperate function

f0351ac

Add nD support for median filter

376cfc4

Improved time for nD fast median filter

d73c56e

Changed traversal to column major

3bbafaf

hari-sikchi added 3 commits July 10, 2017 12:23

Clear Definitions

e1735ff

changed median statistic

5ecf8ef

changed median statistic

99da897

timholy reviewed Jul 12, 2017

View reviewed changes

hari-sikchi added 2 commits July 24, 2017 02:10

Update program structure

56c4457

Separate dispatch for continuos and discrete input

6c9bf74

removed type incompatibility

7e80f26

timholy mentioned this pull request Mar 4, 2020

Add imfilter benchmarks #158

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adds optimised algorithm for Median Filter #34

Adds optimised algorithm for Median Filter #34

hari-sikchi commented Jul 7, 2017 •

edited

timholy commented Jul 7, 2017

hari-sikchi commented Jul 7, 2017

timholy commented Jul 7, 2017

hari-sikchi commented Jul 9, 2017 •

edited

tejus-gupta commented Jul 9, 2017 •

edited

hari-sikchi commented Jul 9, 2017

tejus-gupta commented Jul 9, 2017 •

edited

hari-sikchi commented Jul 9, 2017 •

edited

timholy left a comment

timholy Jul 12, 2017

timholy Jul 12, 2017

hari-sikchi Jul 27, 2017

timholy Jul 12, 2017

timholy Jul 12, 2017

timholy Jul 12, 2017

hari-sikchi Jul 27, 2017 •

edited

timholy Jul 12, 2017

timholy Jul 12, 2017

timholy Jul 12, 2017

hari-sikchi Jul 27, 2017

timholy Jul 12, 2017

hari-sikchi Jul 27, 2017 •

edited

timholy Jul 12, 2017

hari-sikchi Jul 27, 2017

hari-sikchi commented Jul 13, 2017

timholy commented Jul 28, 2017

timholy commented Jul 28, 2017 •

edited

hari-sikchi commented Aug 5, 2017 •

edited

Adds optimised algorithm for Median Filter #34

Are you sure you want to change the base?

Adds optimised algorithm for Median Filter #34

Conversation

hari-sikchi commented Jul 7, 2017 • edited

timholy commented Jul 7, 2017

hari-sikchi commented Jul 7, 2017

timholy commented Jul 7, 2017

hari-sikchi commented Jul 9, 2017 • edited

tejus-gupta commented Jul 9, 2017 • edited

hari-sikchi commented Jul 9, 2017

tejus-gupta commented Jul 9, 2017 • edited

hari-sikchi commented Jul 9, 2017 • edited

timholy left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hari-sikchi Jul 27, 2017 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hari-sikchi Jul 27, 2017 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hari-sikchi commented Jul 13, 2017

timholy commented Jul 28, 2017

timholy commented Jul 28, 2017 • edited

hari-sikchi commented Aug 5, 2017 • edited

hari-sikchi commented Jul 7, 2017 •

edited

hari-sikchi commented Jul 9, 2017 •

edited

tejus-gupta commented Jul 9, 2017 •

edited

tejus-gupta commented Jul 9, 2017 •

edited

hari-sikchi commented Jul 9, 2017 •

edited

hari-sikchi Jul 27, 2017 •

edited

hari-sikchi Jul 27, 2017 •

edited

timholy commented Jul 28, 2017 •

edited

hari-sikchi commented Aug 5, 2017 •

edited