Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rework gabor filter and add log gabor filter #235

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

johnnychen94
Copy link
Member

@johnnychen94 johnnychen94 commented Nov 1, 2021

The previous Kernel.gabor function has three drawbacks:

  • it eagerly allocates memory, and thus makes optimization quite difficult
  • it returns Tuple{Matrix, Matrix} as real and imaginary part; we should just return Matrix{Complex{Float64}} instead
  • it doesn't have good documentation or examples... It takes quite a while for me to understand the usage of it from other resources...

benchmark on gabor filter:

using ImageFiltering

# new lazy Gabor array
kern = Kernel.Gabor((256, 256), 30, 0)
@btime collect($kern); # 1.157 ms (2 allocations: 1.00 MiB)

# old gabor function
σ, θ, λ, γ, ψ = 10, 0, 30, 0.5, 0
@btime Kernel.gabor(256,256,σ,θ,λ,γ,ψ); # 2.911 ms (51 allocations: 12.60 MiB)

Docs Preview: https://juliaimages.org/ImageFiltering.jl/previews/PR235/demos

TODO:

  • add tests
  • revisit the implementation details
  • add doc for log gabor
  • add function reference entries to documentation

The implementation of the log gabor filter heavily follows https://www.peterkovesi.com/matlabfns/PhaseCongruency/Docs/convexpl.html so I'm wondering if @peterkovesi could help review this.

@codecov
Copy link

codecov bot commented Nov 1, 2021

Codecov Report

Merging #235 (e0f182a) into master (424523c) will increase coverage by 0.30%.
The diff coverage is 100.00%.

❗ Current head e0f182a differs from pull request most recent head b1981ea. Consider uploading reports for the commit b1981ea to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #235      +/-   ##
==========================================
+ Coverage   92.14%   92.45%   +0.30%     
==========================================
  Files          12       14       +2     
  Lines        1642     1709      +67     
==========================================
+ Hits         1513     1580      +67     
  Misses        129      129              
Impacted Files Coverage Δ
src/kernel.jl 99.02% <ø> (-0.21%) ⬇️
src/gaborkernels.jl 100.00% <100.00%> (ø)
src/kernel_deprecated.jl 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 424523c...b1981ea. Read the comment docs.

This commit introduces an lazy array interface for the Gabor filter,
benchmark shows that it's 2-3x faster than the previous version.

The documentation is heavly rewrote to explain this filter and all
its parameters.
@johnnychen94
Copy link
Member Author

johnnychen94 commented Nov 2, 2021

I'm marking this as ready as I don't plan to add any further major changes to this PR. I'm going to add phase congruency filter next, and then finally the FSIM quality index JuliaImages/ImageQualityIndexes.jl#25. I'll keep this PR pending until everything downstream is ready.

ImagePhaseCongruency.jl is really a good reference but it has a strong MATLAB legacy, I believe we can get things faster and give it better composability by properly resolving and designing the function.

@johnnychen94
Copy link
Member Author

johnnychen94 commented Nov 3, 2021

Benchmark shows that F32 on LogGabor is somewhat slow. I'll try to investigate it before merging this.

    BenchmarkGroup:
        2-element BenchmarkTools.BenchmarkGroup:
          tags: []
          "Gabor" => 6-element BenchmarkTools.BenchmarkGroup:
                  tags: []
                  "Gabor_ComplexF32_19×19" => Trial(5.625 μs)
                  "Gabor_ComplexF32_256×256" => Trial(994.030 μs)
                  "Gabor_ComplexF32_512×512" => Trial(3.857 ms)
                  "Gabor_ComplexF64_19×19" => Trial(7.870 μs)
                  "Gabor_ComplexF64_256×256" => Trial(1.440 ms)
                  "Gabor_ComplexF64_512×512" => Trial(5.553 ms)
          "LogGabor" => 6-element BenchmarkTools.BenchmarkGroup:
                  tags: []
                  "LogGabor_ComplexF64_19×19" => Trial(13.903 μs)
                  "LogGabor_ComplexF64_256×256" => Trial(2.422 ms)
                  "LogGabor_ComplexF64_512×512" => Trial(9.884 ms)
                  "LogGabor_ComplexF32_19×19" => Trial(68.269 μs)
                  "LogGabor_ComplexF32_256×256" => Trial(11.896 ms)
                  "LogGabor_ComplexF32_512×512" => Trial(51.113 ms)

Writing benchmark isn't something pleasant, but when you get the report, oh that's a surprise!

@peterkovesi
Copy link

I am very pleased and impressed that this work is being done. Please help yourself to any code you think is useful in ImagePhaseCongruency I am sure much of it can be considerably improved and made more interoperable with the rest of the Images ecosystem.

Another function from ImagePhaseCongruency that I think is deserving of being included in Images is perfft2(). It is very useful when one wants to perform filtering in the frequency domain with minimal artifacts introduced from the boundary discontinuities. See https://peterkovesi.github.io/ImagePhaseCongruency.jl/dev/examples/#Fourier-transform-of-Moisan's-periodic-image-component-1

Nate that I am now retired and find myself not doing very much coding these days. Indeed at the moment I am often off the grid for extended periods of time. Thus I am not in a position to provide code reviews but should you have any questions about my code I will be more than happy to try to answer them. In the meantime I will be cheering you on from the sidelines!

Comment on lines +220 to +226
if kern.normalize
# normalize: from reference [1] of LogGabor
# By changing division to multiplication gives about 5-10% performance boost
x, y = (x, y) .* kern.freq_scale
end
ω = sqrt(x^2 + y^2) # this is faster than hypot(x, y)
θ = atan(y, x)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is often the case that we can pre-compute the frequency ω and orientation θ and then call kern[ω, θ]. In this case, we might want to introduce a sphere index type to cache the computation and make the getindex dispatch on it.

for p in SphereIndices(img)
    # by doing this, we only need to compute the sphere index once
    kern1[p] * kern2[p]
end

Because some kernels are defined orientation-invariant, they wouldn't need the θ information at all. Thus each component of SphereIndex must be computed lazily so that we don't compute θ if we don't need it.

cc @Tokazama for a possible use case for JuliaArrays/SpatioTemporalTraits.jl#11. This is kind of the opposite of the CompositeSystem you proposed.

Copy link

@Tokazama Tokazama Nov 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've long wanted to be able to have a method for explicitly extracting spatial indices and this is a great practical example that motivates an implementation. We need some way of having a spatial system in the midst of color, time, etc. which users may end up creating a view of the array that splits up or permuted relevant dimensions. The CompositeSystem proposal was an attempt to accommodate that, but if we did go in that direction we would still need to be able to extract a succinct representation of just the spatial system (e.g., SphericalSystem).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do already have coords_spatial and ImageAxes.filter_space_axes. I'm not a fan of color axes because having one destroys some of our most important abstractions (one array element = one pixel), but I recognize we may want to support such constructs slightly better than we do now.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do already have coords_spatial and ImageAxes.filter_space_axes.

I thought those methods were only reliable if the type encoding the coordinate system is kept at the top level and it was a cartesian system, like we usually assume with AxisArray. Could we rely on that approach if the coordinate system was encoded in nested arrays or the spatial indices couldn't just be permuted or sliced (e.g., one dimension was for the radius and the other was for the azimuth)?

I'm not a fan of color axes because having one destroys some of our most important abstractions

I agree, but it would be good to play nice with the machine learning community on that one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can fix the nesting in the same way you've been doing it in ArrayInterface. Don't need a new trait that does the same thing, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But yeah, we should provide better support for ML. Not by changing our algorithms, but by making it easier to represent their raw data in our framework. Making algorithms work for arrays that have a color channel is messy and error-prone, because there is no way to know how to interpret that channel unless it's annotated unambiguously. In which case, why not just change the representation to get rid of the color channel?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we wanted to facilitate what was proposed we could do something like this

SpatialIndices(x) = SpatialIndices(CoordinateSystem(x), x)
SpatialIndices(cs::SphericalSystem, x) = SphereIndices{SphereIndex}(cs, indices_spatial(x))

We still use existing traits (indices_spatial), but we need more information to know which indices correspond to the radius, azimuth, and zenith so we can compose each SphereIndex.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because some of the computation can be non-trivial, can this design handle the task "I only want to iterate over the first dimension of the grid so if possible, don't do the extra computation for the second dimension"?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding everything correctly here, you have two unique sets of coordinate systems with corresponding spatial indices/coordinates.
I would think that's sufficient to have a function like this:

function map_spatial_indices(dest, dest_cs::CartesianSystem, src, src_cs::SphericalSystem)
    ...
end

That function certianly isn't pretty and it would be nice if we could make it work through getindex.

Currently we don't have very native CUDA support for these two kernels,
so the test set mainly serves as "okay, we can convert the lazy kernel
array to CuArray by following the steps in tests and then use GPU to do
multiplication and fft"
@johnnychen94
Copy link
Member Author

@peterkovesi Thanks for the information! I've made an initial implementation of the phasecong3 and it turns out that I only get 2x slower 😂 because of the repeated computation as I noted in #235 (comment). The idea of separating the log Gabor frequency filter and the log Gaussian angular filter here to decouple the orientation and scale loops is really really insightful.

Profiling my initial implementation code shows that almost 80% of the computation is spent on the log Gabor filter, which is definitely a performance hotspot.

That said, this PR is far from ready or satisfying.

Copy link
Member

@timholy timholy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very nice! One overall comment (multidimensional police reporting for duty, sir 😄): how about parametrizing this as in https://stackoverflow.com/a/25633868/1939814? You can of course rotate the coordinates before applying that formula, so it might make sense to also allow a rotation transformation.

The value `λ=2` should not be used in combination with phase offset `ψ=-π/2` or `ψ=π/2`
because in these cases the Gabor function is sampled in its zero crossings.

In order to prevent the occurence of undesired effects at the image borders, the wavelength
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In order to prevent the occurence of undesired effects at the image borders, the wavelength
In order to prevent the occurrence of undesired effects at the image borders, the wavelength

because in these cases the Gabor function is sampled in its zero crossings.

In order to prevent the occurence of undesired effects at the image borders, the wavelength
value should be smaller than one fifth of the input image size.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does this specific number come from? Is this a categorical difference? I.e., do you get "undesired effects" for 4.9 but none for 5.1?

## `kernel_size::Dims{2}` or `kernel_axes::NTuple{2,<:AbstractUnitRange}`

Specifies either the size or axes of the output kernel. The axes at each dimension will be
`-r:r` if the size is odd.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's r? And what happens in the even case?

## `orientation::Real`(θ∈[0, 2pi]):

This parameter specifies the orientation of the normal to the parallel stripes of a Gabor
function [3].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add "0 would indicate vertical (first-axis) stripes." (If that's correct.)

Comment on lines +220 to +226
if kern.normalize
# normalize: from reference [1] of LogGabor
# By changing division to multiplication gives about 5-10% performance boost
x, y = (x, y) .* kern.freq_scale
end
ω = sqrt(x^2 + y^2) # this is faster than hypot(x, y)
θ = atan(y, x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do already have coords_spatial and ImageAxes.filter_space_axes. I'm not a fan of color axes because having one destroys some of our most important abstractions (one array element = one pixel), but I recognize we may want to support such constructs slightly better than we do now.


@inline Base.size(kern::Gabor) = map(length, kern.ax)
@inline Base.axes(kern::Gabor) = kern.ax
@inline function Base.getindex(kern::Gabor, x::Int, y::Int)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we index in y, x format?

xr = x*c + y*s
yr = -x*s + y*c
yr = γ * yr
return exp(-(xr^2 + yr^2)/(2σ^2)) * cis(xr*λ_scaled + ψ)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we index this with out-of-bounds x and y? If lack of BoundsError is deliberate, it should be documented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants