Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add horizontal grain profiles #748

Open
SylviaWhittle opened this issue Dec 5, 2023 · 11 comments
Open

Add horizontal grain profiles #748

SylviaWhittle opened this issue Dec 5, 2023 · 11 comments
Labels
enhancement New feature or request

Comments

@SylviaWhittle
Copy link
Collaborator

SylviaWhittle commented Dec 5, 2023

Is your feature request related to a problem? Please describe.
It has been requested that topostats have a feature similar to Gwyddion where the horizontal (in the image) height profile of a grain would be taken. In other words, a horizontal line is placed over the molecule, and the pixels that form the line have their heights measured to form a profile for that molecule.

Describe the solution you'd like
During grainstats, a horizontal, one pixel thick line of pixels should be formed, with the length sufficient to cover the entire molecule from left to right, and the line intersecting the centroid of the molecule. The heights for those pixels should be obtained and saved as a line plot. Probably only saved with image_set all.

While it does seem arbitrary to just take one line through the centre, and we will miss a lot of information, this is is not an issue, as this line profile is just to get an idea of the height profile of the molecule.

This is a heavily requested feature and should be quite high priority.
It should be relatively trivial to implement. (I think?)

@SylviaWhittle SylviaWhittle added the enhancement New feature or request label Dec 5, 2023
@slackline
Copy link
Contributor

slackline commented Dec 5, 2023

Ultimately the profile of any horizontal line is available as the Numpy array for the given grain so its "just" a case of finding the centroid of a grain (and skimage has functions for doing this) and extracting that row and the plotting.

Some random questions off the top of my head (not necessarily directed at you @SylviaWhittle ) as whilst its relatively straight-forward to implement a solution understanding the context and purpose of a feature can heavily influence the solution that is implemented (e.g. if other profiles based on the widest or thinnest part of a molecule are required then slices need taking from orientations other than horizontal).

At what point in the processing should we extract this information, is it from the untraced or traced grain? I suspect the former but would like to check

What should be done for linear molecules?

In the future is any other profile likely to be required other than that which passes through the centroid? If so will these always be horizontal or are other angles/paths required such as the profile through the widest part of a circular molecule1?

Footnotes

  1. I would be surprised if the orientation of a molecule is always as desired in the horizontal plane and is essentially random.

@alicepyne
Copy link
Member

This would be done before tracing, as an optional add on and would be used on "blob" grains (which are not usually traced) to give a profile spanning the grain and the background from outside to give a trace of the blob height. Useful add ons would be to make this average over a few lines either side or not, depending on the users preference. Horizontally is used to avoid masking errors.

@ns-rse
Copy link
Collaborator

ns-rse commented Dec 5, 2023

Currently the classification of grains into circle or linear is done during tracing and there is no classification into blob.

A simple solution would be to just plot the required profile regardless of type, even if its meaningless, and that can be done, I was just curious at which stage the cross-section should be taken during the processing pipeline, for example after applying a Gaussian blur the values may be changed.

What is the intended use of the data? I ask as I'm curious about why only the horizontal axis is required.

These are the same grain with the same heights, but orientated differently.

0 0 0 0 0 
0 1 1 1 0
0 1 2 1 0
0 1 2 1 0
0 1 2 1 0
0 1 2 1 0
0 1 2 1 0
0 1 2 1 0
0 1 1 1 0
0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 1 1 1 1 1 1 1 1 0
0 1 2 2 2 2 2 2 1 0
0 1 1 1 1 1 1 1 1 0
0 0 0 0 0 0 0 0 0 0 

...and you'd get very different profiles from the horizontal only.

  • What information is sought/trying to be extracted?
  • Is there some sort of step prior to or during scanning that ensures grains are orientated in a specific way?

minicircle.spm looks fairly random to me but I'm very wary of generalising based on a single sample.

@alicepyne
Copy link
Member

alicepyne commented Dec 5, 2023 via email

@ns-rse
Copy link
Collaborator

ns-rse commented Dec 5, 2023

Wouldn't you have heterogeneous measurements across a single molecule depending on the orientation though?

Would it perhaps be useful to use a slice along the line parallel to the minimum feret diameter which passes through the centroid as that would then pick up the height profile along (roughly1) the longest slice?

I'm probably over thinking this though so please do say but I'm trying to anticipate other scenarios and think about general solutions at the outset.

Footnotes

  1. I say roughly because there is no guarantee the extremes will always be through the centroid

@alicepyne
Copy link
Member

alicepyne commented Dec 5, 2023 via email

@ns-rse
Copy link
Collaborator

ns-rse commented Dec 5, 2023

Further notes after todays meeting with @SylviaWhittle , @llwiggins and @MaxGamill-Sheffield

  • Two profiles passing through the centroid and aligning with the orientation of the minimum and maximum feret sound like the way to go.
  • Returning the raw data as well as making a plot would be sensible as it is then available for further analysis by end users.

This feature request should take priority over writing tests for the maxgamill-sheffield/cats branch which underpins the forthcoming paper which is due to be finalised 2024-01.

@ns-rse
Copy link
Collaborator

ns-rse commented Dec 7, 2023

I've started work on this and have found something that is potentially useful, its written up in a separate issue, see #750.

@ns-rse
Copy link
Collaborator

ns-rse commented Dec 7, 2023

Writing my thoughts down here before coding...

Algorithm

  1. Calculate feret distances
  2. Extract co-ordinates of minimum and maximum feret distances
  3. Rotate image so that horizontal aligns with minimum feret distance/coordinates and extract profile
  4. Rotate image so that horizontal aligns with maximum feret distance/coordinates and extract profile

How to do this

Scikit-image has methods/code for calculating feret distance (see skimage.measure._regionprops.py which we can extract out to a function...

from typing import Tuple
import numpy as np
import numpy.typing as npt

def calculate_feret(image: npt.NDArray) -> Tuple[npt.NDArray, npt.NDArray]:
    # Import necessary functions from skimage
    from skimage.morphology.convex_hull import convex_hull_image
    from skimage.measure import find_contours, marching_cubes

    image_convex = convex_hull_image(image)
    spacing =
    identity_convex_hull = np.pad(image_convex, 2, mode="constant", constant_values=0)
    if self._ndim == 2:
        coordinates = np.vstack(find_contours(identity_convex_hull, 0.5, fully_connected="high"))
    elif self._ndim == 3:
        coordinates, _, _, _ = marching_cubes(identity_convex_hull, level=0.5)
    distances = pdist(coordinates * self._spacing, "sqeuclidean")
    return distances, coordinates

We could then easily extract the co-ordinates for the min(distances) and max(distances) by
calculating...

min_feret_coordinates = coordinates[np.argwhere(np.min(distances) == distances],]
max_feret_coordinates = coordinates[np.argwhere(np.min(distances) == distances],]

These can be applied to the regionprops of a grain as extra_parameters in case we want any other statistics on the region or individually but would serve as useful functions to address #750 (which involves a bit more work).

Questions (mostly to myself)

  • Is rotation actually necessary?
  • Should the profile extend to the very edges of the cropped grain?

@ns-rse
Copy link
Collaborator

ns-rse commented Dec 7, 2023

Not so straight-forward as the following shows, the scikit-image code calculates pairwise distances (courtesy of scipy.spatial.distance.pdist() for all points and there will always be some zeros.

from skimage import draw
from skimage.morphology.convex_hull import convex_hull_image
from skimage.measure import find_contours, marching_cubes
from skimage.measure._regionprops_utils import _normalize_spacing
from scipy.spatial.distance import pdist, squareform

holo_circle = np.zeros((13, 13), dtype=np.uint8)
rr, cc = draw.circle_perimeter(6, 6, 5)
holo_circle[rr, cc] = 1
holo_circle

holo_ellipse_vertical = np.zeros((16, 10), dtype=np.uint8)
rr, cc = draw.ellipse_perimeter(8, 5, 3, 6, orientation=np.deg2rad(90))
holo_ellipse_vertical[rr, cc] = 1
holo_ellipse_vertical

holo_ellipse_horizontal = np.zeros((10, 16), dtype=np.uint8)
rr, cc = draw.ellipse_perimeter(5, 8, 6, 3, orientation=np.deg2rad(90))
holo_ellipse_horizontal[rr, cc] = 1
holo_ellipse_horizontal

holo_ellipse_angled = np.zeros((12, 14), dtype=np.uint8)
rr, cc = draw.ellipse_perimeter(6, 7, 3, 5, orientation=np.deg2rad(30))
holo_ellipse_angled[rr, cc] = 1
holo_ellipse_angled

curved_line = np.zeros((10, 10), dtype=np.uint8)
rr, cc = draw.bezier_curve(1, 5, 5, -2, 8, 8, 2)
curved_line[rr, cc] = 1
curved_line

def calculate_feret(image: npt.NDArray, spacing = None) -> Tuple[int, int]:
    """Calculate the feret distances.

    NB - This method is taken from that implemented in 'skimage.measure.regionprops()' which calculates the maximum
    feret distance.

    Parameters
    ----------
    image : npt.NDArray
        Labelled image.

    Returns
    -------
    Tuple
        Tuple of the distances and the corresponding co-ordinates.
    """
    image_convex = convex_hull_image(image)
    if spacing is None:
        spacing = np.full(image.ndim, 1.0)
    _spacing = _normalize_spacing(spacing, image.ndim)
    identity_convex_hull = np.pad(image_convex, 2, mode="constant", constant_values=0)
    if image.ndim == 2:
        coordinates = np.vstack(find_contours(identity_convex_hull, 0.5, fully_connected="high"))
    elif image.ndim == 3:
        coordinates, _, _, _ = marching_cubes(identity_convex_hull, level=0.5)
    print(f"{coordinates}")
    distances = pdist(coordinates * spacing, "sqeuclidean")
    return distances, coordinates

holo_circle_distances, holo_circle_coordinates = calculate_feret(holo_circle)

Fortunately I found in this issue reference to calculating the minimum feret distance for 2-D objects written up in a gist by @VolkerH so will give that a whirl.

ns-rse added a commit that referenced this issue Dec 11, 2023
New submodule `measures.feret` with functions for calculation of min and max feret and the associated co-ordinates.

Utilises code from Skan developer see [gist](https://gist.github.com/VolkerH/0d07d05d5cb189b56362e8ee41882abf) and as
[suggested](scikit-image/scikit-image#4817 (comment)) it adds tests. In doing so
I found sorting points prior to calculation of upper and lower convex hulls missed some points.

I'm also not sure about the `rotating_calipers()` function as it doesn't seem to return all pairs formed across the
points from the upper and lower convex hulls and so have included but not used the `all_pairs()` function which does,
although if used in place this doesn't return the minimum feret correctly, rather just the smallest distance.

Currently some of the tests for the `curved_line` fail too.

The intention is to use this without instantiating the `GrainStats` class to be used in profiles (#748) and could serve
as a concise stand-alone set of functions outside of `GrainStats` class which currently has static methods for the
calculations (#750).

Benchmarking

In light of #750 and re-implementing feret calculations as a new sub-module I wanted to see how this compares in terms
of performance to those in `GrainStats()` class so have run some very basic benchmarking. Note this is not the full
pipeline of taking labelled images and finding the outlines/edge points which are required as inputs to the
calculations, its purely on the calculation of min-max feret from the edge points.

```
import timeit

import numpy as np
from skimage import draw

from topostats.measure import feret
from topostats.grainstats import GrainStats

holo_circle = np.zeros((14, 14), dtype=np.uint8)
rr, cc = draw.circle_perimeter(6, 6, 5)
holo_circle[rr, cc] = 1

holo_circle_edge_points = np.argwhere(holo_circle == 1)

%timeit feret.min_max_feret(holo_circle_edge_points)

83 µs ± 686 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit GrainStats.get_max_min_ferets(holo_circle_edge_points)

1.06 ms ± 10.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
```

So this new implementation is faster.
ns-rse added a commit that referenced this issue Dec 11, 2023
New submodule `measures.feret` with functions for calculation of min and max feret and the associated co-ordinates.

Utilises code from Skan developer see [gist](https://gist.github.com/VolkerH/0d07d05d5cb189b56362e8ee41882abf) and as
[suggested](scikit-image/scikit-image#4817 (comment)) it adds tests. In doing so
I found sorting points prior to calculation of upper and lower convex hulls missed some points.

I'm also not sure about the `rotating_calipers()` function as it doesn't seem to return all pairs formed across the
points from the upper and lower convex hulls and so have included but not used the `all_pairs()` function which does,
although if used in place this doesn't return the minimum feret correctly, rather just the smallest distance.

Currently some of the tests for the `curved_line` fail too.

The intention is to use this without instantiating the `GrainStats` class to be used in profiles (#748) and could serve
as a concise stand-alone set of functions outside of `GrainStats` class which currently has static methods for the
calculations (#750).

Benchmarking

In light of #750 and re-implementing feret calculations as a new sub-module I wanted to see how this compares in terms
of performance to those in `GrainStats()` class so have run some very basic benchmarking. Note this is not the full
pipeline of taking labelled images and finding the outlines/edge points which are required as inputs to the
calculations, its purely on the calculation of min-max feret from the edge points.

```
import timeit

import numpy as np
from skimage import draw

from topostats.measure import feret
from topostats.grainstats import GrainStats

holo_circle = np.zeros((14, 14), dtype=np.uint8)
rr, cc = draw.circle_perimeter(6, 6, 5)
holo_circle[rr, cc] = 1

holo_circle_edge_points = np.argwhere(holo_circle == 1)

%timeit feret.min_max_feret(holo_circle_edge_points)

83 µs ± 686 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit GrainStats.get_max_min_ferets(holo_circle_edge_points)

1.06 ms ± 10.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
```

So this new implementation is faster.
@ns-rse
Copy link
Collaborator

ns-rse commented Feb 13, 2024

See comments in #755 (in particular here).

The algorithm proposed in the Mathworks article works for everything except triangles and it took me a while (most of yesterday) to work through why, carefully comparing the method @SylviaWhittle implemented in GrainStats.get_min_max_feret() and the vectorised approach I'm working on in measure.feret sub-module.

Step get_min_max_feret() get_min_max_feret() feret.min_feret_triangles() feret.min_feret_triangles() Match?
Upper Hull Lower Hull Upper Hull Lower Hull
[[1, 1], [[1, 1] [[1, 1], [[1, 1], Yes
[1, 2], [2, 1]] [1, 2], [2, 1]]
[2, 1]] [2, 1]]
Rotating [[2, 1], [1, 1]] [[2, 1], [1, 1]] Not in order but the same set of pairs.
Caliper [[2, 1], [1, 2]] [[2, 1], [1, 2]]
Pairs [[1, 1], [1, 2]] [[1, 1], [1, 2]]
Itter 1 Same points, different hulls
Base 1st [1, 1] [1, 1]
2nd [1, 2] [1, 2]
Itter 2 Same points, different hulls
Base 1st [2, 1] [2, 1]
2nd [1, 1] [1, 1]
Itter 3 NO!!! This is the problem.
Base 1st [1, 2] [1, 1]
2nd [2, 1] [2, 1]

The clause is because of the condition...

next_index = 1 if index == n_pairs - 1 else index + 1

...which on the final iteration means the upper hull is using the last element ([1, 1]) and the first element ([2, 1]) which are the same points on the lower hull which have already been seen as the base. It also means that the base formed by [1, 2] and [2, 1] has not been observed which is why I get the minimum height returned as 1.0 rather than 0.7071067811.

I can't work out a way of adapting this algorithm so have gone with @SylviaWhittle solution of calculating the minimum feret based on the triangle formed by a rotating caliper pair (of which one point is the apex) and the next adjacent point on one of the hulls (albeit with some tweaks to the code layout).

Minimum Feret Co-Ordinates

After some revision the stated desire is to have...

Two profiles passing through the centroid and aligning with the orientation of the minimum and maximum feret sound like the way to go.

This is probably most easily obtained by determining the mid-point between the maximum feret co-ordinates and taking a profile that is prependicular to the maximum feret at this point.

But I can envisage two issues based on the limited sample of images I've seen.

  1. The minimum feret might be some distance from the centroid, if there is a "twist" in a loop forming what looks like a figure of eight the profile from taking the profile perpendicular to the mid-point of the maximum feret, whilst consistent across molecules may not be representative/informative of the minimum feret.
  2. The minimum feret itself isn't based on the caliper points, but a pair of calipers and an adjacent convex hull point. As such the caliper doesn't represent the co-ordinates of the minimum feret, the apex and the mid-point between adjacent hull points does. This can be calculated exactly but will on most occassions not align to co-ordinates in the numpy array and so an approximation would have to be used to get those co-ordinates.

For 1) I think perhaps returning both the suggested smaller profiles might be helpful, and a clear description of what these represents should be included in the documentations data dictionary that describes the grain statistics that are returned.

For 2) I propose taking the nearest co-ordinate to the mid-point of the base of the triangle formed in calculating the minimum feret distance and returning the profile between this and the apex of the triangle.

A very crude schematic to maybe aid with visualisation (would take too long for me to plot this out).


--- = parallel lines forming minimum feret
~~~ = skeleton (jelly bean like shape)
|   = Minimum feret distance
*   = Antipodal Calipers
@   = Third point on convex hull forming triangle 
+   = Point to be estimated

-------------------------------
             ~*~
         ~~~~ | ~~~~
      ~~~    /|\     ~~~
   ~~       / | \       ~~ 
  ~        /  |  \        ~
  ~~      / ~~|~~ \      ~~ 
    ~~~~~@~~  +  ~~*~~~~~
-------------------------------

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants