[feature] : Better Tracing & Skeletonisation Merger #800

MaxGamill-Sheffield · 2024-02-12T11:36:13Z

Is your feature request related to a problem?

Users currently require the better skeletonisation, pruning and tracing found within the maxgamill-sheffield/cats branch along with the height-traces Sylvia has made together into one branch.

Describe the solution you would like.

The coding group has decided the best approach is to cherry pick the relevant code (a.k.a no topology stuff yet, or NodeStats), and push these into main, then when we have the time we can write tests. This will hold off any releases to PyPi in the meantime as the Software will be untested.

The tasks are as below:

Current ToDo list:

Outputs

Get advice from experimentalists - How do you want the data saving in the csv if any? Due to multiple molecules per grain.
Get Sylvia to check and comment on the hdf5 structure.
Return crops and overlay traces.

Skeletons

Look into skeleton bias not seeming to work.

Tracing

Look into why fitted traces don't lie on the backbones (ordered traces do).
Get the contingency using ordinary tracing working - need to handle extra dict fields that aren't generated this way.

Config

Possibly move the save_fig config dpi param to the plotting dict as a high setting takes a while but are needed for skeletal plots. Or change so "null" takes the plotting config defaults, not "Figure".
Work out something for the mask_cmap to produce > 1 colour for crossing plots.

Code Tidy

Add docstrings.
Maybe tidy up the behemoth analyse_nodes function.

`pre-commit` hooks

The following checks have been disabled whilst @ns-rse started working on adding/developing tests (under #818) and will need re-enabling and the problems/errors they highlight addressing (doing so off the bat would be a massive amount of work and @ns-rse is focusing on writing unit-tests for the changes on this branch).

Specific hooks can be run on specific files with the following.

pre-commit run <hook-name> --file <path-to-file>

Once they pass the relevant checks the task can be checked off.

`numpydoc-validation`

Disabled for the following files by adding patterns to pyproject.toml configuration file. Lines that exclude are noted to assist with their removal. Once removed run the pre-commit run numpydoc-validation --file <path-to-file> locally to check everything.

topostats/tracing/dnatracing.py (line 275)
topostats/tracing/nodestats.py (line 276)
topostats/tracing/tracinfuncs.py (line 277) - may be redundant as code may have moved to other modules.
topostats/theme.py (line 279)

`codespell`

The following files have been excluded from this check for now, the relevant line to remove them from in pyproject.toml is 258...

[tool.codespell]
skip = '*.spm*,*.mplstyle,*.svg,nodestats.py,dnatracing.py'
count = ''
quiet-level = 3

topostats/tracing/dnatracing.py
topostats/tracing/nodestats.py

`pylint`

Currently the following files are explicitly excluded from being checked by pylint this will need reinstating and issues flagged addressing.

topostats/tracing/nodestats.py
topostats/tracing/dnatracing.py

The text was updated successfully, but these errors were encountered:

ns-rse · 2024-02-12T16:25:45Z

This will hold off any releases to PyPi in the meantime as the Software will be untested.

It will however mean people can work off of a single main branch installed from Git which will hold all the desired features.

Height-tracing and tests were merged with #790 this morning, closing the original issue #488.

MaxGamill-Sheffield · 2024-02-23T12:27:46Z

MaxGamill-Sheffield · 2024-02-23T13:09:56Z

Get advice from experimentalists

Happy with the state of what goes into the grainstats file and the .topostats files and it makes sense.
Distinction between grain and molecule seems clear.
Only thing they wanted to add was to understand how to load data from the .topostats file to plot.

llwiggins · 2024-03-01T13:20:45Z

Should U-net masking be included as part of this merger, or perhaps as a separate issue/PR?

MaxGamill-Sheffield · 2024-03-01T14:47:25Z

Should U-net masking be included as part of this merger, or perhaps as a separate issue/PR?

Great point! There are a few edits within the cats branch to improve the U-Net code (1 object from the segmentation and square crop edge-cases) but I think as it's quite separated from the rest of the code to be added, it might be more beneficial to start that as a separate branch and then merge it.

What do you think @SylviaWhittle?

ns-rse · 2024-05-21T12:52:16Z

Hi @MaxGamill-Sheffield,

I'm back on writing more tests for pruning and looking at the topostats.tracing.pruning.topostatsPrune() class. There is one method prune_all_skeletons() that I hadn't previously written tests for but now I look at it I've got some questions about it.

Question 1 - Going loopy

This method uses skimage.morphology.label() to label grains and then loops over them. This shouldn't be needed though since in #600 I refactored the tracing class to work with single grains and the control of tracing multiple grains from a single image is now done in the processing.run_dnatracing() function.

Is this looping a throwback to the older code from before the refactoring or is there a specific reason to loop over multiple grains? Currently I don't think it would do more than one loop since skimage.morphology.label() would label entangled grains as a single item.

Question 2 - Length v Height v both?

The logic in the prune_all_skeletons() isn't clear to me.

Currently it has...

    def prune_all_skeletons(self) -> npt.NDArray:
        """
        Prune all skeletons.
        Returns
        -------
        npt.NDArray
            A single mask with all pruned skeletons.
        """
        pruned_skeleton_mask = np.zeros_like(self.skeleton)
        labeled_skel = morphology.label(self.skeleton)
        for i in range(1, labeled_skel.max() + 1):
            single_skeleton = np.where(labeled_skel == i, 1, 0)
            if self.max_length is not None:
                single_skeleton = self._prune_by_length(single_skeleton, max_length=self.max_length)
            if self.height_threshold is not None:
                single_skeleton = heightPruning(
                    self.img,
                    single_skeleton,
                    height_threshold=self.height_threshold,
                    method_values=self.method_values,
                    method_outlier=self.method_outlier,
                ).remove_bridges()
            # skeletonise to remove nibs
            pruned_skeleton_mask += getSkeleton(self.img, single_skeleton, method="zhang").get_skeleton()
        return pruned_skeleton_mask

Within the for loop there are two if statements.

Prune single_skeleton if the self.max_length is not None. The value for this parameter set in the topostats/default_config.yaml of your branch is -1 (which triggers length based pruning based on 15% of grain length).
Prune single_skeleton if the self.height_threshold is not None. The value for this parameter set in the topostats/default_config.yaml of your branch is None.

This sounds ok, but what if a configuration such as...

dnatracing:
  run: true # Options : true, false
  ...
  pruning_params:
    pruning_method: topostats
    max_length: -1
    height_threshold: 789
    method_values: mid
    method_outlier: mean_abs

...were used by a user? Its not clear to the user, based on the docstrings as they stand what would happen.

I can say looking at the code that only the height-based pruning would be undertaken since the results of pruning based on length are not passed into the heightPruning() class, but is this what is expected?

Are there cases where both length and height based pruning should be applied?

Background

I realised this when starting to remove the for: loop and refactor and write out paramterised tests to check the effects of the different combinations of options for max_length / height_threshold / method_values / method_outlier as ideally tests should test the effects of different options but then I realised the lack of clarity about what behaviour is intended or required.

MaxGamill-Sheffield · 2024-05-21T17:25:43Z

Hey @ns-rse ,

A1: Yep just old code that I didn't have time to refactor, and I think it was designed similarly to the skeletonisation code which did the same thing as that's what the skimage methods did.

A2: This should absolutely take both the parameters from the config and performing both types of pruning on the skeleton, but concurrently (obvs there may be some issue with which comes first but this is sensible to me as they address different problems). However, maybe I'm misunderstanding but both pruning types update single_skeleton and use single_skeleton so I thought this would be working as intended and both will run under your config?

Also unless there are catastrophic things wrong with the code (e.g. if I've misunderstood here and both pruning stages can't happen together) I'm not sure it's worth your time refactoring the code so we can make the most of your time with us.

ns-rse · 2024-05-22T06:30:20Z

D'oh! 🤦 I clearly hadn't engaged my brain when I sat down and looked at the code yesterday as it will indeed apply both pruning types. I'm an idiot sometimes, thanks for pointing out my error so politely.

As I'm unlikely to look at this code again any time soon I will refactor after writing tests to remove the unnecessary loop, otherwise it will persist in the code base and need removing somewhen which may never happen. By writing a test first and then refactoring it will ensure I don't break anything.

Cheers,

@ns-rse

MaxGamill-Sheffield added the enhancement New feature or request label Feb 12, 2024

MaxGamill-Sheffield assigned SylviaWhittle, MaxGamill-Sheffield and llwiggins Feb 12, 2024

ns-rse mentioned this issue May 14, 2024

[tests] Consolidate fixtures from pruning and height profiles #843

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[feature] : Better Tracing & Skeletonisation Merger #800

[feature] : Better Tracing & Skeletonisation Merger #800

MaxGamill-Sheffield commented Feb 12, 2024 •

edited by ns-rse

ns-rse commented Feb 12, 2024

MaxGamill-Sheffield commented Feb 23, 2024 •

edited

MaxGamill-Sheffield commented Feb 23, 2024 •

edited

llwiggins commented Mar 1, 2024

MaxGamill-Sheffield commented Mar 1, 2024

ns-rse commented May 21, 2024

MaxGamill-Sheffield commented May 21, 2024

ns-rse commented May 22, 2024

[feature] : Better Tracing & Skeletonisation Merger #800

[feature] : Better Tracing & Skeletonisation Merger #800

Comments

MaxGamill-Sheffield commented Feb 12, 2024 • edited by ns-rse

Is your feature request related to a problem?

Describe the solution you would like.

Outputs

Skeletons

Tracing

Config

Code Tidy

pre-commit hooks

numpydoc-validation

codespell

pylint

ns-rse commented Feb 12, 2024

MaxGamill-Sheffield commented Feb 23, 2024 • edited

MaxGamill-Sheffield commented Feb 23, 2024 • edited

llwiggins commented Mar 1, 2024

MaxGamill-Sheffield commented Mar 1, 2024

ns-rse commented May 21, 2024

Question 1 - Going loopy

Question 2 - Length v Height v both?

Background

MaxGamill-Sheffield commented May 21, 2024

ns-rse commented May 22, 2024

MaxGamill-Sheffield commented Feb 12, 2024 •

edited by ns-rse

`pre-commit` hooks

`numpydoc-validation`

`codespell`

`pylint`

MaxGamill-Sheffield commented Feb 23, 2024 •

edited

MaxGamill-Sheffield commented Feb 23, 2024 •

edited