Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move creation of tracing dictionary saved to HDF5 from processing to dnatracing #803

Open
ns-rse opened this issue Feb 20, 2024 · 2 comments
Labels
DNATracing Issues pertaining to the DNATracing class

Comments

@ns-rse
Copy link
Collaborator

ns-rse commented Feb 20, 2024

Currently the dictionary of data that relates to the results of processing is cobbled together from the objects that are returned by topostats.dnatracing.trace_image() in the procesing.dnatracing() function (see here.

topostats.dnatracing.trace_image() is itself returning a dictionary, but only a subset of these are being saved to HDF5.

Is everything needed if its not being saved to HDF5?

If not then its likely that we can simplify what is returned by topostats.dnatracing.trace_image() and assign it directly to the grain_trace_data[direction] dictionary.

Line 411 would then read...

            grain_trace_data[direction] = trace_image(
                    image=image,
                    grains_mask=grain_masks[direction],
                    filename=filename,
                    pixel_to_nm_scaling=pixel_to_nm_scaling,
                    **dnatracing_config,
                )

...and topostats.dnatracing would need modifying so that only the following are returned as a dictionary...

all_ordered_traces
all_cropped_images
all_ordered_trace_heights
all_ordered_trace_cumulative_heights
all_splined_traces

The statistics component of the dictionary currently returned by trace_image() should be returned as a separate object as it is not saved to HDF5, rather it is saved to the tracing_stats[direction] dictionary for combining with the grain statistics and writing to CSV.

@ns-rse ns-rse added the DNATracing Issues pertaining to the DNATracing class label Feb 20, 2024
@MaxGamill-Sheffield
Copy link
Collaborator

Want me to try and address this during the nodestats tracing merger, or keep to the small commits in which it might be best to wait until after? It seems pretty big already.

And for clarity are you expecting 1 object for each thing - i.e. return (to_be_merged_with_grainstats_df, for_hdf5_dict, etc...)?

@ns-rse
Copy link
Collaborator Author

ns-rse commented Feb 23, 2024

Small atomic commits are my preference.

  • Its easier to review.
  • The Git history remains cleaner and easier to work out where and when changes are made and by whom.

Looking at what is done in processing.dnatracing() it constructs a dictionary of components that is saved to HDF5.

I'm proposing returning that dictionary by topostats.dnatracing.trace_image() and so yes instead of...

    return {
        "statistics": results,
        "all_ordered_traces": ordered_traces,
        "all_splined_traces": splined_traces,
        "all_cropped_images": cropped_images,
        "image_ordered_trace": image_trace,
        "image_spline_trace": image_spline_trace,
        "all_ordered_trace_heights": all_ordered_trace_heights,
        "all_ordered_trace_cumulative_distances": all_ordered_trace_cumulative_distances,
    }

...it would look like this...

    return (results, {
        "all_ordered_traces": ordered_traces,
        "all_splined_traces": splined_traces,
        ...
                      })

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DNATracing Issues pertaining to the DNATracing class
Projects
None yet
Development

No branches or pull requests

2 participants