Remove transform module and update FCSData class and mef and compensate modules #344

Open
JS3xton opened this issue Nov 18, 2020 · 0 comments

JS3xton commented Nov 18, 2020

After some reflection in #340, @castillohair and I agreed that the transform module should be removed and that the code that previously interfaced with it should be simplified.

Tasks:

  • Remove transform module.
  • Modify FCSData to automatically transform data to RFI units.
  • Add an FCSData.transform() function to support generic transformations.
  • Consolidate compensate.get_transform_fxn() and transform.to_compensated() and return all unmixed fluorescence signals.
  • Consolidate mef.get_transform_fxn() and transform.to_mef(). Update excel_ui accordingly.

Unresolved issues:

  • What form should the consolidated compensate function take?
    • What data structure(s) should be returned? (E.g., a numpy array, pd.DataFrame or FCSData object for each single-fluorophore control?)
    • Would a full_output flag (like with the gate module functions) provide better control over the amount of information returned?
    • Should the user be able to specify their own spillover matrix?
  • What form should the consolidated mef function take?
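To make the open questions about the consolidated compensate function concrete, here is one possible shape, sketched under stated assumptions. The function name `compensate`, its parameters, and the `CompensateOutput` container are all hypothetical and are not FlowCal's API. The sketch assumes a square spillover matrix, ignores autofluorescence, and estimates spillover from the mean signal of each single-fluorophore control, which is a deliberately crude stand-in for whatever estimator the module would actually use.

```python
from collections import namedtuple

import numpy as np

# Hypothetical container for the extra outputs; not a FlowCal type.
CompensateOutput = namedtuple("CompensateOutput", ["unmixed", "spillover"])


def compensate(data, controls=None, spillover=None, full_output=False):
    """Unmix fluorescence signals (illustrative sketch only).

    data      : (n_events, n_channels) array of measured signals.
    controls  : list of (n_events_i, n_channels) arrays, one per
                single-fluorophore control, ordered so that control i
                is the fluorophore primarily detected in channel i.
    spillover : optional (n_channels, n_channels) matrix supplied by
                the user; row i is fluorophore i's signal in each
                channel, normalized to 1 in channel i.
    """
    data = np.asarray(data, dtype=float)
    if spillover is None:
        if controls is None:
            raise ValueError("provide either controls or spillover")
        # Crude estimate: mean signal per channel of each control,
        # normalized by the signal in that control's primary channel.
        rows = [np.mean(np.asarray(c, dtype=float), axis=0) for c in controls]
        spillover = np.array([r / r[i] for i, r in enumerate(rows)])
    spillover = np.asarray(spillover, dtype=float)

    # Model: measured = true @ spillover, so solve for `true`.
    unmixed = np.linalg.solve(spillover.T, data.T).T

    if full_output:
        return CompensateOutput(unmixed=unmixed, spillover=spillover)
    return unmixed
```

In this sketch the default return value is a plain numpy array of all unmixed signals, `full_output=True` additionally exposes the spillover matrix that was used, and passing `spillover=` bypasses the controls entirely. Whether the real function should return arrays, DataFrames, or FCSData objects remains the open question listed above.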

Relevant discussion from #340:

@castillohair:

I've become skeptical about the need to have a dedicated module for "transformations". I think this came out of our old view that it was worth distinguishing between "channel" units and "a.u.", and therefore having a module that transformed between these two. But having worked with a lot of flow cytometry data, including data from more modern instruments which are stored directly in a.u., I started seeing channel units as an intermediate step that should not be used for anything. If present-day me had to remake FlowCal from scratch, I'd probably have FCSData objects be directly converted to a.u. upon loading, and eliminate the transform module, since we never used it for anything other than the to_rfi() function. That way FCSData objects from old and new instruments will be automatically in a.u., improving consistency.
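For readers unfamiliar with the channel-to-a.u. step described above, the following is a minimal sketch of the conversion as defined by the FCS file standard's `$PnE`, `$PnR`, and `$PnG` keywords. The function name `channel_to_au` and its parameters are hypothetical, and the thread does not state that transform.to_rfi() is implemented exactly this way. Treating a log offset of 0 as 1 is a common convention and is an assumption here.

```python
import numpy as np


def channel_to_au(channel, amp_type, data_range, gain=1.0):
    """Convert raw channel numbers to arbitrary units (a.u.).

    channel    : array-like of raw channel values.
    amp_type   : (f1, f2) pair from the $PnE keyword; f1 is the number
                 of log decades (0 means linear), f2 is the log offset.
    data_range : value of the $PnR keyword (number of channels).
    gain       : value of the $PnG keyword, used only for linear data.
    """
    channel = np.asarray(channel, dtype=float)
    decades, offset = amp_type
    if decades > 0:
        # Logarithmic amplifier: a.u. = f2 * 10**(f1 * channel / range).
        if offset == 0:
            offset = 1.0  # assumption: treat an offset of 0 as 1
        return offset * 10.0 ** (decades * channel / data_range)
    # Linear amplifier: divide by the amplifier gain.
    return channel / gain
```

Applying a function like this once, when the file is loaded, is what would let data from older instruments (stored in channel units) and newer instruments (stored directly in a.u.) end up in the same units.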

@JS3xton:

I've been reflecting on the transform module. Some thoughts:

  • It's always hard for me to remember how the MEF transformation traces its way through the mef and transform modules. I would be in favor of simplifying its derivation and exposure to the user.
  • I think I originally thought there were going to be a lot more transformations we would want to support (e.g., log, logicle, etc.). In practice, those have largely manifested themselves in the plot module.
  • The transform module is still useful for FCSData bookkeeping (e.g., making a copy of the FCSData object and updating FCSData.range()). I could envision this functionality being absorbed into FCSData, though (e.g., via an FCSData.transform() function).
  • The transform module is also still useful for applying transforms to non-FCSData data (e.g., a numpy array) in a standardized way. I don't know how many users use non-FCSData data, though. Moreover, if transform.to_mef() and transform.to_compensated() were moved back to their respective modules, those functions could still be written to support non-FCSData arrays.
  • I agree transform.to_rfi() might make more sense as an internal processing step of FCSData and doesn't really need to be exposed to the user as overtly as it currently is.
  • The mef module currently kind of bends over backwards to interface with the transform module (e.g., by producing a transformation function via mef.get_transform_fxn()). If that driving rationale is removed, the primary mef module interface point could possibly be simplified. I'm not sure what form it (and compensate) should take to simplify them, though. (Do we still return transformation functions? Do the modules provide functions that operate directly on data, like gate module functions?)
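The bookkeeping mentioned in the bullets above (copying the object and updating its ranges) could be absorbed into a method along the following lines. This is a sketch only: `ToyFCSData` is a hypothetical stand-in that stores events and per-channel ranges and is not the real FCSData class, and the range update assumes the supplied function is elementwise and monotonically increasing.

```python
import numpy as np


class ToyFCSData:
    """Hypothetical stand-in for FCSData: holds events and ranges only."""

    def __init__(self, events, channels, ranges):
        self.events = np.asarray(events, dtype=float)
        self.channels = list(channels)
        self._ranges = {ch: tuple(r) for ch, r in zip(self.channels, ranges)}

    def range(self, channel):
        """Return the (low, high) range recorded for a channel."""
        return self._ranges[channel]

    def transform(self, fxn, channels=None):
        """Return a transformed copy; the original is left untouched."""
        channels = self.channels if channels is None else list(channels)
        new = ToyFCSData(self.events.copy(), self.channels,
                         [self._ranges[ch] for ch in self.channels])
        for ch in channels:
            i = self.channels.index(ch)
            new.events[:, i] = fxn(self.events[:, i])
            # Bookkeeping: push the range endpoints through the same
            # function (valid for monotonically increasing fxn only).
            lo, hi = fxn(np.array(self._ranges[ch], dtype=float))
            new._ranges[ch] = (float(lo), float(hi))
        return new
```

With a method like this, either interface style discussed above remains possible: a mef or compensate function could return a plain callable for the user to pass to `transform()`, or it could accept the data and call `transform()` internally, as the gate module functions operate directly on data.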

Upon reflection, I currently favor removing transform, updating FCSData, and simplifying the common mef and compensate module interfaces.
