Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Best way to document additional XCMS object processing steps in the ProcessHistory slot? #735

Open
wkumler opened this issue Mar 28, 2024 · 3 comments

Comments

@wkumler
Copy link
Contributor

wkumler commented Mar 28, 2024

Hi again,
I'm writing a package that extends the default XCMS functionality with the option to return an XCMSnExp (and soon, hopefully, an XcmsExperiment object!) where low-quality features have been removed. I'd like to document this processing step in the history, but can't figure out a good way to implement it from my side alone because it looks like the addProcessHistory function isn't exported, along with the necessary ProcessHistory() and XProcessHistory() class constructors. It also looks like the processing type is restricted to a small set of possible "types", none of which my addition fits into nicely. Was this an intentional design decision or are you open to expanding this class?

Right now I'm adding my processing step as a list with components meant to mimic the layout of the existing XProcessHistory objects but it would be great to do this "officially".

xcms_obj@.processHistory[[length(xcms_obj@.processHistory)+1]] <- list(
  type="Low-quality feature removal via squallms",
  date=date(),
  info="Nonstandard processing step",
  fileIndex=seq_along(fileNames(xcms_filled)),
  `Parameter class`="none",
  `MS level(s)`=1
)
@wkumler
Copy link
Contributor Author

wkumler commented Mar 29, 2024

Update: I no longer like the list option I've described above because then printing the object throws an error when there's no processType method for a list :(

@jorainer
Copy link
Collaborator

jorainer commented Apr 3, 2024

Hi! well, indeed it would make sense to open this up. Originally, this was only used internally to document/log the xcms processing steps.

so, the processHistory is a list of ProcessHistory classes, defined here: https://github.com/sneumann/xcms/blob/devel/R/DataClasses.R#L232-L269 . The supported type for process history are these here: https://github.com/sneumann/xcms/blob/devel/R/DataClasses.R#L183-L204

There is also an extension to the ProcessHistory, the XProcessHistory class (https://github.com/sneumann/xcms/blob/devel/R/DataClasses.R#L340-L363), that allows also to add a Param class (which is defined in ProtGenerics) with parameters of the processing. All parameter classes in xcms extend this base class and can thus be added.

For both classes, there is a constructor function (xcms:::ProcessHistory() or xcms:::XProcessHistory()) that should be used to create an instance of the class.

Up to now it was only me using these classes - but I'm OK to opening up and export all of it. Is there something you would need/like (e.g. also a new data processing type)?

@wkumler
Copy link
Contributor Author

wkumler commented Apr 3, 2024

Yep! I think I've got a pretty clear sense of how to interact with the objects, I just can't embed it into a package of my own because the functions aren't exported. I guess the question is also a more general one - the idea behind the ProcessHistory is to make the XCMS objects more reproducible by documenting the steps involved. I don't have a parameter that's being supplied, however. The goal is to make peak quality detection interactive so all that would be documented are the features removed for being low-quality. Does this mesh with the philosophy behind the ProcessHistory? I agree that it's rare to see it used for debugging or reproducibility purposes because the XCMS object is so rarely stored somewhere instead of just the raw code / data files. I like the idea behind it and want to support it, but if it's going to be a pain and nobody else uses it then probably best to spend time elsewhere.

The code I'd like to include is below:

  ph <- xcms:::ProcessHistory(date. = date(),
                              type. = "Low-quality feature removal via squallms",
                              fileIndex = 1:length(fileNames(xcms_filled)))
  xph <- xcms:::XProcessHistory(param = NULL, date. = date(),
                               type. = "Unknown",
                               fileIndex = 1:length(fileNames(xcms_filled)),
                               msLevel = 1)
  xcms:::addProcessHistory(xcms_filled, ph) %>%
    processHistory()

As you note, the data processing type isn't currently supported so an error's thrown upon attempting to define the ph object:

Error in validObject(.Object) : 
  invalid class “ProcessHistory” object: Got invalid type 'Low-quality feature removal via squallms'! Allowd are: "Unknown", "Peak detection", "Peak refinement", "Peak grouping", "Retention time correction", "Missing peak filling", "Calibration", "Feature grouping"

I'm not sure how much of a hassle this is to add. Looks like there might need to be specific print methods defined for each type? Happy to help with that if you can point me to the method definitions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants