Properly serialize an xcms result object to disk/files #693

jorainer · 2023-10-12T11:37:56Z

In addition to exporting/saving an xcms object in RData format, it would be good to support import/export from/to files in a textual format.

Idea:

re-create an xcms result object from plain text files.
export the xcms result to text files.

Why? Mostly for Galaxy-based workflows, to enable use of the result objects across tools.


jorainer · 2023-10-12T11:38:04Z

@hechth

sneumann · 2023-10-13T09:45:27Z

How much of that could be handled by mzTab-M? The SMF feature table corresponds nicely to the (grouped) xcms list. Ungrouped data is technically also possible, but requires lots of repetition. Metadata works as well. Bonus: we'd be able to read results from e.g. MS-Dial for further analysis.

Alternatively there are some functions by @PayamEmami in the MetaboIgniter workflow
https://github.com/nf-core/metaboigniter/blob/master/bin/consensusXMLToXcms.r
https://github.com/nf-core/metaboigniter/blob/master/bin/featureXMLToXcms.r

An earlier version was even implemented in Galaxy to integrate OpenMS and R based tools.

Main question would be if true round-tripping (export xcms object1 and re-import to object2 guaranteeing object1==object2) is a requirement.

Yours, Steffen
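
For illustration, the round-tripping question can be made concrete with a minimal sketch in base R. The `peaks` table below is a hypothetical stand-in for a chromatographic peak table, not a real xcms object, and the tab-separated layout is an assumption rather than an agreed format.

```r
# Hypothetical stand-in for a chromatographic peak table (not an xcms object).
peaks <- data.frame(
    mz = c(100.5, 200.25, 300.125),
    rt = c(10.5, 20.25, 30.125),
    into = c(1500.5, 820.25, 99.75),
    sample = c(1L, 1L, 2L)
)

# Export to a tab-separated text file and re-import it.
f <- tempfile(fileext = ".tsv")
write.table(peaks, file = f, sep = "\t", quote = FALSE, row.names = FALSE)
peaks2 <- read.table(f, header = TRUE, sep = "\t")

# all.equal() compares numeric columns with a tolerance, while identical()
# is the strict "object1 == object2" check; text export can satisfy the
# first without satisfying the second.
round_trip_equal <- isTRUE(all.equal(peaks, peaks2))
round_trip_identical <- identical(peaks, peaks2)
```

Whether a tolerance-based comparison is sufficient, or strict identity is needed, is exactly the requirement that would have to be decided.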

jorainer · 2023-10-18T08:12:56Z

Agree that it would be great to use standard file formats as much as possible! Maybe we could export the feature data as mzTab-M and the rest as other files (all in one folder). I would like to be able to export a full xcms result object and completely restore it again later (including process history, adjusted retention times, etc).

jorainer · 2023-10-18T08:13:37Z

@sneumann, do you have a definition somewhere of what the mzTab-M format should actually look like?

sneumann · 2023-10-18T09:00:32Z

Hi, the official docs are in
https://hupo-psi.github.io/mzTab/2_0-metabolomics-release/mzTab_format_specification_2_0-M_release.html
and @nilshoffmann wrote an R package to read/write mzTab-M at https://github.com/lifs-tools/rmzTab-m/
I need to find some snippets I have to export the old xcms stuff, which could become a starting point,
and we'd need to figure out how to handle the dependencies. Does that sound like an MsBackendMzTabM?
Yours,
Steffen

jorainer · 2023-10-18T12:08:54Z

Had a quick look at rmzTab-m, but I am not particularly happy with the R6 objects. I would want to avoid a dependency of xcms on such objects as much as possible.

sneumann · 2023-10-18T12:15:04Z

+1 for not having all that as dependency.
Still, there could be an XcmsExperiment-BackendMzTabM in the mid-term,
and in the short-term we have some code snippets that demonstrate import / export of (r)mzTab-M
as started in https://gist.github.com/sneumann/0f0d22027eda4db8ab28175de06b77f2
Yours, Steffen

sneumann · 2023-10-18T13:00:59Z

+1 for not adding these dependencies.
That's why I suggested a kind of XcmsExperiment-BackendMzTabM in the mid-term.
rmzTab-M was initially generated through code generation from an OpenAPI specification. Maybe someone someday writes a code generator without using R6.
Until then we can work on snippets to demonstrate the export, such as this one:
https://gist.github.com/sneumann/0f0d22027eda4db8ab28175de06b77f2
Yours, Steffen

hechth · 2023-10-21T12:40:15Z

Hi - I agree that this would not need to live in the XCMS package, to avoid further increasing its dependencies. Also, I think the XCMS object is quite stable, so I don't expect many API-breaking changes.

I agree that exporting to and importing from mzTab-M would also be an option. The question then is: how well does mzTab-M work for peaks? I always thought it was designed more to capture the final output, with compound concentrations etc.

jorainer · 2023-10-24T06:25:34Z

I would then suggest to maybe define a method

storeResults (or exportResults): takes an xcms result object and saves it in some format.
restoreResults: that allows to restore the result object from the exported file(s).

implementations of these methods could then be defined for different param objects:

RDataParam: simply export in RData format (same as save/load).
PlainTextParam: export data as files in (custom) plain text format.
MzTabMParam: export data in mzTab-M format.

The first two could be implemented in xcms as they don't add any dependencies. The last one could be implemented in a separate package.

While it's not exactly what we have with the backend concept in Spectra, it is still similar. Would that be a workable solution, @sneumann? With that we would add proper support for mzTab-M and, in addition, support @hechth's request to export/import to/from text files, which would make Galaxy integration easier (if I understood it correctly).
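
For illustration, the proposal above can be sketched with S4 generics that dispatch on a param object. This is a minimal sketch, not the xcms implementation: the generics and classes only borrow the names from the proposal, the slot names (`fileName`, `path`) are assumptions, and a plain named list of data frames stands in for an xcms result object.

```r
library(methods)

# Hypothetical parameter classes selecting the storage format.
setClass("RDataParam", representation(fileName = "character"))
setClass("PlainTextParam", representation(path = "character"))

setGeneric("storeResults", function(object, param, ...)
    standardGeneric("storeResults"))
setGeneric("restoreResults", function(param, ...)
    standardGeneric("restoreResults"))

# RDataParam: binary serialization, same idea as save()/load().
setMethod("storeResults", signature(object = "ANY", param = "RDataParam"),
          function(object, param, ...) {
              save(object, file = param@fileName)
              invisible(param@fileName)
          })
setMethod("restoreResults", "RDataParam", function(param, ...) {
    env <- new.env()
    load(param@fileName, envir = env)
    get("object", envir = env)
})

# PlainTextParam: one tab-separated file per table, all in one folder.
setMethod("storeResults", signature(object = "list", param = "PlainTextParam"),
          function(object, param, ...) {
              dir.create(param@path, showWarnings = FALSE, recursive = TRUE)
              for (nm in names(object))
                  write.table(object[[nm]],
                              file.path(param@path, paste0(nm, ".tsv")),
                              sep = "\t", quote = FALSE, row.names = FALSE)
              invisible(param@path)
          })
setMethod("restoreResults", "PlainTextParam", function(param, ...) {
    fls <- list.files(param@path, pattern = "\\.tsv$", full.names = TRUE)
    res <- lapply(fls, read.table, header = TRUE, sep = "\t")
    names(res) <- sub("\\.tsv$", "", basename(fls))
    res
})

# Usage with a hypothetical stand-in for a result object.
res <- list(peaks = data.frame(mz = c(100.5, 200.25), rt = c(10.5, 20.25)),
            features = data.frame(mzmed = 150.375, rtmed = 15.375))
ptp <- new("PlainTextParam", path = tempfile("xcms_export"))
storeResults(res, ptp)
res2 <- restoreResults(ptp)

rdp <- new("RDataParam", fileName = tempfile(fileext = ".RData"))
storeResults(res, rdp)
res3 <- restoreResults(rdp)
```

With this layout, a further format such as mzTab-M would only need its own param class and a pair of methods, which could be defined in a separate package.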

nilshoffmann · 2023-10-24T06:45:47Z

> Hi - I agree that this would not need to live in the XCMS package, to avoid further increasing its dependencies. Also, I think the XCMS object is quite stable, so I don't expect many API-breaking changes.
>
> I agree that exporting to and importing from mzTab-M would also be an option. The question then is: how well does mzTab-M work for peaks? I always thought it was designed more to capture the final output, with compound concentrations etc.

mzTab-M, in contrast to mzTab 1.0, has the feature table, which allows you to report start and end time of aligned "m/z features", area etc. The output of final compound concentrations (or whatever quantity you define to report) is reported in the summary table. The evidence table further allows you to report identification evidence for your features.
Also, anything that is missing in a particular table can be added as an optional column.
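
For illustration, a minimal sketch of how grouped xcms features might map onto a feature (SMF) table. The input columns follow the names of xcms' `featureDefinitions()` output (`mzmed`, `rtmed`, `rtmin`, `rtmax`, `npeaks`); the values are made up, only a subset of the SMF columns is shown, and `opt_global_npeaks` is a hypothetical optional column. The specification should be consulted for the full set of required columns and the line prefixes.

```r
# Hypothetical feature definitions, using column names of featureDefinitions().
fdef <- data.frame(
    mzmed = c(150.375, 301.5),
    rtmed = c(15.5, 42.25),
    rtmin = c(14.0, 40.5),
    rtmax = c(17.0, 44.0),
    npeaks = c(3L, 2L)
)

# Hypothetical feature abundances, one column per assay (sample).
fvals <- matrix(c(1500.5, 820.25, 1480.0, 799.5), nrow = 2,
                dimnames = list(NULL, c("abundance_assay[1]",
                                        "abundance_assay[2]")))

# Subset of an SMF-like table; retention times in xcms are in seconds.
smf <- data.frame(
    SMF_ID = seq_len(nrow(fdef)),
    exp_mass_to_charge = fdef$mzmed,
    retention_time_in_seconds = fdef$rtmed,
    retention_time_in_seconds_start = fdef$rtmin,
    retention_time_in_seconds_end = fdef$rtmax,
    # Information without a dedicated column goes into an optional column.
    opt_global_npeaks = fdef$npeaks,
    fvals,
    check.names = FALSE
)
```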

nilshoffmann · 2023-10-24T06:51:43Z

> +1 for not adding these dependencies. That's why I suggested a kind of XcmsExperiment-BackendMzTabM in the mid-term. rmzTab-M was initially generated through code generation from an OpenAPI specification. Maybe someone someday writes a code generator without using R6. Until then we can work on snippets to demonstrate the export, such as this one: https://gist.github.com/sneumann/0f0d22027eda4db8ab28175de06b77f2 Yours, Steffen

Additionally, there are low level IO functions available in the package already for reading:
https://github.com/lifs-tools/rmzTab-m/blob/master/R/read_mz_tab.R

The writing part is currently tied to R6, as Steffen said, because I autogenerated the models and the REST API implementation for the online validator web service from the OpenAPI specification.

https://github.com/lifs-tools/rmzTab-m/blob/master/R/write_mz_tab.R

However, under the hood most objects are converted to and from tables in one way or another, also using jsonlite a lot. So the options are mapping this or reimplementing it with another object model. @jorainer, which one would you prefer?

nilshoffmann · 2023-10-24T06:53:17Z

Each object already has toDataFrame and fromDataFrame methods available, see the Instrument class as an example:

https://github.com/lifs-tools/rmzTab-m/blob/master/R/instrument.R#L173
https://github.com/lifs-tools/rmzTab-m/blob/master/R/instrument.R#L218

nilshoffmann · 2023-10-24T07:09:28Z

So converting this to S4 reference classes should be doable? Or what alternative class type would you recommend?

jorainer · 2023-10-24T07:16:23Z

S4 would be ideal - we could however also use the R6 implementation and hide all the R6 stuff from the user through the functions (e.g. the storeResults method) above, so that should also be doable.

jorainer · 2023-10-24T08:40:02Z

Maybe @philouail can have a look into this. I'm assigning the issue to her.

hechth · 2023-10-25T09:42:59Z

That sounds great, thanks for getting started on this! Feel free to reach out if you have any more questions or you need some example data for the plain text export!

hechth · 2023-10-25T09:43:51Z

@jorainer do you think it would also be possible to create an import from mzTab?

hechth · 2023-10-25T09:45:42Z

also xref lifs-tools/rmzTab-m#3

jorainer · 2023-10-25T10:14:24Z

Yes, sure @hechth, we will also work on an mzTab-M importer, but let's first get started with the "simpler" text export/import.

hechth · 2023-10-25T10:36:16Z

Cool - let me know if I can help on this. The mzTab-M importer would be great to import results from OpenMS peak picking which exports as mzTab - so then we could build an XCMS object and then for example run it through RAMClustR etc.

philouail self-assigned this Oct 25, 2023

philouail mentioned this issue Oct 31, 2023

feat: creation of new generic function so save XcmsExperiment object #698

Merged
