mzdb-specs

Specifications of the mzDB file format.

The main document describing the mzDB specifications is named "mzDB_x.x.x_specifications.docx", "x.x.x" being the version number of the corresponding specifications. This document is also provided in the PDF format. For each version of the specifications, a dedicated folder has been created.

This folder contains some other documents for developers who prefer to have access to database diagram or script:

mzDB_x.x.x_schema_diagram.architect => the PowerArchitect diagram of the SQLite schema. The PowerArchitect software can be downloaded from its Google Code page: https://code.google.com/p/power-architect/
mzDB_x.x.x_schema_documentation.html => the HTML documentation of the schema (as exported from PowerArchitect)
mzDB_x.x.x_schema_picture.pdf => the picture of the schema in PDF format (as exported from PowerArchitect)
mzDB_x.x.x_schema_script.sql => the SQLite script used for mzDB schema initialization

Benchmarks and use cases

This work and has been published in the MCP journal:

Bouyssie D, Dubois M, Nasso S, Gonzalez de Peredo A, Burlet-Schiltz O, Aebersold R, Monsarrat B. mzDB: a file format using multiple indexing strategies for the efficient analysis of large LC-MS/MS and SWATH-MS datasets. Mol Cell Proteomics. 2014 Dec 11

In this paper, the performance of mzDB is benchmarked in several reading modes against the mzML and mz5 file formats.

Roadmap

Current version of the mzDB model is 0.7. Next version (0.8) will include modifications related to the storage of peaks in the bounding boxes. The goals are:

optimize the data loading of peaks from a given bounding box
add support for the MS-Numpress compression algorithms

This version may also introduce the concept of acquisition channels. Eeach channel represents a collection of MS data acquired with the same acquisition settings during the run. It should add support for hybrid DDA et DIA acquisitions.

CREATE TABLE acquisition_channel (
id INTEGER PRIMARY KEY AUTOINCREMENT,
ms_level INTEGER NOT NULL, // 1, 2, 3...
acquisition_mode TEXT NOT NULL, // DDA/DIA
activation_type TEXT NOT NULL, // CID, ETD, HCD...
min_mz REAL NOT NULL, // min m/z of this channel
max_mz REAL NOT NULL, // max m/z of this channel
min_parent_mz REAL, // DIA only and ms_level > 1
max_parent_mz REAL, // DIA only and ms_level > 1
param_tree TEXT
)

Each spectrum and each spectrum will have a FK corresponding to a given acquisition channel.

Changelog

Version 0.7

Some changes in the data_encoding table. The representation of number precisions for m/z and intensity values was ambiguous, because the previous one required to write two consecutive CvTerms in a proper order (m/z value CvTerm then itensity value one). However it was not really clear for somebody who doesn't know this convention.

We have thus chosen to update the structure of this table by adding two new columns:

mz_precision => could be 32|64
intensity_precision => could be 32|64

Left and right HWHMs precision is always 32bits.

The param_tree column is kept but is NULL by default.

Version 0.6

Table bounding_box_msn_rtree added to support indexing of DIA datasets.

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
example_data		example_data
version_0.5.0		version_0.5.0
version_0.6.0		version_0.6.0
version_0.7.0		version_0.7.0
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

example_data

example_data

version_0.5.0

version_0.5.0

version_0.6.0

version_0.6.0

version_0.7.0

version_0.7.0

README.md

README.md

Repository files navigation

mzdb-specs

Benchmarks and use cases

Roadmap

Changelog

Version 0.7

Version 0.6

About

Releases

Packages

Languages

mzdb/mzdb-specs

Folders and files

Latest commit

History

Repository files navigation

mzdb-specs

Benchmarks and use cases

Roadmap

Changelog

Version 0.7

Version 0.6

About

Resources

Stars

Watchers

Forks

Languages