title: hmdbQuery: working with Human Metabolome Database (hmdb.ca)
author: Vincent J. Carey, stvjc at channing.harvard.edu
date: `r format(Sys.time(), '%B %d, %Y')`
vignette: %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{hmdbQuery: working with Human Metabolome Database (hmdb.ca)}
output: html_document (highlight: pygments, number_sections: true, theme: united, toc: true)

Initial remarks

The Human Metabolome Database (HMDB, http://www.hmdb.ca) includes XML documents describing 114,000 metabolites. This vignette shows how to retrieve and manipulate the metadata on those metabolites.

suppressMessages({
  library(hmdbQuery)
  library(gwascat)
})

Key utilities of the package

The hmdbQuery package includes a function for querying HMDB directly over HTTP:

library(hmdbQuery)
lk1 = HmdbEntry(prefix = "http://www.hmdb.ca/metabolites/", 
       id = "HMDB0000001")

The result is parsed and encapsulated in an S4 object:

lk1

The complete import of information about a single metabolite is large, so holding comprehensive information about all HMDB entries in memory would be impractical. The most effective approach to managing the metadata will depend on use cases that are still to be developed.
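One way to avoid holding every entry in memory is to fetch entries only when they are needed and to reuse what has already been retrieved. The following is a minimal sketch of that idea, not a feature of hmdbQuery: `make_entry_cache` and `fake_fetch` are hypothetical names introduced here, and the default fetcher simply calls `HmdbEntry` with the prefix used above. The demonstration uses a stand-in fetcher so that it runs without network access.

```r
# Hypothetical helper (not part of hmdbQuery): fetch entries on demand and
# keep each one in an environment so repeated lookups avoid another request.
make_entry_cache <- function(fetch = function(id) {
    hmdbQuery::HmdbEntry(prefix = "http://www.hmdb.ca/metabolites/", id = id)
  }) {
  cache <- new.env(parent = emptyenv())
  get_entry <- function(id) {
    if (!exists(id, envir = cache, inherits = FALSE)) {
      assign(id, fetch(id), envir = cache)
    }
    get(id, envir = cache, inherits = FALSE)
  }
  list(get = get_entry, ids = function() sort(ls(cache)))
}

# Demonstration with a stand-in fetcher that counts how often it is called
calls <- 0
fake_fetch <- function(id) {
  calls <<- calls + 1
  list(id = id)
}
cache <- make_entry_cache(fake_fetch)
e1 <- cache$get("HMDB0000001")
e2 <- cache$get("HMDB0000001")  # second lookup is served from the cache
```

Calling `make_entry_cache()` with no argument would use the real `HmdbEntry` fetcher instead, in which case each new identifier triggers one query to HMDB.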

This package also provides snapshots of certain direct associations, derived from all information available as of September 23, 2017. The direct associations reported in the database are stored in the tables hmdb_disease, hmdb_gene, hmdb_protein, and hmdb_omim. For example:

data(hmdb_disease)
hmdb_disease
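To get a quick overview of all four snapshot tables at once, they can be loaded together and their dimensions compared. This is a small sketch that assumes each table named in the text can be loaded with `data()` in the same way as `hmdb_disease` and that `dim()` applies to each; the guard lets the code run harmlessly when hmdbQuery is not installed.

```r
# Names of the snapshot tables mentioned in the text
tabs <- c("hmdb_disease", "hmdb_gene", "hmdb_protein", "hmdb_omim")

if (requireNamespace("hmdbQuery", quietly = TRUE)) {
  # Load all four tables from the package into the current environment
  data(list = tabs, package = "hmdbQuery")
  # One row per table, giving its number of rows and columns
  dims <- t(sapply(tabs, function(nm) dim(get(nm))))
  colnames(dims) <- c("rows", "columns")
  print(dims)
}
```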

Working with the metadata

Disease associations

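Some HMDB metabolites have been mapped to diseases. The diseases accessor returns these associations, and the PubMed identifiers found in the associated references can be used to retrieve abstracts with the annotate package.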

d1 = diseases(lk1) # data.frame
pmids = unlist(d1["references", 5][[1]][2,])
library(annotate)
pm = pubmed(pmids[1])
ab = buildPubMedAbst(xmlRoot(pm)[[1]])
ab

Biospecimen and tissue location metadata

Note that before HMDB version 4.0, biospecimens were called biofluids.

An HMDB entry can carry any number of biospecimen and tissue associations. Direct accessors return them, and by default all metadata for the entry is captured and made available through the store method.

biospecimens(lk1)
tissues(lk1)
st = store(lk1)
head(names(st))
length(names(st))
st$protein_assoc["name",]
st$protein_assoc["gene_name",]
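Rows pulled one at a time, as above, can be easier to read side by side. The sketch below assumes only what the indexing above implies, namely that the association object is two-dimensional with row names such as "name" and "gene_name". `assoc_table` is a hypothetical helper, not part of hmdbQuery, and the toy matrix holds placeholder values rather than real HMDB content.

```r
# Hypothetical helper: turn selected rows of a row-named, matrix-like object
# into a data.frame with one row per association.
assoc_table <- function(x, fields = c("name", "gene_name")) {
  cols <- lapply(fields, function(f) as.character(unlist(x[f, ])))
  names(cols) <- fields
  as.data.frame(cols, stringsAsFactors = FALSE)
}

# Toy stand-in with placeholder values, used so the sketch runs offline
toy <- rbind(name = c("protein A", "protein B"),
             gene_name = c("GENE_A", "GENE_B"))
tab <- assoc_table(toy)
tab
```

With a real entry the call would be `assoc_table(st$protein_assoc)`. If some associations lack one of the requested fields, the columns may differ in length and the conversion will fail, so such entries would need to be handled separately.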
