Merge branch 'dev'
itsrainingdata committed Sep 12, 2017
2 parents dd522d9 + 0680d23 commit 4caea3e
Showing 18 changed files with 326 additions and 56 deletions.
17 changes: 9 additions & 8 deletions DESCRIPTION
@@ -1,26 +1,27 @@
 Package: sparsebn
 Title: Learning Sparse Bayesian Networks from High-Dimensional Data
 Version: 0.0.5
-Date: 2017-03-15
-Authors@R: person("Bryon", "Aragam", email = "sparsebn@gmail.com", role = c("aut", "cre"))
+Date: 2017-09-11
+Authors@R: c(
+person("Bryon", "Aragam", email = "sparsebn@gmail.com", role = c("aut", "cre")),
+person("Jiaying", "Gu", role = c("aut")),
+person("Dacheng", "Zhang", role = c("aut")),
+person("Qing", "Zhou", role = c("aut"))
+)
 Maintainer: Bryon Aragam <sparsebn@gmail.com>
-Description: Fast methods for learning sparse Bayesian networks from high-dimensional data using sparse regularization, as described in Aragam, Gu, and Zhou (2017) <https://arxiv.org/abs/1703.04025>. Designed to handle mixed experimental and observational data with thousands of variables with either continuous or discrete observations.
+Description: Fast methods for learning sparse Bayesian networks from high-dimensional data using sparse regularization, as described in Aragam, Gu, and Zhou (2017) <arXiv:1703.04025>. Designed to handle mixed experimental and observational data with thousands of variables with either continuous or discrete observations.
 Depends:
 R (>= 3.2.3),
 sparsebnUtils (>= 0.0.5),
 ccdrAlgorithm (>= 0.0.4),
-discretecdAlgorithm (>= 0.0.3)
+discretecdAlgorithm (>= 0.0.5)
 Suggests:
 knitr,
 rmarkdown,
 mvtnorm,
 igraph,
 graph,
 testthat
-Remotes:
-itsrainingdata/sparsebnUtils@dev,
-itsrainingdata/ccdrAlgorithm@dev,
-itsrainingdata/discretecdAlgorithm
 URL: https://github.com/itsrainingdata/sparsebn
 BugReports: https://github.com/itsrainingdata/sparsebn/issues
 License: GPL (>= 2)
9 changes: 9 additions & 0 deletions NEWS.md
@@ -1,3 +1,12 @@
+# sparsebn 0.0.5
+
+## Features
+
+* `estimate.dag` now supports white lists and black lists (#6)
+* Cytoscape compatibility now available via the method `openCytoscape` (#4)
+* `plotDAG` now includes labels for each subplot by default
+* See NEWS files for sparsebnUtils, discretecdAlgorithm, and ccdrAlgorithm for more updates
+
 # sparsebn 0.0.4

 ## Notes
56 changes: 40 additions & 16 deletions R/data.R
@@ -1,10 +1,13 @@
 #' The pathfinder network
 #'
-#' Simulated data and network for the pathfinder network from the \href{http://www.bnlearn.com/bnrepository/#pathfinder}{Bayesian network repository}.
-#' Pathfinder is an expert system developed by Heckerman et. al (1992)
+#' Simulated data and network for the pathfinder network from the
+#' \href{http://www.bnlearn.com/bnrepository/#pathfinder}{Bayesian network repository}.
+#' Pathfinder is an expert system developed by
+#' \href{http://heckerman.com/david/HN92cbr.pdf}{Heckerman et. al (1992)} [1]
 #' to assist with the diagnosis of lymph-node diseases.
 #'
-#' The data is simulated from a Gaussian SEM assuming unit edge weights and
+#' This is a benchmark network used to test algorithms for learning Bayesian
+#' networks. The data is simulated from a Gaussian SEM assuming unit edge weights and
 #' unit variances for all nodes.
 #'
 #' @format A \code{\link{list}} with four components:
@@ -22,28 +25,36 @@
 #' @usage
 #' data(pathfinder)
 #'
+#' @references
+#' [1] Heckerman, David E., and Bharat N. Nathwani. "\href{http://heckerman.com/david/HN92cbr.pdf}{An evaluation of the diagnostic accuracy of Pathfinder}." Computers and Biomedical Research 25.1 (1992): 56-74.
+#'
 #' @examples
-#' \dontrun{
-#' # Create a valid sparsebnData object from the simulated pathfinder data
+#'
+#' ### Create a valid sparsebnData object from the simulated pathfinder data
 #' data(pathfinder)
 #' dat <- sparsebnData(pathfinder$data, type = "c")
 #'
-#' # If desired, change the edge weights to be randomly generated
-#' coefs <- get.adjacency.matrix(pathfinder$dag)
-#' coefs[coefs != 0] <- runif(n = num.edges(pathfinderDAG), min = 0.5, max = 2)
-#' vars <- Matrix::Diagonal(n = num.nodes(pathfinderDAG), x = rep(1, num.nodes(pathfinderDAG)))
-#' id <- vars
-#' covMat <- t(solve(id - coefs)) %*% vars %*% solve(id - coefs)
-#' pathfinder.data <- rmvnorm(n = 1000, sigma = as.matrix(covMat))
-#' }
+#' ### Code to reproduce this dataset by randomly generating edge weights
+#' coefs <- runif(n = num.edges(pathfinder$dag), min = 0.5, max = 2) # coefficients
+#' vars <- rep(1, num.nodes(pathfinder$dag)) # variances
+#' params <- c(coefs, vars) # parameter vector
+#' pathfinder.data <- generate_mvn_data(graph = pathfinder$dag,
+#' params = params,
+#' n = 1000)
 #'
 "pathfinder"

 #' The discrete cytometry network
 #'
 #' Data and network for analyzing the flow cytometry experiment
-#' from \href{http://science.sciencemag.org/content/308/5721/523.long}{Sachs et al. (2005)}.
-#' The data is a cleaned and discretized version of the raw data (see \code{\link{cytometryContinuous}}) from these experiments.
+#' from \href{http://science.sciencemag.org/content/308/5721/523.long}{Sachs et al. (2005)} [1].
+#' The data is a cleaned and discretized version of the raw data (see
+#' \code{\link{cytometryContinuous}} for details) from these experiments.
 #'
+#' After cleaning and pre-processing, the raw continuous measurements have been
+#' binned into one of three levels: low = 0, medium = 1, or high = 2. Due to the
+#' pre-processing, the discrete data contains fewer observations (n = 5400)
+#' compared to the raw, continuous data.
+#'
 #' @format A \code{\link{list}} with three components:
 #'
@@ -57,6 +68,9 @@
 #' @usage
 #' data(cytometryDiscrete)
 #'
+#' @references
+#' [1] Sachs, Karen, et al. "\href{http://science.sciencemag.org/content/308/5721/523.long}{Causal protein-signaling networks derived from multiparameter single-cell data}." Science 308.5721 (2005): 523-529.
+#'
 #' @examples
 #' # Create a valid sparsebnData object from the cytometry data
 #' data(cytometryDiscrete)
@@ -67,9 +81,16 @@
 #' The continuous cytometry network
 #'
 #' Data and network for analyzing the flow cytometry experiment
-#' from \href{http://science.sciencemag.org/content/308/5721/523.long}{Sachs et al. (2005)}.
+#' from \href{http://science.sciencemag.org/content/308/5721/523.long}{Sachs et al. (2005)} [1].
 #' This dataset contains the raw measurements from these experiments.
 #'
+#' The dataset consists of n = 7466 observations of p = 11 continuous
+#' variables corresponding to different proteins and phospholipids in human
+#' immune system cells, and each observation indicates the measured level of
+#' each biomolecule in a single cell under different experimental interventions.
+#' Based on this data, a consensus network was reconstructed and validated, which
+#' is included as well.
+#'
 #' @format A \code{\link{list}} with three components:
 #'
 #' \itemize{
@@ -82,6 +103,9 @@
 #' @usage
 #' data(cytometryContinuous)
 #'
+#' @references
+#' [1] Sachs, Karen, et al. "\href{http://science.sciencemag.org/content/308/5721/523.long}{Causal protein-signaling networks derived from multiparameter single-cell data}." Science 308.5721 (2005): 523-529.
+#'
 #' @examples
 #' # Create a valid sparsebnData object from the cytometry data
 #' data(cytometryContinuous)
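
The example removed from the `pathfinder` documentation built the covariance matrix of a Gaussian SEM by hand before sampling from it. A minimal base-R sketch of that calculation follows. It assumes a hypothetical three-node DAG rather than the pathfinder graph, and it draws samples through a Cholesky factor instead of `rmvnorm`, so it does not reproduce the packaged dataset. The graph, seed, and variable names are illustrative and not part of the package.

```r
# Hypothetical 3-node DAG (illustration only): X1 -> X2, X2 -> X3, X1 -> X3
set.seed(1)
p <- 3
coefs <- matrix(0, p, p)   # coefs[i, j] != 0 encodes the edge i -> j
coefs[1, 2] <- 1
coefs[2, 3] <- 1
coefs[1, 3] <- 1           # unit edge weights, as in the packaged data
vars <- diag(1, p)         # unit error variances
id <- diag(1, p)

# Covariance implied by X = t(coefs) %*% X + e with Var(e) = vars,
# the same expression used in the removed roxygen example
inv <- solve(id - coefs)
cov_mat <- t(inv) %*% vars %*% inv

# Draw n Gaussian samples with this covariance via the Cholesky factor
n <- 1000
z <- matrix(rnorm(n * p), n, p)
sim_data <- z %*% chol(cov_mat)
```

With unit weights, `X3 = 2*e1 + e2 + e3` in this toy graph, so `cov_mat[3, 3]` is 6, which gives a quick sanity check on the formula.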
23 changes: 23 additions & 0 deletions R/sparsebn-main.R
@@ -29,6 +29,14 @@
 #' used based on a decreasing log-scale (see also \link[sparsebnUtils]{generate.lambdas}).
 #' @param lambdas.length Integer number of values to include in the solution path. If \code{lambdas}
 #' has also been specified, this value will be ignored.
+#' @param whitelist A two-column matrix of edges that are guaranteed to be in each
+#' estimate (a "white list"). Each row in this matrix corresponds
+#' to an edge that is to be whitelisted. These edges can be
+#' specified by node name (as a \code{character} matrix), or by
+#' index (as a \code{numeric} matrix).
+#' @param blacklist A two-column matrix of edges that are guaranteed to be absent
+#' from each estimate (a "black list"). See argument
+#' "\code{whitelist}" above for more details.
 #' @param error.tol Error tolerance for the algorithm, used to test for convergence.
 #' @param max.iters Maximum number of iterations for each internal sweep.
 #' @param edge.threshold Threshold parameter used to terminate the algorithm whenever the number of edges in the
@@ -56,6 +64,8 @@
 estimate.dag <- function(data,
 lambdas = NULL,
 lambdas.length = 20,
+whitelist = NULL,
+blacklist = NULL,
 error.tol = 1e-4,
 max.iters = NULL,
 edge.threshold = NULL,
@@ -89,11 +99,22 @@ estimate.dag <- function(data,
 ### Is the data gaussian, binomial, or multinomial? (Other data not supported yet.)
 data_family <- sparsebnUtils::pick_family(data)

+### If intervention list contains character names, convert to indices
+if("character" %in% sparsebnUtils::list_classes(data$ivn)){
+data$ivn <- lapply(data$ivn, function(x){
+idx <- match(x, names(data$data))
+if(length(idx) == 0) NULL # return NULL if no match (=> observational)
+else idx
+})
+}
+
 ### Run the main algorithms
 if(data_family == "gaussian"){
 ccdrAlgorithm::ccdr.run(data = data,
 lambdas = lambdas,
 lambdas.length = lambdas.length,
+whitelist = whitelist,
+blacklist = blacklist,
 gamma = concavity,
 error.tol = error.tol,
 max.iters = max.iters,
@@ -104,6 +125,8 @@ estimate.dag <- function(data,
 discretecdAlgorithm::cd.run(indata = data,
 lambdas = lambdas,
 lambdas.length = lambdas.length,
+whitelist = whitelist,
+blacklist = blacklist,
 error.tol = error.tol,
 convLb = convLb,
 weight.scale = weight.scale,
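
The new `whitelist` and `blacklist` arguments take a two-column matrix with one edge per row, given either by node name or by index, and the new intervention handling maps names to indices with `match`. The base-R sketch below illustrates both shapes. It is not the packages' internal code: `node_names` and the edges are hypothetical stand-ins for `names(data$data)` and a real edge list.

```r
# Hypothetical variable names standing in for names(data$data)
node_names <- c("raf", "mek", "plc", "pip2")

# White list by node name: each row is one edge (from, to)
whitelist_chr <- rbind(c("raf", "mek"),
                       c("plc", "pip2"))

# The same edges expressed by column index
whitelist_idx <- matrix(match(whitelist_chr, node_names), ncol = 2)

# Intervention list by name -> by index, mirroring the lapply/match
# pattern in the hunk above; an empty entry marks an observational row
ivn_chr <- list("raf", c("mek", "plc"), character(0))
ivn_idx <- lapply(ivn_chr, function(x) {
  idx <- match(x, node_names)
  if (length(idx) == 0) NULL else idx
})

# A call would then look like this (needs sparsebn and a sparsebnData object):
# dags <- estimate.dag(dat, whitelist = whitelist_chr)
```

In base R, `match` returns `NA` for a name it cannot find, so an unknown name yields `NA` rather than a zero-length vector. Only an entry that is already empty takes the `NULL` branch.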
1 change: 1 addition & 0 deletions R/sparsebn-plotting.R
@@ -72,6 +72,7 @@ plotDAG.sparsebnPath <- function(x, ...){
 #

 plot(x,
+labels = TRUE,
 vertex.size = 4,
 vertex.label = NA,
 vertex.label.color = gray(0),
1 change: 1 addition & 0 deletions README.Rmd
@@ -14,6 +14,7 @@ knitr::opts_chunk$set(

 # sparsebn

+[![Project Status: Active The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
 [![Travis-CI Build Status](https://travis-ci.org/itsrainingdata/sparsebn.svg?branch=master)](https://travis-ci.org/itsrainingdata/sparsebn)
 [![](http://www.r-pkg.org/badges/version/sparsebn)](http://www.r-pkg.org/pkg/sparsebn)
 [![CRAN RStudio mirror downloads](http://cranlogs.r-pkg.org/badges/sparsebn)](http://www.r-pkg.org/pkg/sparsebn)
2 changes: 1 addition & 1 deletion README.md
@@ -1,7 +1,7 @@
 sparsebn
 ========

-[![Travis-CI Build Status](https://travis-ci.org/itsrainingdata/sparsebn.svg?branch=master)](https://travis-ci.org/itsrainingdata/sparsebn) [![](http://www.r-pkg.org/badges/version/sparsebn)](http://www.r-pkg.org/pkg/sparsebn) [![CRAN RStudio mirror downloads](http://cranlogs.r-pkg.org/badges/sparsebn)](http://www.r-pkg.org/pkg/sparsebn)
+[![Project Status: Active The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active) [![Travis-CI Build Status](https://travis-ci.org/itsrainingdata/sparsebn.svg?branch=master)](https://travis-ci.org/itsrainingdata/sparsebn) [![](http://www.r-pkg.org/badges/version/sparsebn)](http://www.r-pkg.org/pkg/sparsebn) [![CRAN RStudio mirror downloads](http://cranlogs.r-pkg.org/badges/sparsebn)](http://www.r-pkg.org/pkg/sparsebn)

 Introducing `sparsebn`: A new R package for learning sparse Bayesian networks and other graphical models from high-dimensional data via sparse regularization. Designed from the ground up to handle:

23 changes: 13 additions & 10 deletions cran-comments.md
@@ -1,20 +1,23 @@
 ## Test environments
-* local OS X install, R 3.3.3
-* ubuntu 12.04.5 (travis-ci), R 3.3.3 (oldrel, devel, and release)
+* local OS X install, R 3.4.1
+* ubuntu 12.04.5 (travis-ci: oldrel, devel, and release)
 * win-builder (devel and release)
-* r-hub (oldrel, devel, and release)
+* r-hub (devel)

 ## R CMD check results
-There were no ERRORs or WARNINGs. There was one NOTE:
+There were no ERRORs, WARNINGs, or NOTEs.

-* checking CRAN incoming feasibility ... NOTE
-Maintainer: ‘Bryon Aragam <sparsebn@gmail.com>’
+## CRAN Package Check Results for Package sparsebn

-Days since last update: 4
+From https://cran.rstudio.com/web/checks/check_results_sparsebn.html

-- This is a very minor release to update the package metadata to include a reference
-to a new preprint discussing this package. No DOI is available yet since it is currently
-under review.
+Version: 0.0.4
+Check: package dependencies
+Result: NOTE
+Package suggested but not available for checking: ‘graph’
+Flavor: r-release-osx-x86_64
+
+This has been fixed.

 ## Dependencies

Binary file modified data/cytometryContinuous.rda
Binary file not shown.
Binary file modified data/cytometryDiscrete.rda
Binary file not shown.
13 changes: 12 additions & 1 deletion man/cytometryContinuous.Rd

14 changes: 12 additions & 2 deletions man/cytometryDiscrete.Rd

18 changes: 14 additions & 4 deletions man/estimate.dag.Rd

31 changes: 18 additions & 13 deletions man/pathfinder.Rd
