biome-distribution/src at master · linas/biome-distribution

History

Name		Name	Last commit message	Last commit date
parent directory ..
syntax		syntax
README.md		README.md
annotate-all.scm		annotate-all.scm
bio-edge.scm		bio-edge.scm
bio-loop.scm		bio-loop.scm
bio-pentagon.scm		bio-pentagon.scm
bio-tetra.scm		bio-tetra.scm
cleanup-data.scm		cleanup-data.scm
data-export.scm		data-export.scm
debug-wrapper.scm		debug-wrapper.scm
gene-pairs.scm		gene-pairs.scm
grid.conf		grid.conf
lmpd-genes.scm		lmpd-genes.scm
load-files.scm		load-files.scm
mem-use.sh		mem-use.sh
path.conf		path.conf
performance-instrumentation.scm		performance-instrumentation.scm
stats.scm		stats.scm

README.md

Data processing

Some of the primary files:

load-files.scm - Load genomic/proteomic datasets into the AtomSpace
cleanup-data.scm - Scrub datasets for cleaner results.
bio-loop.scm - Data-mine triangular and pentagonal interactions.
bio-tetra.scm - Data-mine tetrahedron interactions.
data-export.scm - Export selected stats to CSV files for graphing.
gene-pairs.scm - Provide a sparse-matrix API for data analysis.

The primary data-processing pipeline is to start up a guile shell, and then load each of the above files in turn. Review thier contents, and perform the "obvious" operations that each of them suggest. See also the graphs directory for assorted data-graphing scripts.

Some of the secondary files:

performance-instrumentation.scm - measure performance stats.
debug-wrapper.scm - Used for debugging the MOZI code-base.
lmpd-genes.scm - a list of 1482 genes.
mem-use.sh - log RAM usage statistics.
grid.conf and path.conf - handy-dandy cogserver configs

Typical run

For info on typical data runs, see the files:

graphs/loop-graphs/README.md
graphs/pentagon-paths/README.md
graphs/tetrahedra/README.md

Files

src

Directory actions

More options

Directory actions

More options

Latest commit

History

src

Folders and files

parent directory

Data processing

Typical run