A GLUE project for comparative genomic analysis of circular Rep-encoding single-stranded DNA (CRESS DNA) viruses

giffordlabcvr/CRESS-GLUE

CRESS-GLUE: Phylogenomic Analysis of CRESS DNA viruses

Overview

Welcome to CRESS-GLUE, a sequence-oriented resource for comparative genomic analysis of circular Rep-encoding single-stranded DNA (CRESS DNA) viruses (phylum Cressdnaviricota), developed using the GLUE software framework.

GLUE is an open, integrated software toolkit designed for storing and interpreting sequence data. It supports the creation of bespoke projects, incorporating essential data items for comparative genomic analysis, such as sequences, multiple sequence alignments, genome feature annotations, and other associated data.

Projects are loaded into the GLUE "engine," forming a relational database that represents the semantic relationships between data items. This foundation supports systematic comparative analyses and the development of sequence-based resources.
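To make the idea of a relational database of linked data items more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names, and identifiers (`sequence`, `alignment`, `alignment_member`, `REF_1`, `SEQ_A`, `AL_EXAMPLE`) are hypothetical and heavily simplified; they are assumptions for illustration and should not be read as GLUE's actual schema or as CRESS-GLUE data.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only (not GLUE's real schema).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce the declared relationships
conn.executescript("""
CREATE TABLE sequence (
    sequence_id TEXT PRIMARY KEY,
    nucleotides TEXT NOT NULL
);
CREATE TABLE alignment (
    name            TEXT PRIMARY KEY,
    ref_sequence_id TEXT NOT NULL REFERENCES sequence(sequence_id)
);
CREATE TABLE alignment_member (
    alignment_name TEXT NOT NULL REFERENCES alignment(name),
    sequence_id    TEXT NOT NULL REFERENCES sequence(sequence_id),
    PRIMARY KEY (alignment_name, sequence_id)
);
""")

# Toy records (invented placeholders).
conn.executemany(
    "INSERT INTO sequence VALUES (?, ?)",
    [("REF_1", "ATGCCGTA"), ("SEQ_A", "ATGCCGTT"), ("SEQ_B", "ATGACGTA")],
)
conn.execute("INSERT INTO alignment VALUES (?, ?)", ("AL_EXAMPLE", "REF_1"))
conn.executemany(
    "INSERT INTO alignment_member VALUES (?, ?)",
    [("AL_EXAMPLE", "SEQ_A"), ("AL_EXAMPLE", "SEQ_B")],
)

# Follow the relationships: which sequences belong to the alignment,
# and which reference sequence constrains it?
rows = conn.execute(
    """
    SELECT m.sequence_id, a.ref_sequence_id
    FROM alignment_member AS m
    JOIN alignment AS a ON a.name = m.alignment_name
    WHERE a.name = ?
    ORDER BY m.sequence_id
    """,
    ("AL_EXAMPLE",),
).fetchall()
members = [seq_id for seq_id, _ in rows]
```

Because the links between sequences and alignments are stored explicitly, a question such as "which sequences are members of this alignment?" becomes a single query rather than a scan over files.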

CRESS-GLUE contains CRESS feature definitions, alignments, and reference sequences for all CRESS virus species.

This CRESS-GLUE project can be extended with additional layers, openly available via GitHub, including:

  • CRESS-GLUE-EVE: extends CRESS-GLUE through the incorporation of CRESS DNA virus-derived endogenous viral elements (EVEs).

Key Features

  • GLUE Framework Integration: Built on the GLUE software framework, CRESS-GLUE offers an extensible platform for efficient, standardized, and reproducible computational genomic analysis of CRESS DNA viruses.

  • Phylogenetic Structure: Sequence data in CRESS-GLUE are organized in a phylogenetically structured manner, making evolutionary relationships easy to explore.

  • Rich Annotations: Annotated reference sequences enable rigorous comparative genomic analysis related to conservation, adaptation, structural context, and genotype-to-phenotype associations.

  • Reproducibility: Ensures fully reproducible analyses through data standards and a relational database.

  • Reusable Data Objects: High-value data items such as multiple sequence alignments are prepared once and reused.

  • Validation: Enforces high data integrity through cross-validation.

  • Standardisation of Genomic Coordinates: All sequences use the coordinate space of a chosen reference sequence.

  • Predefined Reference Sequences: Includes fully-annotated reference sequences for CRESS species.

  • Alignment Trees: Links alignments constructed at distinct taxonomic levels, maintaining a standardised coordinate system.
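As an illustration of what a standardised coordinate space means in practice, the sketch below maps positions in an aligned sequence onto the coordinates of a reference sequence from the same alignment. The function name `map_to_reference` and the toy alignment rows are assumptions made for this example; this is not GLUE's implementation, and GLUE performs this kind of mapping internally.

```python
def map_to_reference(ref_row: str, query_row: str) -> dict[int, int]:
    """Map 1-based query positions to 1-based reference positions.

    Both rows come from the same alignment, so they have equal length
    and use '-' for gaps. Query bases aligned to a reference gap have
    no reference coordinate and are left out of the mapping.
    """
    if len(ref_row) != len(query_row):
        raise ValueError("aligned rows must have the same length")
    mapping: dict[int, int] = {}
    ref_pos = 0
    query_pos = 0
    for ref_char, query_char in zip(ref_row, query_row):
        if ref_char != "-":
            ref_pos += 1
        if query_char != "-":
            query_pos += 1
        if ref_char != "-" and query_char != "-":
            mapping[query_pos] = ref_pos
    return mapping


# Toy alignment rows (invented for illustration, not CRESS-GLUE data).
reference = "ATG-CCGTA"
query = "A-GTCC-TA"
coords = map_to_reference(reference, query)
# coords == {1: 1, 2: 3, 4: 4, 5: 5, 6: 7, 7: 8}
```

Query position 3 has no entry because it is an insertion relative to the reference. Once every sequence is expressed in reference coordinates like this, an annotated feature on the reference can be located in any aligned sequence.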

Installation

If you have not done so already, install the GLUE software framework by following the installation instructions on the GLUE web site.

Download the CRESS-GLUE repository, navigate into the top-level directory, and start the GLUE command line interpreter.

Steps

  1. Build the Core Project:
   Mode path: /
   GLUE> run file buildCoreProject.glue

This will build the base project, which contains a minimal set of feature definitions, clade categories, reference sequences, and alignments.

Usage

GLUE includes an interactive command-line environment for bioinformaticians developing and using GLUE projects. It provides productivity-oriented features such as automatic command completion, command history, and interactive paging through tabular data.

For detailed instructions on how to use CRESS-GLUE for your comparative genomic analysis, refer to GLUE's reference documentation.

Data Sources

CRESS-GLUE relies on the following data sources:

Contributing

We welcome contributions from the community! If you're interested in contributing to CRESS-GLUE, please review our Contribution Guidelines.

License

The project is licensed under the GNU Affero General Public License v3.0.

Contact

For questions, issues, or feedback, please open an issue on the GitHub repository.
