Multithreading, data handling, and file manipulation for the GTExSnake pipeline

IMS-Bio2Core-Facility/GTExQuery
GTExQuery

MIT License | Python 3.9 | Status: Active | CI/CD | codecov | Documentation Status | Codestyle: Black | PyPI

This repository houses the code, tests, etc.,
that run the nuts and bolts of the Snakemake pipeline
[GTExSnake][GTExSnake].

GTExSnake is a fully concurrent pipeline for querying transcript-level GTEx data in specific tissues. This package provides the multithreading, data handling, and file manipulation code that the pipeline needs.

If you find the project useful, leave us a star on GitHub!

If you want to contribute, please see the guide on contributing.
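As a rough illustration of what querying many genes concurrently can look like, here is a minimal sketch using only the Python standard library. It is not this package's API: `fetch_gene` and `query_all` are hypothetical names, and `fetch_gene` is a stub standing in for a real network request.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_gene(gene):
    """Hypothetical stand-in for a per-gene request (no network access here)."""
    return {"gene": gene, "transcripts": []}


def query_all(genes, max_workers=4):
    """Run fetch_gene for every gene on a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input genes.
        return list(pool.map(fetch_gene, genes))


# Placeholder gene names, for illustration only.
results = query_all(["GENE1", "GENE2", "GENE3"])
```

Threads suit this kind of workload because each request spends most of its time waiting on I/O rather than computing.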

Motivation

There are a number of circumstances where transcript-level expression data for a specific tissue is highly valuable. For tissue-dependent expression data, there are few resources better than GTEx. In this case, the medianTranscriptExpression query provides the necessary data: it returns the median expression of each transcript of a gene in a given tissue.
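To make "median expression of each transcript in a given tissue" concrete, the sketch below computes that quantity from a handful of made-up records. It does not call GTEx and does not mirror the format of a medianTranscriptExpression response. The transcript names, tissue labels, values, and the helper `median_transcript_expression` are all assumptions chosen for illustration.

```python
from collections import defaultdict
from statistics import median

# Toy, made-up records: (transcript_id, tissue, expression in one sample).
records = [
    ("TX-A", "Liver", 2.0),
    ("TX-A", "Liver", 4.0),
    ("TX-A", "Liver", 9.0),
    ("TX-B", "Liver", 1.0),
    ("TX-B", "Liver", 3.0),
    ("TX-A", "Lung", 7.0),
]


def median_transcript_expression(records, tissue):
    """Return {transcript_id: median expression} for a single tissue."""
    per_transcript = defaultdict(list)
    for transcript_id, record_tissue, value in records:
        if record_tissue == tissue:
            per_transcript[transcript_id].append(value)
    return {tid: median(values) for tid, values in per_transcript.items()}


liver = median_transcript_expression(records, "Liver")
print(liver)
```

Each transcript is summarised by one number per tissue, which is the shape of data the pipeline works with downstream.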

As the code and tests needed to handle the multithreading and data grew, maintaining both the pipeline and the source code in a single repository became difficult. To alleviate this, we moved the source code into its own repository, which lets both the pipeline and the code adhere more easily to best practices.

Further Information

For more information about the source code, see our documentation on ReadTheDocs. To learn more about the pipeline this code supports, see the GTExSnake repository.