
Extracting the abundances for a single gene

Sam Minot edited this page Apr 4, 2022 · 1 revision

It can sometimes be difficult to load data from an HDF5 file. To make this easier, we have written a small Python script that reads the abundances measured for a single gene from a geneshot results HDF5 file and saves them as a CSV.

You can find the script here: https://gist.github.com/sminot/cebfbd84d57406b5b41b2eebffb1789f

To use it, simply save the file as geneshot_extract_gene_abund.py and then run:

geneshot_extract_gene_abund.py --details-hdf5 DETAILS.hdf5

(where DETAILS.hdf5 is the *.details.hdf5 file produced in your geneshot output folder)

The script will then prompt you for the name of the gene that you want to inspect.

The measured abundances of that gene across all specimens will be saved as a CSV file named for that gene in the current working directory.
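If you would rather do the same thing from your own Python code, the general approach can be sketched as below. This is not the script from the gist, only a minimal illustration with pandas. It assumes that the details file stores one table per specimen under a key prefix such as /abund/gene/long/ and that each table names the gene in a column called id. Both names, along with the extract_gene_abund and main helpers and the GENE.csv output name, are assumptions for illustration, so check store.keys() on your own file before relying on them.

```python
import pandas as pd

# ASSUMPTIONS (hypothetical, verify against your own file with store.keys()):
# one table per specimen under this prefix, with the gene name in column "id".
ABUND_PREFIX = "/abund/gene/long/"
GENE_COLUMN = "id"


def extract_gene_abund(store, gene_name, prefix=ABUND_PREFIX, gene_column=GENE_COLUMN):
    """Collect the rows for one gene from every per-specimen table in `store`.

    `store` may be an open pandas.HDFStore or any mapping of key -> DataFrame.
    """
    rows = []
    for key in store.keys():
        # Skip anything that is not a per-specimen abundance table
        if not key.startswith(prefix):
            continue
        table = store[key]
        hits = table.loc[table[gene_column] == gene_name]
        if hits.empty:
            continue
        # Label each row with the specimen name taken from the key
        rows.append(hits.assign(specimen=key[len(prefix):]))
    if not rows:
        return pd.DataFrame()
    return pd.concat(rows, ignore_index=True)


def main(details_hdf5, gene_name):
    """Read one gene from a details HDF5 file and write it to GENE.csv."""
    # Reading HDF5 through pandas requires the PyTables package
    with pd.HDFStore(details_hdf5, mode="r") as store:
        abund = extract_gene_abund(store, gene_name)
    abund.to_csv(f"{gene_name}.csv", index=False)
    return abund
```

Calling main("DETAILS.hdf5", "GENE_NAME") would then write GENE_NAME.csv to the current working directory, where both arguments are placeholders for your own file and gene of interest.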