Generating kmers from annotation files #19

drtamermansour · 2021-02-24T01:31:53Z

Annotation files include GFF, GTF and BED files. We can use any of these files to generate k-mers in 3 main scenarios:

If these files annotate transcriptomes: we can use gffread (for GFF or GTF files) or getfasta from bedtools (for GFF or BED files). Note: getfasta in bedtools has 2 related arguments (-split and -rna). We need to examine their effect carefully.
If the user does not want splicing to happen:
a. If we have a BED file that annotates genomic blocks: getfasta from bedtools is straightforward.
b. If we have a transcriptome annotation file but the user needs each exon as a separate entry: we need to convert the GFF or GTF to BED, then we can use getfasta from bedtools as in (a).

```shell
## gffread can convert GFF to GTF
gffread example.gff -T -o example.gtf

## UCSC_kent_commands has a binary tool to convert GTF to GenePred format
wget https://github.com/drtamermansour/horse_trans/raw/master/scripts/UCSC_kent_commands/gtfToGenePred
chmod +x gtfToGenePred
./gtfToGenePred example.gtf example.gpred

## I have a script (I do not remember where I got it) that converts GenePred to a BED file
wget https://raw.githubusercontent.com/drtamermansour/horse_trans/master/scripts/genePredToBed
chmod +x genePredToBed
cat example.gpred | ./genePredToBed > example.bed
```
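
For case (b), each exon still has to become its own BED line before getfasta will report it as a separate entry. The sketch below assumes that `example.bed` is in 12-column BED format (blockCount, blockSizes and blockStarts in columns 10 to 12); I have not verified that this is what `genePredToBed` emits. The function name `bed12_to_exons` and the `_exonN` name suffix are made up for illustration. bedtools also ships a `bed12tobed6` subcommand that may do the same job.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any existing tool):
# split every BED12 record into one BED6 line per exon (block).
# Assumes tab-separated BED12 input; reads files given as arguments or stdin.
bed12_to_exons() {
  awk 'BEGIN { FS = OFS = "\t" }
       {
         # Columns 11 and 12 are comma-separated lists (a trailing comma is allowed).
         split($11, sizes, ",")
         split($12, starts, ",")
         for (i = 1; i <= $10; i++) {
           # blockStarts are offsets relative to chromStart (column 2).
           exon_start = $2 + starts[i]
           # Exons are numbered in genomic order, not transcript order.
           print $1, exon_start, exon_start + sizes[i], $4 "_exon" i, $5, $6
         }
       }' "$@"
}
```

A possible use would be `bed12_to_exons example.bed > example.exons.bed`, followed by `bedtools getfasta -fi genome.fa -bed example.exons.bed -s -name`, where `genome.fa` is a placeholder for the reference FASTA.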

If we have a transcriptome annotation file but the user needs to generate k-mers from non-exonic structures (e.g. introns, upstream sequences, downstream sequences, exon-exon junctions): we can transform the annotation files to BED files, then create a simple script that transforms this transcriptome BED file into another BED file representing the user's target loci.
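
As one possible shape for such a script, here is a minimal sketch that derives intron intervals from a transcriptome BED file. It assumes tab-separated 12-column BED input; the function name `bed12_to_introns` and the `_intronN` name suffix are hypothetical. Upstream or downstream windows and exon-exon junctions would need their own coordinate logic, which is not covered here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any existing tool):
# emit one BED6 line per intron of each multi-exon BED12 record.
# Reads files given as arguments or stdin.
bed12_to_introns() {
  awk 'BEGIN { FS = OFS = "\t" }
       $10 > 1 {
         split($11, sizes, ",")
         split($12, starts, ",")
         for (i = 1; i < $10; i++) {
           # An intron spans from the end of block i to the start of block i+1.
           intron_start = $2 + starts[i] + sizes[i]
           intron_end   = $2 + starts[i + 1]
           # Introns are numbered in genomic order, not transcript order.
           print $1, intron_start, intron_end, $4 "_intron" i, $5, $6
         }
       }' "$@"
}
```

Single-exon records produce no output. The resulting file could then be passed to getfasta from bedtools in the same way as a BED file of genomic blocks.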

mr-eyes added the new feature label Feb 24, 2021