fixed error for teflon_collapse.py but getting an error in teflon_genotype.py #16
Comments
Hi, I am no longer getting this error, but the genotype folder is empty even though the program reports that it finished. Below is my job log:
    (teflon_env) [hyangg@n1826 TEFLON]$ python ./teflon_genotype.py \
I solved the issue. RepeatMasker annotates satellite repeats and low-complexity repeats in addition to TEs, but these were not in my TE hierarchy file. There is probably a way to include them in the workflow, but once I removed them from my TE reference BED file and re-ran the reference prep step, everything worked smoothly.
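For anyone hitting the same problem, the filtering step described above could look roughly like the sketch below. It is not part of TEFLoN. It assumes the TE annotation is a tab-delimited BED file with the repeat ID in the fourth column, and that you already have the set of IDs listed in your TE hierarchy file. The function name `filter_bed_by_hierarchy` and the `name_col` parameter are hypothetical.

```python
def filter_bed_by_hierarchy(bed_lines, known_ids, name_col=3):
    """Keep only BED records whose repeat ID appears in the TE hierarchy.

    Assumptions (not taken from TEFLoN itself): records are tab-delimited
    and the repeat ID sits in column `name_col` (0-based, default 3).
    Returns the kept lines and the set of IDs that were dropped.
    """
    kept = []
    dropped = set()
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        name = fields[name_col] if len(fields) > name_col else None
        if name in known_ids:
            kept.append(line)
        else:
            # e.g. satellite or low-complexity repeats from RepeatMasker
            dropped.add(name)
    return kept, dropped
```

Printing the dropped IDs before re-running the reference prep step is a quick way to confirm that only satellite and low-complexity entries were removed.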
Hi,
Thank you for making TEFLoN! I ran into an issue where teflon_collapse.py was not grabbing the number of raw total sequences and was instead grabbing the last element of the line that included "raw total sequences:". I resolved the issue by removing the comment at the end of that line:
    sed 's/# excluding supplementary and secondary reads//1' sample.stats.txt
Just a heads-up for those using the script and running into this error:
    Traceback (most recent call last):
      File "./teflon_collapse.py", line 165, in <module>
        main()
      File "./teflon_collapse.py", line 89, in main
        total_n=int(l.split()[-1])
    ValueError: invalid literal for int() with base 10: 'reads'
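The ValueError above comes from taking the last whitespace-separated token of the line, which is the word 'reads' when the trailing comment is present. A more tolerant way to read the count might look like the sketch below. It assumes the stats line has the form `raw total sequences: <number>` optionally followed by a `# ...` comment, and the helper name `parse_raw_total_sequences` is hypothetical, not something in teflon_collapse.py.

```python
def parse_raw_total_sequences(stats_lines):
    """Return the integer that follows 'raw total sequences:'.

    Assumes an optional trailing '# ...' comment on the same line,
    which is discarded before the number is converted.
    """
    key = "raw total sequences:"
    for line in stats_lines:
        if key in line:
            # Text after the key, with any '# comment' removed
            value = line.split(key, 1)[1].split("#", 1)[0]
            return int(value.strip())
    raise ValueError("no 'raw total sequences:' line found")
```

With this approach the stats file does not need to be edited with sed first, since the comment is ignored at parse time.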
I have now gotten to the teflon_genotype.py script and am getting the following error:
    Lower-bound coverage threshold filters corresponding to samples ['ANG.5'] is [1]
    NOTE: all sites with adjusted read counts > upper-bound coverage threshold will be marked -9
    Upper-bound coverage threshold filters corresponding to samples ['ANG.5'] is [102]
    NOTE: all sites with adjusted read counts > upper-bound coverage threshold will be marked -9
    cdm: gunzip -c /u/home/h/hyangg/project-vlsork/TEFLON/qlob.prep_TF/qlob.pseudo2ref.pickle.gz > /u/home/h/hyangg/project-vlsork/TEFLON/qlob.prep_TF/qlob.pseudo2ref.pickle.gz.tmp
    loading pickle: /u/home/h/hyangg/project-vlsork/TEFLON/qlob.prep_TF/qlob.pseudo2ref.pickle.gz.tmp
    NOTE: this step can be time and memory intensive for large reference genomes
    pickle loaded!
    Converting coordinates from pseudospace to reference-based coordinates...
    Traceback (most recent call last):
      File "./teflon_genotype.py", line 122, in <module>
        main()
      File "./teflon_genotype.py", line 116, in main
        pt.pt_portal(countDir,genoDir,samples, posMap, stats, p2rC, l_thresh, h_thresh)
      File "/u/project/vlsork/hyangg/TEFLON/teflon_scripts/genotyper_poolType.py", line 51, in pt_portal
        p2rC.pseudo2refConvert_portal(outFILE1,posMap,outFILE2)
      File "/u/project/vlsork/hyangg/TEFLON/teflon_scripts/pseudo2refConvert.py", line 26, in pseudo2refConvert_portal
        with open(bedFILE, 'r') as fIN, open(outFILE, 'w') as fOUT:
    IOError: [Errno 2] No such file or directory: '/u/home/h/hyangg/project-vlsork/TEFLON/genotypes/ANG.5.genotypes.txt'
Any suggestions on how to resolve this? It seems that it's not finding the genotypes.txt file, which means there's an error in producing it. Thanks for your help in advance!
Best,
Heidi