Issues during checkpoints 2/3 #71

Open
Maxim-Karpov opened this issue Mar 13, 2023 · 2 comments

@Maxim-Karpov

Hello,
I've encountered 2 different issues when running Finder on 2 separate genomes.

INFO: Creating SIF file...
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:350: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
coverage_info[transcript_id]["bed_cov"] = np.array( temp )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:656: RuntimeWarning: invalid value encountered in double_scalars
ratio2 = round( np.average( coverage_2nd_portion ) / np.average( coverage_3rd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:657: RuntimeWarning: invalid value encountered in double_scalars
ratio3 = round( np.average( coverage_3rd_portion ) / np.average( coverage_2nd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:657: RuntimeWarning: divide by zero encountered in double_scalars
ratio3 = round( np.average( coverage_3rd_portion ) / np.average( coverage_2nd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:656: RuntimeWarning: divide by zero encountered in double_scalars
ratio2 = round( np.average( coverage_2nd_portion ) / np.average( coverage_3rd_portion ), 2 )
Traceback (most recent call last):
File "/softwares/FINDER/Finder/finder", line 688, in <module>
main()
File "/softwares/FINDER/Finder/finder", line 649, in main
orchestrateGeneModelPrediction( options, logger_proxy, logging_mutex )
File "/softwares/FINDER/Finder/finder", line 491, in orchestrateGeneModelPrediction
fixOverlappingAndMergedTranscripts( options, logger_proxy, logging_mutex )
File "/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py", line 740, in fixOverlappingAndMergedTranscripts
exon = list( map( int, exon.split( "-" ) ) )
ValueError: invalid literal for int() with base 10: '1e+05'

I believe line 740 in fixOverlappingAndMergedTranscripts.py needs to be changed from exon = list( map( int, exon.split( "-" ) ) ) to exon = list( map( int, map( float, exon.split( "-" ) ) ) ) to fix this, because int() cannot parse a string in scientific notation such as '1e+05' directly, while float() can.
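For illustration, here is a minimal sketch of that change as a standalone function. The name parse_exon_range is hypothetical and does not exist in FINDER; it only mirrors the expression on line 740. It assumes the coordinates are always whole numbers, so going through float() loses nothing.

```python
def parse_exon_range(exon):
    """Parse a 'start-end' string into [start, end].

    Hypothetical helper, not part of FINDER. Each part goes through
    float() first so that scientific notation such as '1e+05' is
    accepted, then through int() to get an integer coordinate.
    """
    return [int(float(part)) for part in exon.split("-")]


# int("1e+05") raises ValueError, but the two-step conversion works.
print(parse_exon_range("99000-1e+05"))  # [99000, 100000]
```

This only works around the symptom. Whatever step writes the coordinate as 1e+05 in the first place would still be producing scientific notation.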

INFO: Creating SIF file...
cat: /home/Maxim/software/FINDER/output/ChrsSoftMask/alignments/SRR11184196_round3_SJ.out.tab: No such file or directory
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:350: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
coverage_info[transcript_id]["bed_cov"] = np.array( temp )
Warning: couldn't find fasta record for 'ENA_OX243811_OX243811'!
Error: no genomic sequence available (check -g option!).
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:657: RuntimeWarning: divide by zero encountered in double_scalars
ratio3 = round( np.average( coverage_3rd_portion ) / np.average( coverage_2nd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:656: RuntimeWarning: invalid value encountered in double_scalars
ratio2 = round( np.average( coverage_2nd_portion ) / np.average( coverage_3rd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:657: RuntimeWarning: invalid value encountered in double_scalars
ratio3 = round( np.average( coverage_3rd_portion ) / np.average( coverage_2nd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:656: RuntimeWarning: divide by zero encountered in double_scalars
ratio2 = round( np.average( coverage_2nd_portion ) / np.average( coverage_3rd_portion ), 2 )
/softwares/FINDER/Finder/scripts/fixOverlappingAndMergedTranscripts.py:655: RuntimeWarning: divide by zero encountered in double_scalars
ratio1 = round( np.average( coverage_2nd_portion ) / np.average( coverage_1st_portion ), 2 )
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.8/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.8/multiprocessing/pool.py", line 48, in mapstar
return list(map(*args))
File "/softwares/FINDER/Finder/scripts/removeRedundantTranscripts.py", line 22, in findSubsetTranscripts
if transcripts_fasta[transcript_i] in transcripts_fasta[transcript_j]:
KeyError: 'ENA_OX243811_OX243811.1_0_covsplit.0'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/softwares/FINDER/Finder/finder", line 688, in <module>
main()
File "/softwares/FINDER/Finder/finder", line 649, in main
orchestrateGeneModelPrediction( options, logger_proxy, logging_mutex )
File "/softwares/FINDER/Finder/finder", line 500, in orchestrateGeneModelPrediction
removeRedundantTranscripts( input_gtf_filename, output_gtf_filename, options )
File "/softwares/FINDER/Finder/scripts/removeRedundantTranscripts.py", line 85, in removeRedundantTranscripts
results = pool.map( findSubsetTranscripts, all_inputs )
File "/usr/lib/python3.8/multiprocessing/pool.py", line 364, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "/usr/lib/python3.8/multiprocessing/pool.py", line 771, in get
raise self._value
KeyError: 'ENA_OX243811_OX243811.1_0_covsplit.0'

I haven't found a possible solution for this issue. Hope you can patch these.
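One thing that may be worth checking, given the "couldn't find fasta record" warning in the log above, is whether every sequence name used in the GTF also exists as a record ID in the genome FASTA. This is a guess at where to look, not a confirmed cause of the KeyError. The sketch below is standalone; all function names are hypothetical and none of it is FINDER code. It assumes a record ID is the first whitespace-delimited token after ">" and that the sequence name is the first tab-separated GTF column.

```python
def fasta_record_ids(fasta_lines):
    """Collect record IDs: the first token after '>' on each header line."""
    ids = set()
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def gtf_seqnames(gtf_lines):
    """Collect sequence names: the first tab-separated column of each record."""
    names = set()
    for line in gtf_lines:
        if line.strip() and not line.startswith("#"):
            names.add(line.split("\t", 1)[0])
    return names


def seqnames_missing_from_fasta(fasta_lines, gtf_lines):
    """Return the GTF sequence names that have no matching FASTA record."""
    return sorted(gtf_seqnames(gtf_lines) - fasta_record_ids(fasta_lines))


# Made-up data: the FASTA ID carries a version suffix, the GTF name does not.
fasta = [">chrA.1 example description", "ACGT"]
gtf = ['chrA\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g1";']
print(seqnames_missing_from_fasta(fasta, gtf))  # ['chrA']
```

Open file handles iterate line by line, so they can be passed in place of the lists. An empty result would rule out a naming mismatch.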

On a side note, the empty PsiCLASS output and missing combined GFF files that other people have reported could be caused by paired-end reads not being split from a single reads.fastq into reads_1.fastq and reads_2.fastq with SRA Toolkit's fastq-dump.
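As a rough sketch of that splitting step, the wrapper below builds a fastq-dump call with the --split-files option, which writes the reads of each spot to separate files (normally ACCESSION_1.fastq and ACCESSION_2.fastq for paired-end data). The function names and the output directory are assumptions for illustration, and fastq-dump has to be on the PATH for the second function to work.

```python
import subprocess


def fastq_dump_split_command(accession, outdir="."):
    """Build the fastq-dump argument list for splitting paired-end reads."""
    return ["fastq-dump", "--split-files", "--outdir", outdir, accession]


def split_paired_reads(accession, outdir="."):
    """Run fastq-dump and raise CalledProcessError if it exits non-zero."""
    subprocess.run(fastq_dump_split_command(accession, outdir), check=True)


# Only prints the command; call split_paired_reads() to actually run it.
print(" ".join(fastq_dump_split_command("SRR11184196", "reads")))
```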

@sagnikbanerjee15
Owner

Hello @Maxim-Karpov,

Thank you for your interest in finder. We are currently working on developing a different version of finder that will address most of the issues that are listed here. I do not anticipate making any further changes to the old version since the entire architecture of the new software will be hugely different.

Thanks for pointing out the issue with merged reads. We will definitely look into it.

Thank you.

@sagnikbanerjee15 self-assigned this Mar 13, 2023
@sagnikbanerjee15 added the bug label Mar 13, 2023
@Maxim-Karpov
Author

Thanks for promptly getting back to me. When do you expect to release the new version of finder?
