Sporadic inconsistent errors with xtea_long #93

Open
yuliamostovoy opened this issue Oct 31, 2023 · 5 comments

Comments

@yuliamostovoy

Hi,

I'm trying to run xTea_long on the cloud (Terra) and I'm seeing weird issues where I run it on exactly the same data with the same command and configuration and sometimes it fails and other times it completes successfully. The failures seem to be related to generating the "all_ins_seqs.fa" file, although I'm finding it hard to tell exactly what's going wrong from the logs, and I'm confused as to why it's inconsistent. I tried adding a lot of RAM in case that was the issue, but the errors are still happening. I'm attaching the logs from a few failed runs and two successes on the same data. I'd like to run xTea_long on a number of samples, so I'd really appreciate any help in resolving this. Thanks!
xtea_logs.tar.gz

@simoncchu
Collaborator

simoncchu commented Oct 31, 2023

Inside the failed ones, there are earlier errors like:

multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/opt/conda/envs/xtea/lib/python3.9/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/opt/conda/envs/xtea/lib/python3.9/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/xTea/xtea_long/x_reference.py", line 50, in unwrap_gnrt_flank_regions
    return XReference.run_gnrt_flank_region_for_chrm(*arg, **kwarg)
  File "/xTea/xtea_long/x_reference.py", line 109, in run_gnrt_flank_region_for_chrm
    s_left_region = f_fa.fetch(ref_chrm, istart, pos)
  File "pysam/libcfaidx.pyx", line 301, in pysam.libcfaidx.FastaFile.fetch
KeyError: "sequence '2' not present"
"""

The code tried to extract the flanking sequences of a region on chromosome "2" and failed. However, before that step there is code that checks whether the reference names this chromosome "chr2" or "2", so I am not sure why an error is reported there.
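
For illustration only, the kind of naming check described above could look roughly like the sketch below. This is not xTea's actual code: the function name `resolve_contig_name` and the example contig list are hypothetical, and the list of available names is assumed to come from the reference index (for instance, the names pysam exposes for an opened FASTA file).

```python
def resolve_contig_name(name, available):
    """Return the spelling of `name` found in `available`, or None.

    Hypothetical helper: tries the name as given, then the same name
    with the "chr" prefix added or removed ("2" <-> "chr2").
    """
    available = set(available)
    if name in available:
        return name
    # Toggle the "chr" prefix and try again.
    alt = name[3:] if name.startswith("chr") else "chr" + name
    return alt if alt in available else None


# Example contig names, assumed for illustration.
refs = ["chr1", "chr2", "chrX"]
print(resolve_contig_name("2", refs))     # resolves to the "chr"-prefixed name
print(resolve_contig_name("chr2", refs))  # already matches
print(resolve_contig_name("GL000220.1", refs))  # no match under either spelling
```

Resolving the name once and passing the resolved spelling to every later fetch means a mismatch surfaces as an explicit `None` rather than as a `KeyError` inside a worker process.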

Would it be possible to share the workspace with me for testing?

@yuliamostovoy
Author

Thanks, yes, I can share the workspace (testing with public data). Should I use your listed gmail account?

@yuliamostovoy
Author

@simoncchu I shared a couple of workspaces - I think the second has data that you should be able to see directly without any further permission-wrangling. Let me know if there are any issues, and thanks!

@simoncchu
Collaborator

simoncchu commented Nov 13, 2023

I found the shared workspaces. I checked the failed jobs but still cannot work out why the errors were triggered. I will download one of the smaller BAM files, test locally, and get back to you.

@yuliamostovoy
Author

Thank you, I really appreciate it!
