Segmentation fault on Re-aligning chimeric reads to filter fusions #195

Open
MelanieBroutin opened this issue Jun 2, 2023 · 3 comments

@MelanieBroutin

Hello,

I get a segmentation fault when I run Arriba v2.4.0. I tried to debug it with gdb.

Below, you can find all the logs.

(gdb) run -x /data/sample.bam -o /data/sample_arriba_standard.tsv -O /data/sample_arriba_discarded_standard.tsv -a /data/GRCh37.fa -g /data/ensembl.gtf.gz -b /data/blacklist.tsv.gz -u

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[2023-06-02T11:55:46] Launching Arriba 2.4.0
[2023-06-02T11:55:46] Loading assembly from '/data/GRCh37.fa' 
[2023-06-02T11:55:54] Loading annotation from '/data/ensembl.gtf.gz' 
[2023-06-02T11:55:57] Reading chimeric alignments from '/data/sample.bam' (total=WARNING: 236 SAM records were malformed and ignored
16969)
[2023-06-02T11:55:58] Marking multi-mapping alignments (marked=4156)
[2023-06-02T11:55:58] Detecting strandedness (no)
[2023-06-02T11:55:58] Annotating alignments 
[2023-06-02T11:55:58] Filtering duplicates (remaining=16969)
[2023-06-02T11:55:58] Filtering mates which do not map to interesting contigs (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y AC_* NC_*) (remaining=16961)
[2023-06-02T11:55:58] Filtering mates which only map to viral contigs (AC_* NC_*) (remaining=16961)
[2023-06-02T11:55:58] Filtering viral contigs with expression lower than the top 5 (remaining=16961)
[2023-06-02T11:55:58] Filtering viral contigs with less than 5% coverage (remaining=16961)
[2023-06-02T11:55:58] Estimating fragment length WARNING: not enough chimeric reads to estimate mate gap distribution, using default values
[2023-06-02T11:55:58] Filtering read-through fragments with a distance <=10000bp (remaining=15815)
[2023-06-02T11:55:58] Filtering inconsistently clipped mates (remaining=15685)
[2023-06-02T11:55:58] Filtering breakpoints adjacent to homopolymers >=6nt (remaining=15670)
[2023-06-02T11:55:58] Filtering fragments with small insert size (remaining=15446)
[2023-06-02T11:55:58] Filtering alignments with long gaps (remaining=15446)
[2023-06-02T11:55:58] Filtering fragments with both mates in the same gene (remaining=15444)
[2023-06-02T11:55:58] Filtering fusions arising from hairpin structures (remaining=15110)
[2023-06-02T11:55:58] Filtering reads with a mismatch p-value <=0.01 (remaining=2292)
[2023-06-02T11:55:58] Filtering reads with low entropy (k-mer content >=60%) (remaining=1850)
[2023-06-02T11:55:58] Finding fusions and counting supporting reads (total=WARNING: some fusions were subsampled, because they have more than 300 supporting reads
1983)
[2023-06-02T11:55:58] Merging adjacent fusion breakpoints (remaining=1980)
[2023-06-02T11:55:58] Filtering multi-mapping fusions by alignment score and read support (remaining=1795)
[2023-06-02T11:55:58] Estimating expected number of fusions by random chance (e-value) 
[2023-06-02T11:55:58] Filtering fusions with both breakpoints in adjacent non-coding/intergenic regions (remaining=1749)
[2023-06-02T11:55:58] Filtering intragenic fusions with both breakpoints in exonic regions (remaining=1696)
[2023-06-02T11:55:58] Filtering fusions with <2 supporting reads (remaining=62)
[2023-06-02T11:55:58] Filtering fusions with an e-value >=0.3 (remaining=50)
[2023-06-02T11:55:58] Searching for internal tandem duplications <=100bp with >=10 supporting reads and >=7% allele fraction (remaining=51)
[2023-06-02T11:55:58] Filtering fusions with both breakpoints in intronic/intergenic regions (remaining=37)
[2023-06-02T11:55:58] Filtering in vitro-generated fusions between genes with an expression above the 99.8% quantile (remaining=15)
[2023-06-02T11:55:58] Searching for fusions with spliced split reads (remaining=15)
[2023-06-02T11:55:58] Selecting best breakpoints from genes with multiple breakpoints (remaining=13)
[2023-06-02T11:55:58] Filtering read-through fusions with breakpoints near the gene boundary (remaining=13)
[2023-06-02T11:55:58] Searching for fusions with >=4 spliced events (remaining=13)
[2023-06-02T11:55:58] Filtering blacklisted fusions in '/data/blacklist.tsv.gz' (remaining=WARNING: unknown gene or malformed range: 2:87565634-87566158     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:87565634-87566158     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:87565634-87566158     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:91679087-91679734     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:91679087-91679734     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:91679087-91679734     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:92005797-92006650     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:92005797-92006650     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:92005797-92006650     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:97712325-97712628     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:97712325-97712628     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:97712325-97712628     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:97716466-97716765     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:97716466-97716765     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:97716466-97716765     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:97725682-97726152     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:97725682-97726152     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:97725682-97726152     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:97987331-97987617     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:97987331-97987617     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:97987331-97987617     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:97996726-97997025     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:97996726-97997025     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:97996726-97997025     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:98000864-98001167     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:98000864-98001167     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:98000864-98001167     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:98018364-98018640     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:98018364-98018640     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:98018364-98018640     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:98024520-98024800     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:98024520-98024800     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:98024520-98024800     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:98037725-98038028     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:98037725-98038028     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:98037725-98038028     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:98041867-98042166     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:98041867-98042166     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:98041867-98042166     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:98051276-98051562     22	22380474-23922495
WARNING: unknown gene or malformed range: 2:98051276-98051562     2	89156674-90458671
WARNING: unknown gene or malformed range: 2:98051276-98051562     14	106053226-107288019
WARNING: unknown gene or malformed range: 2:114163973-114164486   22	22380474-23922495
WARNING: unknown gene or malformed range: 2:114163973-114164486   2	89156674-90458671
WARNING: unknown gene or malformed range: 2:114163973-114164486   14	106053226-107288019
WARNING: unknown gene or malformed range: 8:48114067-48114520     22	22380474-23922495
WARNING: unknown gene or malformed range: 8:48114067-48114520     2	89156674-90458671
WARNING: unknown gene or malformed range: 8:48114067-48114520     14	106053226-107288019
WARNING: unknown gene or malformed range: 9:69777099-69777953     22	22380474-23922495
WARNING: unknown gene or malformed range: 9:69777099-69777953     2	89156674-90458671
WARNING: unknown gene or malformed range: 9:69777099-69777953     14	106053226-107288019
WARNING: unknown gene or malformed range: 9:70394238-70395090     22	22380474-23922495
WARNING: unknown gene or malformed range: 9:70394238-70395090     2	89156674-90458671
WARNING: unknown gene or malformed range: 9:70394238-70395090     14	106053226-107288019
WARNING: unknown gene or malformed range: 10:42680787-42681239    22	22380474-23922495
WARNING: unknown gene or malformed range: 10:42680787-42681239    2	89156674-90458671
WARNING: unknown gene or malformed range: 10:42680787-42681239    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:20169919-20170354    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:20169919-20170354    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:20169919-20170354    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:20178035-20178471    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:20178035-20178471    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:20178035-20178471    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:20192909-20193370    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:20192909-20193370    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:20192909-20193370    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:20209093-20209115    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:20209093-20209115    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:20209093-20209115    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:20210050-20210068    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:20210050-20210068    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:20210050-20210068    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:20211158-20211188    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:20211158-20211188    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:20211158-20211188    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:20213655-20213685    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:20213655-20213685    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:20213655-20213685    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:20216406-20216422    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:20216406-20216422    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:20216406-20216422    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:20844768-20845194    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:20844768-20845194    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:20844768-20845194    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:21215823-21215845    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:21215823-21215845    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:21215823-21215845    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:21216780-21216798    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:21216780-21216798    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:21216780-21216798    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:21217888-21217918    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:21217888-21217918    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:21217888-21217918    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:21220377-21220407    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:21220377-21220407    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:21220377-21220407    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:21223129-21223145    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:21223129-21223145    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:21223129-21223145    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:22448382-22448819    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:22448382-22448819    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:22448382-22448819    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:22466058-22466493    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:22466058-22466493    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:22466058-22466493    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:22472918-22473353    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:22472918-22473353    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:22472918-22473353    14	106053226-107288019
WARNING: unknown gene or malformed range: 15:22482836-22483268    22	22380474-23922495
WARNING: unknown gene or malformed range: 15:22482836-22483268    2	89156674-90458671
WARNING: unknown gene or malformed range: 15:22482836-22483268    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:32046174-32046647    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:32046174-32046647    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:32046174-32046647    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:32070405-32070693    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:32070405-32070693    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:32070405-32070693    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:32077386-32077679    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:32077386-32077679    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:32077386-32077679    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:32859034-32859477    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:32859034-32859477    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:32859034-32859477    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:32914763-32915215    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:32914763-32915215    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:32914763-32915215    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:32926395-32926857    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:32926395-32926857    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:32926395-32926857    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:32989782-32990212    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:32989782-32990212    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:32989782-32990212    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:33006369-33006826    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:33006369-33006826    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:33006369-33006826    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:33013654-33013942    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:33013654-33013942    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:33013654-33013942    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:33020496-33020941    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:33020496-33020941    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:33020496-33020941    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:33605231-33605684    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:33605231-33605684    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:33605231-33605684    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:33629681-33630128    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:33629681-33630128    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:33629681-33630128    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:33661363-33661819    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:33661363-33661819    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:33661363-33661819    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:33740804-33741266    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:33740804-33741266    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:33740804-33741266    14	106053226-107288019
WARNING: unknown gene or malformed range: 16:33752443-33752887    22	22380474-23922495
WARNING: unknown gene or malformed range: 16:33752443-33752887    2	89156674-90458671
WARNING: unknown gene or malformed range: 16:33752443-33752887    14	106053226-107288019
WARNING: unknown gene or malformed range: 18:3394887-3395310      22	22380474-23922495
WARNING: unknown gene or malformed range: 18:3394887-3395310      2	89156674-90458671
WARNING: unknown gene or malformed range: 18:3394887-3395310      14	106053226-107288019
WARNING: unknown gene or malformed range: 21:10862622-10863067    22	22380474-23922495
WARNING: unknown gene or malformed range: 21:10862622-10863067    2	89156674-90458671
WARNING: unknown gene or malformed range: 21:10862622-10863067    14	106053226-107288019
WARNING: unknown gene or malformed range: 22:17385109-17385586    22	22380474-23922495
WARNING: unknown gene or malformed range: 22:17385109-17385586    2	89156674-90458671
WARNING: unknown gene or malformed range: 22:17385109-17385586    14	106053226-107288019
WARNING: unknown gene or malformed range: 22:17395125-17395871    22	22380474-23922495
WARNING: unknown gene or malformed range: 22:17395125-17395871    2	89156674-90458671
WARNING: unknown gene or malformed range: 22:17395125-17395871    14	106053226-107288019
WARNING: unknown gene or malformed range: 22:17402333-17403063    22	22380474-23922495
WARNING: unknown gene or malformed range: 22:17402333-17403063    2	89156674-90458671
WARNING: unknown gene or malformed range: 22:17402333-17403063    14	106053226-107288019
WARNING: unknown gene or malformed range: 22:17406842-17407334    22	22380474-23922495
WARNING: unknown gene or malformed range: 22:17406842-17407334    2	89156674-90458671
WARNING: unknown gene or malformed range: 22:17406842-17407334    14	106053226-107288019
WARNING: unknown gene or malformed range: 22:17414411-17414893    22	22380474-23922495
WARNING: unknown gene or malformed range: 22:17414411-17414893    2	89156674-90458671
WARNING: unknown gene or malformed range: 22:17414411-17414893    14	106053226-107288019
WARNING: unknown gene or malformed range: 22:25833273-25833790    22	22380474-23922495
WARNING: unknown gene or malformed range: 22:25833273-25833790    2	89156674-90458671
WARNING: unknown gene or malformed range: 22:25833273-25833790    14	106053226-107288019
WARNING: unknown gene or malformed range: 22:32595906-32596221    22	22380474-23922495
WARNING: unknown gene or malformed range: 22:32595906-32596221    2	89156674-90458671
WARNING: unknown gene or malformed range: 22:32595906-32596221    14	106053226-107288019
WARNING: unknown gene or malformed range: 22:32752663-32752975    22	22380474-23922495
WARNING: unknown gene or malformed range: 22:32752663-32752975    2	89156674-90458671
WARNING: unknown gene or malformed range: 22:32752663-32752975    14	106053226-107288019
WARNING: unknown gene or malformed range: 22:32772669-32773122    22	22380474-23922495
WARNING: unknown gene or malformed range: 22:32772669-32773122    2	89156674-90458671
WARNING: unknown gene or malformed range: 22:32772669-32773122    14	106053226-107288019
8)
[2023-06-02T11:56:01] Filtering fusions with anchors <=23nt (remaining=6)
[2023-06-02T11:56:01] Filtering end-to-end fusions with low support (remaining=5)
[2023-06-02T11:56:01] Filtering fusions with no coverage around the breakpoints (remaining=2)
[2023-06-02T11:56:01] Indexing gene sequences 
[2023-06-02T11:56:01] Filtering genes with >=30% identity (remaining=2)
[2023-06-02T11:56:01] Re-aligning chimeric reads to filter fusions with >=80% mis-mappers 
Program received signal SIGSEGV, Segmentation fault.
std::_Hashtable<unsigned int, std::pair<unsigned int const, std::vector<int, std::allocator<int> > >, std::allocator<std::pair<unsigned int const, std::vector<int, std::allocator<int> > > >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node (__code=55405, __k=<optimized out>, __bkt=55405, this=0x55557a64a2b8) at /usr/include/c++/11/bits/hashtable.h:1827
1827	    _Hashtable<_Key, _Value, _Alloc, _ExtractKey, _Equal,
(gdb) bt 
#0  std::_Hashtable<unsigned int, std::pair<unsigned int const, std::vector<int, std::allocator<int> > >, std::allocator<std::pair<unsigned int const, std::vector<int, std::allocator<int> > > >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_before_node (__code=55405, __k=<optimized out>, __bkt=55405, this=0x55557a64a2b8) at /usr/include/c++/11/bits/hashtable.h:1827
#1  std::_Hashtable<unsigned int, std::pair<unsigned int const, std::vector<int, std::allocator<int> > >, std::allocator<std::pair<unsigned int const, std::vector<int, std::allocator<int> > > >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::_M_find_node (__c=55405, __key=<optimized out>, __bkt=55405, this=0x55557a64a2b8) at /usr/include/c++/11/bits/hashtable.h:810
#2  std::_Hashtable<unsigned int, std::pair<unsigned int const, std::vector<int, std::allocator<int> > >, std::allocator<std::pair<unsigned int const, std::vector<int, std::allocator<int> > > >, std::__detail::_Select1st, std::equal_to<unsigned int>, std::hash<unsigned int>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<false, false, true> >::find (__k=<optimized out>, this=0x55557a64a2b8) at /usr/include/c++/11/bits/hashtable.h:1610
#3  std::unordered_map<unsigned int, std::vector<int, std::allocator<int> >, std::hash<unsigned int>, std::equal_to<unsigned int>, std::allocator<std::pair<unsigned int const, std::vector<int, std::allocator<int> > > > >::find (__x=<optimized out>, this=0x55557a64a2b8) at /usr/include/c++/11/bits/unordered_map.h:880
#4  align (score=score@entry=0, read_sequence="AGCTGCAGCAGCTGCCCAGGCAGCAGCCGTGGCAGGAAACATCCCTGGCCCAGGATCAGTAGGTGGAATAGCTCCAGCTATCA", read_pos=read_pos@entry=0, contig_sequence=..., 
    gene_pos=gene_pos@entry=48494088, gene_start=gene_start@entry=48494088, gene_end=48584813, kmer_index=std::unordered_map with 93825612994296 elements = {...}, kmer_length=8 '\b', 
    splice_sites=std::set with 9 elements = {...}, min_score=66, max_deletions=1) at source/filter_mismappers.cpp:96
#5  0x00005555555c64aa in align_both_strands (read_sequence="AGCTGCAGCAGCTGCCCAGGCAGCAGCCGTGGCAGGAAACATCCCTGGCCCAGGATCAGTAGGTGGAATAGCTCCAGCTATCA", read_length=100, max_mate_gap=200, 
    breakpoints_on_same_contig=true, alignment_start=48603063, alignment_end=48603145, kmer_indices=std::vector of length 15, capacity 15 = {...}, assembly=std::unordered_map with 24 elements = {...}, 
    exon_annotation_index=..., splice_sites_by_gene=std::unordered_map with 1 element = {...}, genes=..., kmer_length=8 '\b', min_align_fraction=<optimized out>) at source/filter_mismappers.cpp:210
#6  0x00005555555c70b7 in filter_mismappers (fusions=std::unordered_map with 16920 elements = {...}, kmer_indices=std::vector of length 15, capacity 15 = {...}, kmer_length=kmer_length@entry=8 '\b', 
    assembly=std::unordered_map with 24 elements = {...}, exon_annotation_index=..., max_mismapper_fraction=<optimized out>, max_mate_gap=max_mate_gap@entry=200) at source/filter_mismappers.cpp:292
#7  0x0000555555569e17 in main (argc=<optimized out>, argv=<optimized out>) at source/arriba.cpp:564

Do you know why I get this error?

Thank you

@MelanieBroutin MelanieBroutin changed the title Segmentation fault on Re-aligning chimeric reads to filter fusions with >=80% mis-mappers Segmentation fault on Re-aligning chimeric reads to filter fusions Jun 2, 2023
@suhrig
Owner

suhrig commented Jun 2, 2023

Thanks for taking the time to create a stack trace. That's very helpful.

A few things about Arriba's output look odd:

  • The reference files have non-standard names (GRCh37.fa, ensembl.gtf). Also, Arriba complains about malformed blacklist entries. While it is possible to use your own reference files, you may have accidentally mixed up files or generated corrupted ones. Perhaps the BAM file and reference files have incompatible coordinate systems (GRCh37 vs. GRCh38). Please double-check that your files are compatible or try the reference files created by the script download_references.sh.
  • The step triggering the segfault only takes two fusion candidates as input. Given this small number, my guess is that the segfault is triggered on any input. It's not like the step has to process thousands of fusions, one of which triggers a rare bug that went unnoticed in my tests. So something is generally broken with your installation I guess. Have you been able to process any sample successfully?
  • The segfault is triggered in a piece of code that is external to Arriba. Again, this indicates that your installation may be broken. Have you compiled Arriba yourself? Maybe something went wrong during compilation. You might want to try the precompiled binary or Docker/Singularity on the problematic sample. See if that works ...
  • The sample has very few reads. What kind of sample is this? This shouldn't be a problem, but would help me understand the situation.
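
One way to check the first point is to compare contig lengths between the BAM header and the assembly FASTA. The following is only a minimal sketch and not part of Arriba: the function names are made up for illustration, and it assumes the header text comes from `samtools view -H` and the FASTA is read as plain text lines. Contigs that appear in both inputs with different lengths point to a mismatch between genome builds.

```python
def sam_header_lengths(header_lines):
    """Map contig name -> length from the @SQ lines of a SAM header."""
    lengths = {}
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        fields = line.rstrip("\n").split("\t")[1:]
        tags = dict(field.split(":", 1) for field in fields if ":" in field)
        if "SN" in tags and "LN" in tags:
            lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def fasta_lengths(lines):
    """Map contig name -> sequence length from FASTA text lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # The contig name is the header text up to the first whitespace.
            fields = line[1:].split()
            name = fields[0] if fields else None
            if name is not None:
                lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def mismatched_contigs(bam_lengths, ref_lengths):
    """Return contigs present in both inputs whose lengths differ."""
    return {
        name: (bam_lengths[name], ref_lengths[name])
        for name in bam_lengths.keys() & ref_lengths.keys()
        if bam_lengths[name] != ref_lengths[name]
    }


# Toy data for illustration only; real input would be the output of
# `samtools view -H sample.bam` and the lines of the assembly FASTA.
header = ["@HD\tVN:1.6\n", "@SQ\tSN:1\tLN:8\n", "@SQ\tSN:2\tLN:4\n"]
fasta = [">1 toy contig\n", "ACGT\n", "ACGT\n", ">2\n", "ACGTA\n"]
print(mismatched_contigs(sam_header_lengths(header), fasta_lengths(fasta)))
# {'2': (4, 5)}
```

An empty result only means that the shared contigs have equal lengths. It does not rule out a corrupted annotation or blacklist file.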

@MelanieBroutin
Author

MelanieBroutin commented Jun 5, 2023

Thanks for this quick reply.
I use Arriba routinely, so I have already processed many samples with this version without any issue. I used the same reference files as always.
I use the Arriba binary provided in arriba.tar.gz at https://github.com/suhrig/arriba/releases/tag/v2.4.0. I compiled Arriba myself only to debug it with gdb.
The sample is Illumina RNA-seq with UMIs. I cannot share the sample with you because it is patient data.

@suhrig
Owner

suhrig commented Mar 9, 2024

Hi Melanie,

This completely escaped my attention, sorry. Are you still interested in getting to the bottom of this? Since you cannot share the data, I would need your cooperation, meaning possibly a few rounds of running a modified Arriba with debug logging, where each iteration gets us closer to the root cause. Alternatively, if you can reproduce the error on a minimal dataset, for example on just the reads around the breakpoint which triggers the error, this should be harmless to share, since there is little likelihood that identifiable information would be shared. I could send you commands to extract the reads from around the fusion breakpoints if you like. Let me know which option you prefer.

Best regards,
Sebastian
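
For illustration, a minimal dataset of this kind could be built by turning breakpoint coordinates into region strings for `samtools view`. This is a hedged sketch and not the commands offered above: the helper names, the 1000 bp padding, and the file names are assumptions, and the breakpoints in the example are placeholders. It also assumes the BAM file is coordinate-sorted and indexed, which region queries in `samtools view` require.

```python
def breakpoint_region(breakpoint, padding=1000):
    """Turn 'contig:position' into a 'contig:start-end' region string."""
    contig, position = breakpoint.rsplit(":", 1)
    pos = int(position)
    start = max(1, pos - padding)  # region coordinates are 1-based
    return f"{contig}:{start}-{pos + padding}"


def extraction_command(bam_path, breakpoints, out_path, padding=1000):
    """Build a samtools command that writes reads overlapping the regions."""
    regions = [breakpoint_region(b, padding) for b in breakpoints]
    return ["samtools", "view", "-b", "-o", out_path, bam_path, *regions]


# Placeholder breakpoints; real values would come from the failing sample.
command = extraction_command("sample.bam", ["1:1500", "X:200"], "subset.bam")
print(" ".join(command))
# samtools view -b -o subset.bam sample.bam 1:500-2500 X:1-1200
```

Only alignments overlapping the listed regions are kept, so mates or supplementary alignments that map elsewhere would be missing unless their regions are listed too.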
