Releases: suhrig/arriba

Arriba v2.4.0

08 Feb 10:48
  • new utility script to annotate exon numbers
  • compatibility with Illumina's Dragen aligner (see notes in manual about supported aligners)
  • retained fraction of protein domain was often overestimated as 100%
  • better agreement of transcripts between Arriba's output file and the visualizations produced by draw_fusions.R by making --transcriptSelection=provided the default
  • better matching of structural variant breakpoints to fusion breakpoints when parameter -d is used
  • VCF files generated by scripts/convert_fusions_to_vcf.sh are now compatible with bcftools
  • mildly improved filtering

Arriba v2.3.0

29 May 10:42
  • blacklist PhiX genome, since it is often used as spike-in control
  • stricter filtering of read-through fusions
  • fix broken compilation due to outdated zlib URL (thanks to @iainrb)
  • updated protein domain annotation files (GFF3), now with 7-15% more annotation records
  • updated reference files in download_references.sh to match protein domain annotation version
  • download_references.sh did not properly harmonize chromosome names between assembly (FastA) and annotation (GTF) when an assembly with chr prefix was used (hg19/38, mm10/39), which had minor implications on alignment and fusion calling
  • coverage plots can be scaled separately and/or to a user-defined cutoff (--coverageRange=...)
  • scripts are now compatible with macOS (a recent version of bash must be installed, though; the preinstalled version 3.2 is too old)
  • minor fixes for reading frame prediction when breakpoint is close to first/last exon
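
The chromosome-name mismatch mentioned above (assembly with chr prefix, annotation without, or vice versa) can be illustrated with a minimal Python sketch. This is not the logic used by download_references.sh; the function names (has_chr_prefix, harmonize, harmonize_gtf_line) are hypothetical, and the sketch assumes that only primary chromosomes (1-22 and beyond, X, Y, MT/chrM) need renaming while all other contigs are left untouched.

```python
import re

# Assumption: only primary chromosomes are renamed; unplaced or ALT contigs keep their names.
_PLAIN = re.compile(r"^(\d+|X|Y|MT)$")
_PREFIXED = re.compile(r"^chr(\d+|X|Y|M)$")


def has_chr_prefix(names):
    """Return True if more than half of the contig names start with 'chr'."""
    names = list(names)
    return sum(n.startswith("chr") for n in names) > len(names) / 2


def harmonize(name, want_chr):
    """Convert one contig name to the requested naming style."""
    if want_chr and _PLAIN.match(name):
        # The mitochondrial genome is 'MT' without prefix and 'chrM' with prefix.
        return "chrM" if name == "MT" else "chr" + name
    if not want_chr and _PREFIXED.match(name):
        return "MT" if name == "chrM" else name[3:]
    return name


def harmonize_gtf_line(line, want_chr):
    """Rewrite the first (contig) column of a GTF record; comment lines pass through."""
    if line.startswith("#"):
        return line
    fields = line.split("\t")
    fields[0] = harmonize(fields[0], want_chr)
    return "\t".join(fields)
```

In use, one would call has_chr_prefix on the contig names of the FastA file and pass the result as want_chr when rewriting each GTF line, so that both files end up with the same naming style.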

Arriba v2.2.1

22 Jan 15:16
  • reverted a change introduced in v2.2.0: download_references.sh now uses the ENSEMBL GRCh38 assembly (FastA) again instead of the ICGC-ARGO assembly, because the latter contains ALT contigs, which the STAR user manual does not recommend for alignment with STAR; moreover, due to a scripting error, the GRCh38 assembly generated by download_references.sh contained malformed data at the end of the file, which is now fixed as well

Arriba v2.2.0

16 Jan 13:03
  • improved detection of internal tandem duplications
  • better sensitivity for the detection of viral integration sites
  • inclusion of additional ~4500 viruses into screening, including rare strains of cancer-associated viruses (requires rebuild of STAR index)
  • viral contigs were renamed to be compliant with the SAM format specification (requires rebuild of STAR index)
  • support for mm39/GRCm39
  • utility scripts (see also manual):
    • quantify virus expression
    • convert Arriba's custom output format to VCF
    • extract fusion-supporting alignments into separate mini-BAM
    • running Arriba on a prealigned BAM file and realigning only the fusion candidate reads saves ~80% of the CPU time compared to a complete realignment (useful when the alignments were generated by an old STAR version or by a different aligner such as HISAT2)
  • polishing of fusion visualizations created by draw_fusions.R and new features:
    • all transcripts can be drawn at the same scale if desired (--fixedScale)
    • circos plots have same size across all pages
    • set PDF title and print as header on every page (--sampleName)
    • fine-grained control over region to draw for intergenic breakpoints (--showIntergenicVicinity)
    • choose a different font (--fontFamily)
    • better scaling for coverage track
  • more fixes for prediction of reading frame
  • better warnings and error messages
  • updated STAR to version 2.7.10a, which fixes malformed chimeric alignments for paired-end reads with small insert size
  • updated dependencies (HTSlib, libdeflate)

Arriba v2.1.0

24 Jan 15:31
  • Arriba can now be cited
  • arcs in circos plot are colored by type of rearrangement
  • internal tandem duplications are flagged with the keyword ITD in Arriba's output file
  • more effective filtering of internal tandem duplications that are germline polymorphisms
  • draw_fusions.R loads reference files faster
  • under some rare conditions, the reading frame was erroneously predicted as out-of-frame

Arriba v2.0.0

11 Oct 13:54
  • report viral integration sites
  • report fusions supported by multi-mapping reads (e.g., CIC-DUX4, NPM1-ALK)
  • report internal tandem duplications (e.g., FLT3, BCOR, ERBB2, NOTCH1)
  • improved detection of IG/TCR rearrangements
  • known fusions file based on the Mitelman database is now part of the download
  • more comprehensive annotation (gene IDs, transcript IDs, user-defined tags, retained protein domains)
  • support for mouse (mm10)
  • (optionally) report the full transcript/peptide sequence (parameter -I) rather than only what can be assembled from the supporting reads
  • structural variants can be supplied in VCF format (parameter -d)
  • macOS support
  • faster loading of BAM files thanks to HAT-trie map as well as other speed improvements
  • draw_fusions.R accepts the format of STAR-Fusion
  • ability to make use of external duplicate marking, e.g., for UMIs (parameter -u)
  • enhanced blacklist
  • simplified code compilation procedure
  • support assemblies with up to 65,000 contigs (previously 32,000)

Important compatibility notes when upgrading from version 1.x:

  • STAR version >= 2.7.6a is required to make use of multi-mapping chimeric reads
  • new columns were added to the output files and some were rearranged
  • the parameter -P is obsolete; the parameters -I and -T have been repurposed
  • parsing of input TSV files (GTF, known fusions, blacklist, structural variants) is now stricter
  • the order of the genes in the known fusions file (parameter -k) is now important
  • the reading_frame column may contain the new value stop-codon
  • the site1/2 columns may contain new values
  • the parameters of the run_arriba.sh script have changed
  • the download_references.sh script is now parameterized using environment variables
  • the chr prefix is no longer removed from the output files
  • the alignment parameters of run_arriba.sh are set to report up to 50 multi-mapping reads
  • some filters were removed/renamed, which is relevant if the parameter -f is used
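
The STAR version requirement in the notes above (>= 2.7.6a for multi-mapping chimeric reads) can be checked programmatically. The following Python sketch is not part of Arriba; the helper names (parse_star_version, star_at_least) are hypothetical, and it assumes the version string looks like 2.7.10a, optionally preceded by STAR_. Other formats raise an error rather than being guessed at.

```python
import re

# Assumption: versions look like '2.7.10a' (three numbers plus an optional letter),
# optionally preceded by 'STAR_'.
_VERSION = re.compile(r"^(?:STAR_)?(\d+)\.(\d+)\.(\d+)([a-z]?)$")


def parse_star_version(text):
    """Turn a version string such as '2.7.10a' into a comparable tuple."""
    match = _VERSION.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized STAR version string: {text!r}")
    major, minor, patch, letter = match.groups()
    # Numeric parts are compared as integers so that 2.7.10a sorts after 2.7.6a.
    return (int(major), int(minor), int(patch), letter)


def star_at_least(installed, required="2.7.6a"):
    """Return True if the installed version is the required version or newer."""
    return parse_star_version(installed) >= parse_star_version(required)
```

Comparing the parts numerically matters here: a plain string comparison would rank 2.7.10a below 2.7.6a.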

Arriba v1.2.0

04 Jan 13:28
  • better filtering of in vitro-generated artifacts
  • known_fusions filter is more sensitive
  • update dependencies (HTSlib, compression libs)
  • example data
  • under some (rare) conditions, reading frame was incorrect
  • documentation provides tips on how to interpret fusion predictions
  • better error messages for common cases of incorrect usage

Arriba v1.1.0

25 Mar 15:35
  • speed improvements (BAM file loading, low_entropy filter, homologs filter, GTF parsing)
  • prebuilt Docker image available at Docker Hub
  • installation via bioconda
  • improved confidence scoring
  • better detection of intragenic rearrangements
  • new blacklist
  • fix some non-deterministic behavior
  • more reliable auto-detection of strandedness
  • protein domains were drawn in incorrect order for genes on the reverse strand
  • handle empty input files more reasonably

Arriba v1.0.1

23 Oct 22:41
  • fix bugs in parsing of command-line arguments
  • do not compress intermediate (unsorted) BAM file to avoid performance bottleneck

Arriba v1.0.0

14 Oct 11:20