Releases: MultiQC/MultiQC
MultiQC Version 1.13
MultiQC updates
- Major spruce-up of the command line help, using the new rich-click package
- Drop some of the Python 2k compatibility code (eg. custom requirements)
- Improvements for running MultiQC in a Python environment, such as a Jupyter Notebook or script
- Fixed bug raised when removing logging file handlers between calls that arose when configuring the root logger with dictConfig (#1643)
- Added new config option custom_table_header_config to override any config for any table header
- Fixed edge-case bug in custom content where a description that doesn't terminate in . gave duplicate section descriptions.
- Tidied the verbose log to remove some very noisy statements and add summaries for skipped files in the search
- Add timezone to time in reports
- Add nix flake support
- Added automatic tweet about new releases
Module updates
- AdapterRemoval
- Finally merge a fix for counts of reads that are discarded/collapsed (#1647)
- VEP
- Fixed bug when General Statistics have a value of - (#1656)
- Custom content
- Nanostat
- Pangolin
- Updated module to handle outputs from Pangolin v4 (#1660)
- Somalier
- Handle zero mean X depth in Sex plot (#1670)
- Fastp
- Include low complexity and too long reads in filtering bar chart
- miRTop
- FastQC
- Fixed error when parsing duplicate ratio when there are nan values in the report. (#1725)
MultiQC Version 1.12
Version 1.12 of MultiQC brings with it a modest number of new modules, a few new core features and a swathe of bugfixes and general improvements. I hope that everyone continues to find it useful! You can see the full changes in this release here: v1.11...v1.12
Special thanks to the 13 people who had their first MultiQC contributions in this release:
New Contributors
- @HofLucien made their first contribution in #1486
- @jchorl made their first contribution in #1578
- @bjohnnyd made their first contribution in #1489
- @schorlton made their first contribution in #1567
- @MatthiasZepper made their first contribution in #1584
- @g-pacheco made their first contribution in #1587
- @paulstretenowich made their first contribution in #1605
- @yanick made their first contribution in #1595
- @MillironX made their first contribution in #1594
- @sguizard made their first contribution in #1593
- @maleasy made their first contribution in #1552
- @TomaszSuchan made their first contribution in #1271
- @massiddamt made their first contribution in #1021
✨ MultiQC - new features
- Added option to customise default plot height in plot config (#1432)
- Added --no-report flag to skip report generation (#1462)
- Added support for printing tool DOI in report sections (#1177)
- Added support for --custom-css-file / config.custom_css_files option to include custom CSS in the final report (#1573)
- New plot config option labelSize to customise font size for axis labels in flat MatPlotLib charts (#1576)
- Added support for customising table column names (#1255)
🔨 MultiQC - updates
- MultiQC now skips modules for which no files were found - gives a small performance boost (#1463)
- Improvements for running MultiQC in a Python environment, such as a Jupyter Notebook or script
- Added commonly missing functions to several modules (#1468)
- Wrote new script to check for the above function calls that should be in every module (.github/workflows/code_checks.py), runs on GitHub actions CI
- Make table Conditional Formatting work at table level as well as column level. (#761)
- CSS Improvements to make printed reports more attractive / readable (#1579)
- Fixed a problem with numeric filenames (#1606)
- Fixed nasty bug where line charts with a categorical x-axis would take categories from the last sample only (#1568)
- Ignore any files called multiqc_data.json (#1598)
- Check that the config path_filters is a list, convert to list if a string is supplied (#1539)
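The path_filters check above can be pictured with a minimal Python sketch. The function name normalise_path_filters is hypothetical and this is not the MultiQC implementation, only an illustration of wrapping a lone string in a list so that later code can always iterate over patterns.

```python
def normalise_path_filters(value):
    """Return path_filters as a list (hypothetical helper, not MultiQC code).

    A single string is wrapped in a list; None becomes an empty list;
    any other iterable is copied into a new list.
    """
    if value is None:
        return []
    if isinstance(value, str):
        # Without this branch, iterating over the string would
        # yield single characters instead of one pattern.
        return [value]
    return list(value)


print(normalise_path_filters("*_R1*"))
print(normalise_path_filters(["*_R1*", "*_R2*"]))
```

Checking for str first matters because a string is itself iterable, so calling list() on it directly would split the pattern into characters.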
🎁 New Modules
- CheckQC
- A program designed to check a set of quality criteria against an Illumina runfolder
- pbmarkdup
- Mark duplicate reads from PacBio sequencing of an amplified library
- WhatsHap
- WhatsHap is a software for phasing genomic variants using DNA sequencing reads
🌟 Module feature additions
- BBMap
- Added handling for qchist output (#1021)
- bcftools
- Added a plot with samplewise number of sites, Ts/Tv, number of singletons and sequencing depth (#1087)
- Mosdepth
- Added mean coverage (#1566)
- NanoStat
- Recognize FASTA and FastQ report flavors (#1547)
🐛 Module updates
- BBMap
- Correctly handle adapter stats files with additional columns (#1556)
- bclconvert
- Handle change in output format in v3.9.3 with new Quality_Metrics.csv file (#1563)
- bowtie
- Minor update to handle new log wording in bowtie v1.3.0 (#1615)
- CCS
- Custom content
- DRAGEN
- Fixed bug in sample name regular expression (#1537)
- Fastp
- Fixed % pass filter statistics (#1574)
- FastQC
- goleft/indexcov
- Fix ZeroDivisionError if no bins are found (#1586)
- HiCPro
- Better handling of errors when expected data keys are not found (#1366)
- Lima
- Move samples that have been renamed using --replace-names into the General Statistics table (#1483)
- miRTrace
- Replace hardcoded RGB colours with Hex to avoid errors with newer versions of matplotlib (#1263)
- Mosdepth
- Fixed issue #1568
- Fixed a bug when reporting per contig coverage
- Picard
- Update ExtractIlluminaBarcodes to recognise more log patterns in newer versions of Picard (#1611)
- Qualimap
- Fix ZeroDivisionError in QM_RNASeq and skip genomic origins plot if no aligned reads are found (#1492)
- QUAST
- Clarify general statistics table header for length
- RSeQC
- Sambamba
- Fixed issue with a change in the format of output from sambamba markdup 0.8.1 (#1617)
- Skewer
- Fix ZeroDivisionError if no available reads are found (#1622)
- Somalier
- Plot scaled X depth instead of mean for Sex plot (#1546)
- VEP
- Handle table cells containing - instead of numbers (#1597)
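The miRTrace entry above mentions replacing hardcoded RGB colours with Hex. The release notes do not show the change itself, so the following is only a small Python sketch of the conversion, with a hypothetical helper name rgb_to_hex.

```python
def rgb_to_hex(red, green, blue):
    """Convert 0-255 integer RGB channels to a '#rrggbb' string.

    Hypothetical helper for illustration; not taken from the module.
    """
    for channel in (red, green, blue):
        if not 0 <= channel <= 255:
            raise ValueError("channel out of range: {}".format(channel))
    # Each channel is written as two lowercase hexadecimal digits.
    return "#{:02x}{:02x}{:02x}".format(red, green, blue)


print(rgb_to_hex(255, 0, 128))
```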
MultiQC Version 1.11
A summer release for MultiQC 🥳 🏖️ Many thanks to everyone who has contributed!
MultiQC new features
- New interactive slider controls for controlling heatmap colour scales (#1427)
- Added new --replace-names / config sample_names_replace option to replace sample names during report generation
- Added use_filename_as_sample_name config option / --fn_as_s_name command line flag (#949, #890, #864, #998, #1390)
- Forces modules to use the log filename for the sample identifier, even if the module usually takes this from the file contents
- Required a change to the clean_s_name() function arguments. All core MultiQC modules updated to reflect this.
- Should be backwards compatible for custom modules. To adopt new behaviour, supply f instead of f["root"] as the second argument.
- See the documentation for details: Using log filenames as sample names and Custom sample names.
MultiQC updates
- Make the module crash tracebacks much prettier using rich
- Refine the cli log output a little (nicely formatted header line + drop the [INFO])
- Added docs describing tools for downstream analysis of MultiQC outputs.
- Added CI tests for Python 3.9, pinned networkx package to >=2.5.1 (#1413)
- Added patterns to config.fn_ignore_paths to avoid error with parsing installation dir / singularity cache (#1416)
- Print a log message when flat-image plots are used due to sample size surpassing plots_flat_numseries config (#1254)
- Fix the mqc_colours util function to lighten colours even when passing categorical or single-length lists.
- Bugfix for Custom Content, using YAML configuration (eg. section headers) for images should now work
New Modules
- BclConvert
- Tool that converts / demultiplexes Illumina Binary Base Call (BCL) files to FASTQ files
- Bustools
- Tools for working with BUS files
- ccs
- Generate highly accurate single-molecule consensus reads from PacBio data
- GffCompare
- GffCompare can annotate and estimate accuracy of one or more GFF files compared with a reference annotation.
- Lima
- The PacBio Barcode Demultiplexer
- NanoStat
- Calculate various statistics from a long read sequencing dataset
- ODGI
- Optimized dynamic genome/graph implementation
- Pangolin
- Added MultiQC support for Pangolin, the tool that determines SARS-CoV-2 lineages
- Sambamba Markdup
- Added MultiQC module to add duplicate rate calculated by Sambamba Markdup.
- Snippy
- Rapid haploid variant calling and core genome alignment.
- VEP
- Added MultiQC module to add summary statistics of Ensembl VEP annotations.
- Handle error from missing variants in VEP stats file. (#1446)
Module feature additions
- Cutadapt
- Added support for linked adapters (#1329)
- Parse whether trimming was 5' or 3' for Lengths of Trimmed Sequences plot where possible
- Mosdepth
- Include or exclude contigs based on patterns for coverage-per-contig plots
- Picard
- Add support for CollectIlluminaBasecallingMetrics, CollectIlluminaLaneMetrics, ExtractIlluminaBarcodes and MarkIlluminaAdapters (#1336)
- New insertsize_xmax configuration option to limit the plotted maximum insert size for InsertSizeMetrics
- Qualimap
- Added new percentage coverage plot in QM_RNASeq (#1258)
- RSeQC
Module updates
- biscuit
- Duplicate Rate and Cytosine Retention tables are now bargraphs.
- Refactor code to only calculate alignment statistics once.
- Fixed bug where cytosine retentions values would not be properly read if in scientific notation.
- bcl2fastq
- Added sample name cleaning so that prepending directories with the -d flag works properly.
- Cutadapt
- Dragen
- Handled MultiQC crashing when run on single-end output from Dragen (#1374)
- fastp
- Handle a ZeroDivisionError if there are zero reads (#1444)
- FastQC
- Added check for whether overrepresented_sequences is missing from reports (#1281)
- Flexbar
- Fixed bug where reports with 0 reads would crash MultiQC (#1407)
- Kraken
- Mosdepth
- Show barplot instead of line graph for coverage-per-contig plot if there is only one contig.
- Picard
- RnaSeqMetrics - fix assignment barplot labels to say bases instead of reads (#1408)
- CrosscheckFingerprints - fix bug where LOD threshold was not detected when invoked with "new" picard cli style. fixed formatting bug (#1414)
- Made checker for comma as decimal separator in HsMetrics more robust (#1296)
- qc3C
- Updated module to not fail on older field names.
- Qualimap
- Fixed wrong units in tool tip label (#1258)
- QUAST
- Fixed typo causing wrong number of contigs being displayed (#1442)
- Sentieon
- Handled ZeroDivisionError when input files have zero reads (#1420)
- RSEM
- Handled ZeroDivisionError when input files have zero reads (#1040)
- RSeQC
- Fixed double counting of some categories in read_distribution bar graph. (#1457)
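Several entries above (fastp, Sentieon, RSEM) deal with a ZeroDivisionError when a sample has zero reads. A minimal Python sketch of that kind of guard follows. The helper name percent_of and the choice to return None are assumptions for illustration and not how any particular module does it.

```python
def percent_of(count, total):
    """Return count as a percentage of total, or None if total is zero.

    Hypothetical helper: returning None lets the caller leave the
    table cell empty instead of crashing the whole report.
    """
    try:
        return 100.0 * count / total
    except ZeroDivisionError:
        return None


print(percent_of(5, 20))
print(percent_of(0, 0))
```

Returning None rather than 0 keeps "no data" distinguishable from a genuine zero percent, though which is more appropriate depends on the plot or table.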
MultiQC Version 1.10.1
This is a relatively small release focussing just on bug fixes 🐛 - the last release revealed a couple of nasty ones which I felt were bad enough to justify a fast patch.. (99 bugs - fix 1, now you've got 103 bugs)
Many thanks to everyone who reports these problems along with example data 🕵🏻
MultiQC updates
- Allow scientific notation numbers in colour scheme generation
- Fixed bug with very small minimum numbers that only revealed itself after a bugfix done in the v1.10 release
- Require at least rich version 9.4.0 to avoid SpinnerColumn AttributeError (#1393)
- Dropped the Skipping search pattern log message from a warning to debug
- Moved directory prepending with -d back to before sample name cleaning (as it was before v1.7) (#1264)
- If linegraph plot data goes above ymax, only discard the data if the line doesn't come back again (#1257)
- Allow top_modules to be specified as empty dicts (#1274)
- Properly ignore .snakemake folders as intended (#1395)
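One way to read the ymax entry above: points above ymax are kept when a later point falls back within range, and only a trailing run above ymax is dropped. The Python sketch below implements that reading. The function name and the list-of-(x, y)-pairs data layout are assumptions, and the real plotting code may differ.

```python
def clip_trailing_above_ymax(points, ymax):
    """Drop only the trailing run of points whose y value exceeds ymax.

    points is a list of (x, y) pairs in plotting order. Excursions
    above ymax that later return within range are kept, so the line
    stays continuous. Hypothetical helper for illustration only.
    """
    last_keep = -1
    for index, (_, y) in enumerate(points):
        if y <= ymax:
            last_keep = index
    return points[: last_keep + 1]


line = [(1, 5), (2, 50), (3, 8), (4, 60), (5, 70)]
print(clip_trailing_above_ymax(line, ymax=10))
```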
Module updates
- bcftools
- Fixed bug where QUAL value . would crash MultiQC (#1400)
- bowtie2
- Fix bug where HiSAT2 paired-end bar plots were missing unaligned reads (#1230)
- Deeptools
- FastQC
- Replace NaN with 0 in the Per Base Sequence Content plot to avoid crashing the plot (#1246)
- Picard
- Fixed bug in ValidateSamFile module where additional whitespace at the end of the file would cause MultiQC to crash (#1397)
- Somalier
- Fixed bug where using sample name cleaning in a config would trigger a KeyError (#1234)
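The FastQC entry above replaces NaN with 0 before plotting. A minimal Python sketch of that substitution, assuming the values arrive as a flat list of numbers (the helper name nan_to_zero is hypothetical):

```python
import math


def nan_to_zero(values):
    """Return a copy of values with float NaN entries replaced by 0.

    Hypothetical helper for illustration; non-float entries and
    ordinary numbers pass through unchanged.
    """
    return [
        0 if isinstance(value, float) and math.isnan(value) else value
        for value in values
    ]


print(nan_to_zero([1.5, float("nan"), 3]))
```

math.isnan is used because NaN never compares equal to itself, so a check such as value == float("nan") would always be false.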
MultiQC Version 1.10
Many thanks for everyone's patience in waiting for this release, it is much appreciated!
Update for developers: Code linting
This is a big change for MultiQC developers. I have added automated code formatting and code linting (style checks) to MultiQC. This helps to keep the MultiQC code base consistent despite having many contributors and helps me to review pull-requests without having to consider whitespace. Specifically, MultiQC now uses three main tools:
- Black - Python Code
- Prettier - Everything else (almost)
- markdownlint-cli - Stricter markdown rules
All developers must run these tools when submitting changes via Pull-Requests! Automated CI tests now run with GitHub actions to check that all files pass the above tests. If any files do not, that test will fail giving a red ❌ next to the pull request.
For further information, please see the documentation.
MultiQC updates
New MultiQC Features
- --sample-filters now also accepts show_re and hide_re in addition to show and hide. The _re options use regex, while the "normal" options use globbing.
- MultiQC config files now work with .yml file extension as well as .yaml
- .yaml will take preference if both found.
- Section comments can now also be added for General Statistics
- section_comments: { general_stats: "My comment" }
- New table header config option bgcols allows background colours for table cells with categorical data.
- New table header config options cond_formatting_rules and cond_formatting_colours
- Comparable functionality to user config options table_cond_formatting_rules and table_cond_formatting_colours, allows module developers to format table cell values as labels.
- New CI test looks for git merge markers in files
- Beautiful new progress bar from the amazing willmcgugan/rich package.
- Added a bunch of new default sample name trimming suffixes (see 8ac5c7b)
- Added timeout-minutes: 10 to the CI test workflow to check that changes aren't negatively affecting run time too much.
- New table header option bars_zero_centrepoint to treat 0 as zero width bars and plot bar length based on absolute values
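The difference between the globbing and regex sample filters mentioned above can be sketched in Python with the standard fnmatch and re modules. The function name matches_any is hypothetical, and whether MultiQC matches whole names or substrings is not stated in these notes, so treat the matching semantics here as an assumption.

```python
import fnmatch
import re


def matches_any(sample_name, patterns, use_regex=False):
    """Return True if sample_name matches at least one pattern.

    use_regex=False treats patterns as shell-style globs (as for
    show / hide); use_regex=True treats them as regular expressions
    (as for show_re / hide_re). Hypothetical helper, illustration only.
    """
    if use_regex:
        return any(re.search(pattern, sample_name) for pattern in patterns)
    return any(fnmatch.fnmatchcase(sample_name, pattern) for pattern in patterns)


print(matches_any("sample_R1", ["*_R1"]))                  # glob
print(matches_any("sample_R1", ["_R1$"], use_regex=True))  # regex
print(matches_any("sample_R1", ["_R1$"]))                  # regex syntax read as a glob
```

The last call shows why the two option families are kept separate: "$" has no special meaning in a glob, so a regex pattern passed to the glob matcher is compared literally.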
New Modules
- EigenStratDatabaseTools
- Added MultiQC module to report SNP coverages from eigenstrat_snp_coverage.py in the general stats table.
- HOPS
- Post-alignment ancient DNA analysis tool for MALT
- JCVI
- Computes statistics on genome annotation.
- ngsderive
- Forensic analysis tool useful in backwards computing information from next-generation sequencing data.
- OptiType
- Precision HLA typing from next-generation sequencing data
- PURPLE
- A purity, ploidy and copy number estimator for whole genome tumor data
- Pychopper
- Identify, orient and trim full length Nanopore cDNA reads
- qc3C
- Reference-free QC of Hi-C sequencing data
- Sentieon
- Submodules added to catch Picard-based QC metrics files
Module updates
- DRAGEN
- featureCounts
- fgbio
- Fix ErrorRateByReadPosition to calculate ymax not just on the overall error_rate, but also specific base errors (ex. a_to_c_error_rate, a_to_g_error_rate, ...). (#1215)
- Fix ErrorRateByReadPosition plotted line names to no longer concatenate multiple read identifiers and no longer have off-by-one read numbering (ex. Sample1_R2_R3 -> Sample1_R2) (#1304)
- Fastp
- Fixed description for duplication rate (pre-filtering, not post) (#1313)
- GATK
- Add support for the creation of a "Reported vs Empirical Quality" graph to the Base Recalibration module.
- hap.py
- Updated module to plot both SNP and INDEL stats (#1241)
- indexcov
- Fixed bug when making the PED file plots (#1265)
- interop
- Added the % Occupied metric to Read Metrics per Lane table which is reported for NovaSeq and iSeq platforms.
- Kaiju
- Kraken
- MALT
- Fix y-axis labelling in bargraphs
- MACS2
- Add number of peaks to the General Statistics table.
- mosdepth
- Enable prepending of directory to sample names
- Display contig names in Coverage per contig plot tooltip
- Picard
- Fix HsMetrics bait percentage columns (#1212)
- Fix ConvertSequencingArtifactToOxoG files not being found (#1310)
- Make WgsMetrics histogram smoothed if more than 1000 data points (avoids huge plots that crash the browser)
- Multiple new config options for WgsMetrics to customise coverage histogram and speed up MultiQC with very high coverage files.
- Add additional datasets to Picard Alignment Summary (#1293)
- Add support for CrosscheckFingerprints (#1327)
- PycoQC
- Log10 x-axis for Read Length plot (#1214)
- Rockhopper
- Fix issue with parsing genome names in Rockhopper summary files (#1333)
- Fix issue properly parsing multiple samples within a single Rockhopper summary file
- Salmon
- Only try to generate a plot for fragment length if the data was found.
- verifyBamID
- Fix CHIP value detection (#1316).
New Custom Content features
- General Stats custom content now gives a log message
- If id is not set in JSON or YAML files, it defaults to the sample name instead of just custom_content
- Data from JSON or YAML now has data keys (sample names) run through the clean_s_name() function to apply sample name cleanup
- Fixed minor bug which caused custom content YAML files with a string data type to not be parsed
Bug Fixes
- Disable preservation of timestamps / modes when copying temp report files, to help issues with network shares (#1333)
- Fixed MatPlotLib warning: FixedFormatter should only be used together with FixedLocator
- Fixed long-standing min/max bug with shared minimum values for table columns using shared_key
- Made table colour schemes work with negative numbers (don't strip - from values when making scheme)
MultiQC Version 1.9
Another massive release - many thanks to all of the contributors! Keep those pull-requests and issues coming!
Dropped official support for Python 2
Python 2 had its official sunset date
on January 1st 2020, meaning that it will no longer be developed by the Python community.
Part of the python.org statement reads:
That means that we will not improve it anymore after that day,
even if someone finds a security problem in it.
You should upgrade to Python 3 as soon as you can.
Very many Python packages no longer support Python 2
and whilst the MultiQC code is currently compatible with both Python 2 and Python 3,
it is increasingly difficult to maintain compatibility with the dependency packages it
uses, such as MatPlotLib, numpy and more.
As of MultiQC version 1.9, Python 2 is no longer officially supported.
Automatic CI tests will no longer run with Python 2 and Python 2 specific workarounds
are no longer guaranteed.
Whilst it may be possible to continue using MultiQC with Python 2 for a short time by
pinning dependencies, MultiQC compatibility for Python 2 will now slowly drift and start
to break. If you haven't already, you need to switch to Python 3 now.
New MultiQC Features
- Now using GitHub Actions for all CI testing
- Dropped Travis and AppVeyor, everything is now just on GitHub
- Still testing on both Linux and Windows, with multiple versions of Python
- CI tests should now run automatically for anyone who forks the MultiQC repository
- Linting with --lint now checks line graphs as well as bar graphs
- New gathered template with no tool name sections (#1119)
- Added --sample-filters option to add show/hide buttons at the top of the report (#1125)
- Buttons control the report toolbox Show/Hide tool, filtering your samples
- Allows reports to be pre-configured based on a supplied list of sample names at report-generation time.
- Line graphs can now have Log10 buttons (same functionality as bar graphs)
- Importing and running multiqc in a script is now a little better
- multiqc.run now returns the report and config as well as the exit code. This means that you can explore the MultiQC run time a little in the Python environment.
- Much more refactoring is needed to make MultiQC as useful in Python scripts as it could be. Watch this space.
- If a custom module anchor is set using module_order, it's now used a bit more:
- Prefixed to module section IDs
- Appended to files saved in multiqc_data
- Should help to prevent duplicates requiring -1 suffixes when running a module multiple times
- New heatmap plot config options xcats_samples and ycats_samples
- If set to False, the report toolbox options (highlight, rename, show/hide) do not affect that axis.
- Means that the Show only matching samples report toolbox option works on FastQC Status Checks, for example (#1172)
- Report header time and analysis paths can now be hidden
- New config options show_analysis_paths and show_analysis_time (#1113)
- New search pattern key skip: true to skip specific searches when modules look for a lot of different files (eg. Picard).
- New --profile-runtime command line option (config.profile_runtime) to give analysis of how long the report takes to be generated
- Plots of the file search results and durations are added to the end of the MultiQC report as a special module called Run Time
- A summary of the time taken for the major stages of MultiQC execution are printed to the command line log.
- New table config option only_defined_headers
- Defaults to true, set to false to also show any data columns that are not defined as headers
- Useful as allows table-wide defaults to be set with column-specific overrides
- New module key allowed for config.extra_fn_clean_exts and config.fn_clean_exts
- Means you can limit the action of a sample name cleaning pattern to specific MultiQC modules (#905)
New Custom Content features
- Improve support for HTML files - now just end your HTML filename with _mqc.html
- Native handling of HTML snippets as files, no MultiQC config or YAML file required.
- Also with embedded custom content configuration at the start of the file as a HTML comment.
- Add ability to group custom-content files into report sections
- Use the new parent_id, parent_name and parent_description config keys to group content together like a regular module (#1008)
- Custom Content files can now be configured using custom_data, without giving search patterns or data
New Modules:
- DRAGEN
- Illumina Bio-IT Platform that uses FPGA for secondary NGS analysis
- iVar
- Added support for iVar: a computational package that contains functions broadly useful for viral amplicon-based sequencing.
- Kaiju
- Fast and sensitive taxonomic classification for metagenomics
- Kraken
- K-mer matching tool for taxonomic classification. Module plots bargraph of counts for top-5 hits across each taxa rank. General stats summary.
- MALT
- Megan Alignment Tool: Metagenomics alignment tool.
- miRTop
- Command line tool to annotate miRNAs with a standard mirna/isomir naming (mirGFF3)
- Module started by @oneillkza and completed by @FlorianThibord
- MultiVCFAnalyzer
- Combining multiple VCF files into one coherent report and format for downstream analysis.
- Picard - new submodules for QualityByCycleMetrics, QualityScoreDistributionMetrics & QualityYieldMetrics
- See #1116
- Rockhopper
- RNA-seq tool for bacteria, includes bar plot showing where features map.
- Sickle
- A windowed adaptive trimming tool for FASTQ files using quality
- Somalier
- Relatedness checking and QC for BAM/CRAM/VCF for cancer, DNA, BS-Seq, exome, etc.
- VarScan2
- Variant calling and somatic mutation/CNV detection for next-generation sequencing data
Module updates:
- BISCUIT
- Major rewrite to work with new BISCUIT QC script (BISCUIT v0.3.16+)
- This change breaks backwards-compatibility with previous BISCUIT versions. If you are unable to upgrade BISCUIT, please use MultiQC v1.8.
- Fixed error when missing data in log files (#1101)
- bcl2fastq
- Samples with multiple library preps (i.e. barcodes) will now be handled correctly (#1094)
- BUSCO
- Updated log search pattern to match new format in v4 with auto-lineage detection option (#1163)
- Cutadapt
- New bar plot showing the proportion of reads filtered out for different criteria (eg. too short, too many Ns) (#1198)
- DamageProfiler
- Removes redundant typo in init name. This makes referring to the module's column consistent with other modules when customising general stats table.
- DeDup
- Updates plots to make compatible with 0.12.6
- Fixes reporting errors - barplot total represents mapped reads, not total reads in BAM file
- New: Adds 'Post-DeDup Mapped Reads' column to general stats table.
- FastQC
- FastQ Screen
- fgbio
- New: Plot error rate by read position from ErrorRateByReadPosition
- GroupReadsByUmi plot can now be toggled to show relative percents (#1147)
- FLASh
- Logs not reporting innie and outie uncombined pairs now plot combined pairs instead (#1173)
- GATK
- Made ...
MultiQC Version 1.8
A huge release, this one has been a long time coming. Due to @ewels being away on paternity leave for over six months it was very delayed and has been nearly a year in the making! During that time there have been 344 commits with 3,370 lines of code added and 1,194 deletions by 19 contributors. That's a lot of changes.
Highlights include:
- Finally removing the annoying YAML warning
- Six new modules, and many large updates to existing modules
- Code restructuring allowing MultiQC to be imported into Python environments and easier running on Windows
- Lots of tiny bug fixes all over the place.
Enjoy the update! And I promise I'll try not to make everyone wait so long for the next release...
Full changelog
New Modules:
- fgbio
- Process family size count hist data from GroupReadsByUmi
- biobambam2
- Added submodule for bamsormadup tool
- Totally cheating - it uses Picard MarkDuplicates but with a custom search pattern and naming
- SeqyClean
- Adds analysis for seqyclean files
- mtnucratio
- Added little helper tool to compute mt to nuclear ratios for NGS data.
- mosdepth
- fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing
- SexDetErrmine
- Relative coverage and error rate of X and Y chromosomes
Module updates:
- bcl2fastq
- Added handling of demultiplexing of more than 2 reads
- Allow bcl2fastq to parse undetermined barcode information in situations when lane indexes do not start at 1
- BBMap
- Support for scafstats output marked as not yet implemented in docs
- DeDup
- Added handling clusterfactor and JSON logfiles
- damageprofiler
- Added writing metrics to data output file.
- DeepTools
- fastp
- Fix faulty column handling for the after filtering Q30 rate (#936)
- FastQC
- When including a FastQC section multiple times in one report, the Per Base Sequence Content heatmaps now behave as you would expect.
- Added heatmap showing FastQC status checks for every section report across all samples
- Made sequence content individual plots work after samples have been renamed (#777)
- Highlighting samples from status - respect chosen highlight colour in the toolbox (#742)
- FastQ Screen
- When including a FastQ Screen section multiple times in one report, the plots now behave as you would expect.
- GATK
- Refactored BaseRecalibrator code to be more consistent with MultiQC Python style
- Handle zero count errors in BaseRecalibrator
- HiC Explorer
- Fixed bug where module tries to parse QC_table.txt, a new log file in hicexplorer v2.2.
- HTSeq
- Fixed bug where module would crash if a sample had zero reads (#1006)
- LongRanger
- Added support for the LongRanger Align pipeline.
- miRTrace
- Fixed bug where a sample in some plots was missed. (#932)
- Peddy
- Fixed bug where sample name cleaning could lead to error. (#1024)
- All plots (including Het Check and Sex Check) now hidden if no data
- Picard
- Modified OxoGMetrics.py so that it will find files created with GATK CollectMultipleMetrics and ConvertSequencingArtifactToOxoG.
- QoRTs
- Fixed bug where --dirs broke certain input files. (#821)
- Qualimap
- Added in mean coverage computation for general statistics report
- Now creates tables of collected data in multiqc_data
- RNA-SeQC
- Updated broken URL link
- RSeQC
- Fixed bug where Junction Saturation plot when clicking a single sample was mislabelling the lines.
- When including a RSeQC section multiple times in one report, clicking Junction Saturation plot now behaves as you would expect.
- Fixed bug where exported data in multiqc_rseqc_read_distribution.txt files had incorrect values for _kb fields (#1017)
- Samtools
- Utilize in-built read_count_multiplier functionality to plot flagstat results more nicely
- SnpEff
- Increased the default summary csv file-size limit from 1MB to 5MB.
- Stacks
- Fixed bug so that multi-population sum stats are parsed correctly (#906)
- TopHat
- Fixed bug where TopHat would try to run with files from Bowtie2 or HiSAT2 and crash
- VCFTools
- Fixed a bug where tstv_by_qual.py produced invalid json from infinity-values.
- snpEff
- Added plot of effects
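The VCFTools entry above concerns infinity values producing invalid JSON. Python's json module writes float infinity as the bare token Infinity by default, which strict JSON parsers reject. The sketch below shows one possible guard: replacing non-finite floats with None before serialising. The helper name json_safe is hypothetical and the notes do not say how the module itself was fixed.

```python
import json
import math


def json_safe(value):
    """Recursively replace inf, -inf and nan floats with None.

    Hypothetical helper for illustration; dict keys are left as-is.
    """
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: json_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [json_safe(item) for item in value]
    return value


data = {"sample_1": [[10, 2.1], [20, float("inf")]]}

# allow_nan=False makes json.dumps raise ValueError if a
# non-finite float slips through, instead of writing Infinity.
text = json.dumps(json_safe(data), allow_nan=False)
print(text)
```

Passing allow_nan=False is a cheap safety net: if the cleaning step misses a value, serialisation fails loudly rather than producing a file that a browser cannot parse.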
New MultiQC Features:
- Added some installation docs for windows
- Added some docs about using MultiQC in bioinformatics pipelines
- Rewrote Docker image
- New base image czentye/matplotlib-minimal reduces image size from ~200MB to ~80MB
- Proper installation method ensures latest version of the code
- New entrypoint allows easier command-line usage
- Support opening MultiQC on websites with CSP script-src 'self' with some sha256 exceptions
- Plot data is no longer intertwined with javascript code so hashes stay the same
- Made config.report_section_order work for module sub-sections as well as just modules.
- New config options exclude_modules and run_modules to complement -e and -m cli flags.
- Command line output is now coloured by default 🌈 (use --no-ansi to turn this off)
- Better launch compatibility due to code refactoring by @KerstenBreuer and @ewels
- Windows support for base multiqc command
- Support for running as a python module: python -m multiqc .
- Support for running within a script: import multiqc and multiqc.run('/path/to/files')
- Config option custom_plot_config now works for bargraph category configs as well (#1044)
- Config table_columns_visible can now be given a module namespace and it will hide all columns from that module (#541)
Bug Fixes:
- MultiQC now ignores all .md5 files
- Use SafeLoader for PyYaml load calls, avoiding recent warning messages.
- Hide multiqc_config_example.yaml in the test directory to stop people from using it without modification.
directory to stop people from using it without modification. - Fixed matplotlib background colour issue (@epakarin - #886)
- Table rows that are empty due to hidden columns are now properly hidden on page load (#835)
- Sample name cleaning: All sample names are now truncated to their basename, without a path.
- This includes for regex and replace (before was only the default truncate).
- Only affects modules that take sample names from file contents, such as cutadapt.
- See #897 for discussion.
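The sample name cleaning change above (names truncated to their basename, without a path) can be illustrated with a short Python sketch. The function clean_name and its truncate_exts parameter are hypothetical stand-ins rather than the real clean_s_name() signature. The sketch only shows the order of operations: strip the directory first, then truncate at a known extension.

```python
import os


def clean_name(s_name, truncate_exts=(".fastq.gz",)):
    """Strip any directory path, then cut the name at known extensions.

    Hypothetical illustration of basename-then-truncate cleaning;
    the extension list is an assumption, not the MultiQC defaults.
    """
    name = os.path.basename(s_name)
    for ext in truncate_exts:
        # Keep only the text before the first occurrence of ext;
        # names that do not contain ext are left unchanged.
        name = name.split(ext, 1)[0]
    return name


print(clean_name("/data/run1/sample_A.fastq.gz"))
```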
MultiQC Version 1.7
An early Christmas present for MultiQC users! 🎅🎁🎄
Many thanks to everyone who has contributed to this release. Happy Christmas and a very happy new year!
New Modules:
- BISCUIT
- BISulfite-seq CUI Toolkit
- Module written by @zwdzwd
- DamageProfiler
- A tool to determine ancient DNA misincorporation rates.
- Module written by @apeltzer
- FLASh
- FLASH (Fast Length Adjustment of SHort reads)
- Module written by @pooranis
- MinIONQC
- QC of reads from ONT long-read sequencing
- Module written by @ManavalanG
- phantompeakqualtools
- A tool for informative enrichment and quality measures for ChIP-seq/DNase-seq/FAIRE-seq/MNase-seq data.
- Module written by @chuan-wang
- Stacks
- Software for analyzing restriction enzyme-based data (e.g. RAD-seq). Support for Stacks >= 2.1 only.
- Module written by @remiolsen
Module updates:
- AdapterRemoval
- Handle error when zero bases are trimmed. See #838.
- Bcl2fastq
- New plot showing the top twenty undetermined barcodes by lane.
- Information for R1/R2 is now separated in the General Statistics table.
- SampleID is concatenated with SampleName because in Chromium experiments several samples have the same SampleName.
- deepTools
- New PCA plots from the `plotPCA` function (written by @chuan-wang)
- New fragment size distribution plots from `bamPEFragmentSize --outRawFragmentLengths` (written by @chuan-wang)
- New correlation heatmaps from the `plotCorrelation` function (written by @chuan-wang)
- New sequence distribution profiles around genes, from the `plotProfile` function (written by @chuan-wang)
- Reordered sections
- Fastp
- Fixed bug in parsing of empty histogram data. See #845.
- FastQC
- Refactored Per Base Sequence Content plots to show original underlying data, instead of calculating it from the page contents. Now shows original FastQC base-ranges and fixes 100% GC bug in final few pixels. See #812.
- When including a FastQC section multiple times in one report, the summary progress bars now behave as you would expect.
- FastQ Screen
- Don't hide genomes in the simple plot, even if they have zero unique hits. See #829.
- InterOp
- Fixed bug where read counts and base pair yields were not displaying in tables correctly.
- Number formatting for these fields can now be customised in the same way as with other modules, as described in the docs
- Picard
- InsertSizeMetrics: You can now configure to what degree the insert size plot should be smoothed.
- CollectRnaSeqMetrics: Add warning about missing rRNA annotation.
- CollectRnaSeqMetrics: Add chart for counts/percentage of reads mapped to the correct strand.
- Now parses VariantCallingMetrics reports. (Similar to GATK module's VariantEval.)
- phantompeakqualtools
- Properly clean sample names
- Trimmomatic
- Updated Trimmomatic module documentation to be more helpful
- New option to use filenames instead of relying on the command line used. See #864.
New MultiQC Features:
- Embed your custom images with a new Custom Content feature! Just add `_mqc` to the end of the filename for `.png`, `.jpg` or `.jpeg` files.
- Documentation for Custom Content reordered to make it a little more sane
- You can now add or override any config parameter for any MultiQC plot! See the documentation for more info.
- Allow `table_columns_placement` config to work with table IDs as well as column namespaces. See #841.
- Improved visual spacing between grouped bar plots
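
The `_mqc` naming convention for custom images can be checked with a few lines of Python. This is a sketch of the rule as stated in these notes, not MultiQC's own file search code; `is_custom_image` is a hypothetical helper, and the exact, case-sensitive extension match is an assumption.

```python
import os

# Extensions named in the release notes above.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg")


def is_custom_image(filename):
    """Return True if the name ends in `_mqc` followed by a listed image extension."""
    stem, ext = os.path.splitext(os.path.basename(filename))
    return ext in IMAGE_EXTENSIONS and stem.endswith("_mqc")


for name in ["coverage_plot_mqc.png", "coverage_plot.png", "notes_mqc.txt"]:
    print(name, is_custom_image(name))
```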
Bug Fixes:
- Custom content no longer clobbers `col1_header` table configs
- The option `--file-list` that refers to a text file with file paths to analyse will no longer ignore directory paths
- Sample name directory prefixes are now added after cleanup.
- If a module is run multiple times in one report, its CSS and JS files will only be included once (`default` template)
MultiQC Version 1.6
Some of these updates are thanks to the efforts of people who attended the NASPM 2018 MultiQC hackathon session. Thanks to everyone who attended!
New Modules:
- fastp
- An ultra-fast all-in-one FASTQ preprocessor (QC, adapters, trimming, filtering, splitting...)
- Module started by @florianduclot and completed by @ewels
- hap.py
- Hap.py is a set of programs based on htslib to benchmark variant calls against gold standard truth datasets
- Module written by @tsnowlan
- Long Ranger
- Works with data from the 10X Genomics Chromium. Performs sample demultiplexing, barcode processing, alignment, quality control, variant calling, phasing, and structural variant calling.
- Module written by @remiolsen
- miRTrace
- A quality control software for small RNA sequencing data.
- Module written by @chuan-wang
Module updates:
- BCFtools
- New plot showing SNP statistics versus quality of call from bcftools stats (@MaxUlysse and @Rotholandus)
- BBMap
- Support added for BBDuk kmer-based adapter/contaminant filtering summary stats (@boulund)
- FastQC
- New read count plot, split into unique and duplicate reads if possible.
- Help text added for all sections, mostly copied from the excellent FastQC help.
- Sequence duplication plot rescaled
- FastQ Screen
- Samples in large-sample-number plot are now sorted alphabetically (@hassanfa)
- MACS2
- Output is now more tolerant of missing data (no plot if no data)
- Peddy
- Picard
- New submodule to handle `ValidateSamFile` reports (@cpavanrun)
- WGSMetrics now adds the mean and standard-deviation coverage to the general stats table (hidden) (@cpavanrun)
- Preseq
- New config option to plot preseq plots with unique old coverage on the y axis instead of read count
- Code refactoring by @vladsaveliev
- QUAST
- Null values (`-`) in reports now handled properly. Bargraphs always shown despite varying thresholds. (@vladsaveliev)
- RNA-SeQC
- Don't create the report section for Gene Body Coverage if no data is given
- Samtools
- Fixed edge case bug where MultiQC could crash if a sample had zero count coverage with idxstats.
- Adds % proper pairs to general stats table
- Skewer
- Read length plot rescaled
- Tophat
- Fixed bug where some samples could be given a blank sample name (@lparsons)
- VerifyBamID
- Change column header help text for contamination to match percentage output (@chapmanb)
New MultiQC Features:
- New config option `remove_sections` to skip specific report sections from modules
- Add `path_filters_exclude` to exclude certain files when running modules multiple times. You could previously only include certain files.
- New `exclude_*` keys for file search patterns
- Use a subset of patterns to exclude files that would otherwise be detected, by filename or contents
- Command line options all now use mid-word hyphens (not a mix of hyphens and underscores)
- Old underscore terms still maintained for backwards compatibility
- Flag `--view-tags` now works without requiring an "analysis directory".
- Removed Python dependency for `enum34` (@boulund)
- Columns can be added to `General Stats` table for custom content/module.
- New `--ignore-symlinks` flag which will ignore symlinked directories and files.
- New `--no-megaqc-upload` flag which disables automatically uploading data to MegaQC
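
To make the include/exclude idea behind `path_filters_exclude` concrete, here is a minimal sketch of glob-based filtering using the standard library. The key names `path_filters` and `path_filters_exclude` come from these notes; the helper `keep_file` and its matching semantics (keep a file if any include glob matches and no exclude glob matches) are assumptions for illustration, not MultiQC's implementation.

```python
from fnmatch import fnmatch


def keep_file(path, path_filters=None, path_filters_exclude=None):
    """Illustrative glob filter: any include must match, no exclude may match."""
    if path_filters and not any(fnmatch(path, g) for g in path_filters):
        return False
    if path_filters_exclude and any(fnmatch(path, g) for g in path_filters_exclude):
        return False
    return True


files = ["run1/sample_trimmed_fastqc.zip", "run1/sample_fastqc.zip"]
kept = [f for f in files if keep_file(f, path_filters_exclude=["*_trimmed_fastqc.zip"])]
print(kept)
```

Running the same module twice, once with an include filter for the trimmed files and once with the matching exclude filter, would then split the two sets of results cleanly.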
Bug Fixes
- Fix path_filters for top_modules/module_order configuration only selecting if all globs match. It now filters searches that match any glob.
- Empty sample names from cleaning are now no longer allowed
- Stop prepend_dirs set in the config from getting clobbered by an unpassed CLI option (@tsnowlan)
- Modules running multiple times now have multiple sets of columns in the General Statistics table again, instead of overwriting one another.
- Prevent tables from clobbering sorted row orders.
- Fix linegraph and scatter plots data conversion (sporadically the incorrect `ymax` was used to drop data points) (@cpavanrun)
- Adjusted behavior of ceiling and floor axis limits
- Adjusted multiple file search patterns to make them more specific
- Prevents the wrong module from accidentally slurping up output from a different tool. By @cpavanrun (see PR #727)
- Fixed broken report bar plots when `-p`/`--export-plots` was specified (see issue #801)
MultiQC Version 1.5
New Modules:
- DeDup - New module!
- DeDup: Improved Duplicate Removal for merged/collapsed reads in ancient DNA analysis
- Module written by @apeltzer
- Clip&Merge - New module!
- Clip&Merge: Adapter clipping and read merging for ancient DNA analysis
- Module written by @apeltzer
Module updates:
- bcl2fastq
- BUSCO
- Fixed configuration bug that made all sample names become `'short'`
- Custom Content
- Parsed tables now exported to `multiqc_data` files
- Cutadapt
- Refactor parsing code to collect all length trimming plots
- FastQC
- Fixed starting y-axis label for GC-content lineplot being incorrect.
- HiCExplorer
- Updated to work with v2.0 release.
- Homer
- Made parsing of `tagInfo.txt` file more resilient to variations in file format so that it works with new versions of Homer.
- Kept order of chromosomes in coverage plot consistent.
- Peddy
- Switch `Sex error` logic to `Correct sex` for better highlighting (@aledj2)
- Picard
- Updated module and search patterns to recognise new output format from Picard version >= 2.16 and GATK output.
- Qualimap BamQC
- Fixed bug where start of Genome Fraction could have a step if target is 100% covered.
- RNA-SeQC
- Added rRNA alignment stats to summary table (@Rolandde)
- RSeQC
- Fixed read distribution plot by adding category for `other_intergenic` (thanks to @moxgreen)
- Fixed a dodgy plot title (Read GC content)
- Supernova
- Added support for Supernova 2.0 reports. Fixed a TypeError bug when using txt reports only. Also a bug when parsing empty histogram files.
New MultiQC Features:
- Invalid choices for `--module` or `--exclude` now list the available modules alphabetically.
- Linting now checks for presence in `config.module_order` and tags.
Bug Fixes
- Excluding modules now works in combination with using module tags.
- Fixed edge-case bug where certain combinations of `output_fn_name` and `data_dir_name` could trigger a crash
- Conditional formatting - values are no longer double-labelled
- Made config option `extra_series` work in scatter plots the same way that it works for line plots
- Locked the `matplotlib` version to `v2.1.0` and below