Interactive use + Pydantic models for plots #2442

Merged
merged 150 commits into from May 4, 2024

@vladsavelyev vladsavelyev commented Mar 18, 2024

Use Pydantic

  • Plots, tables, plot data (Plot, DataTable, Dataset, and derived classes). Helps validate dumps for JavaScript internally.
  • Configuration in interactive functions and command line (ClConfig). Validates run parameters.

Future improvements would include validating custom content with Pydantic.
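
To illustrate the idea of validating plot data before it is dumped for JavaScript, here is a minimal sketch assuming Pydantic v2. The `PlotSketch` and `DatasetSketch` models and the `dump_for_js` helper are hypothetical stand-ins, not the actual `Plot` or `Dataset` classes from this PR:

```python
import json
from typing import Dict, List

from pydantic import BaseModel, ValidationError


class DatasetSketch(BaseModel):
    # Hypothetical stand-in for a per-plot dataset: sample name -> y values.
    label: str
    samples: Dict[str, List[float]]


class PlotSketch(BaseModel):
    # Hypothetical stand-in for a plot model; not MultiQC's actual Plot class.
    id: str
    title: str
    datasets: List[DatasetSketch]


def dump_for_js(raw: dict) -> str:
    """Validate a raw plot dict, then serialise it to JSON for the front end."""
    plot = PlotSketch.model_validate(raw)
    return plot.model_dump_json()


good = {
    "id": "gc_content",
    "title": "GC content",
    "datasets": [{"label": "Percent", "samples": {"sample_1": [0.1, 0.5, 0.4]}}],
}
# Same structure, but one value cannot be interpreted as a number.
bad = {
    "id": "gc_content",
    "title": "GC content",
    "datasets": [{"label": "Percent", "samples": {"sample_1": ["n/a"]}}],
}

payload = dump_for_js(good)
decoded = json.loads(payload)

try:
    dump_for_js(bad)
    bad_rejected = False
except ValidationError as err:
    # The malformed value is caught in Python, before any JSON reaches the browser.
    bad_rejected = True
    print(f"rejected with {err.error_count()} error(s)")
```

The point of the sketch is that a malformed dataset fails at model construction, so the error surfaces in Python rather than as a broken plot in the rendered report.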

Partially addresses #1790

Support interactive usage

Support interactive usage of MultiQC, e.g. in notebooks. Partially addresses #1051

Split multiqc.py into submodules with better-separated stages: parse logs, render plots and tables, write data to disk. That made it possible to add functions like multiqc.parse_logs(), which parses more logs and appends them to the same report without rendering HTML or writing anything to disk; multiqc.list_plots() and multiqc.show_plot(), which work without rendering all plots; and multiqc.write_results(), which only renders and writes results to disk.

Config parameters (such as verbose, modules_order, etc.) can be passed individually to any interactive method.

Custom sections can also be added to the report with multiqc.add_custom_content_section().

The write_report() function triggers HTML rendering and module ordering, adds the special-case software versions and run performance modules, and writes the data and the report. All of those steps are also separated internally and can be exposed more granularly if needed.
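
The stage separation described above can be sketched with a toy session object. This is not MultiQC's code: `ReportSession`, `RunConfig`, the `key=value` log format, and the text "plots" are all hypothetical, and only the method names echo the functions mentioned in this description. It shows parsed data accumulating across calls, a single plot rendered on demand, and config parameters passed per call:

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List


@dataclass(frozen=True)
class RunConfig:
    # Hypothetical subset of run parameters.
    verbose: bool = False
    title: str = "Report"


@dataclass
class ReportSession:
    """Toy model of a session that accumulates parsed data across calls."""

    config: RunConfig = field(default_factory=RunConfig)
    data: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def parse_logs(self, logs: Dict[str, str], **overrides) -> None:
        # Stage 1: parse only. Each toy "log" is one "key=value" string per sample.
        cfg = replace(self.config, **overrides)  # per-call config, session config untouched
        for sample, text in logs.items():
            key, value = text.split("=")
            self.data.setdefault(key, {})[sample] = float(value)
            if cfg.verbose:
                print(f"parsed {sample}: {key}={value}")

    def list_plots(self) -> List[str]:
        # Nothing is rendered here; only the names of available plots are returned.
        return sorted(self.data)

    def show_plot(self, name: str) -> str:
        # Stage 2: render a single plot on demand (here, a text bar chart).
        rows = self.data[name]
        return "\n".join(f"{s}: {'#' * int(v)}" for s, v in sorted(rows.items()))

    def write_report(self, **overrides) -> str:
        # Stage 3: render everything and produce the final output.
        cfg = replace(self.config, **overrides)
        body = "\n\n".join(f"[{p}]\n{self.show_plot(p)}" for p in self.list_plots())
        return f"{cfg.title}\n\n{body}"


session = ReportSession()
session.parse_logs({"sample_1": "reads=3"})
session.parse_logs({"sample_2": "reads=5"}, verbose=True)  # appended to the same session
report = session.write_report(title="Toy report")
```

Keeping the parsed state separate from rendering is what lets a notebook user call the parse step several times and inspect one plot without paying for the full report.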

Example notebook

Check this notebook for example usage:

https://github.com/MultiQC/example-notebook/blob/master/multiqc_example.ipynb

Performance benchmark

Tested the branch against main (with all recent performance improvements merged into both). There is a consistent speed-up just from the refactoring:

main
Run took 103.14 seconds

  • 5.14s: Searching files
  • 54.13s: Running modules
  • 20.40s: Compressing report data
    3628933760 peak memory footprint

interactive-use-2
Run took 77.98 seconds

  • 5.15s: Searching files
  • 29.62s: Running modules
  • 18.11s: Compressing report data
    3638501952 peak memory footprint

Before performance PRs (196c9738)
Run took 115.36 seconds

  • 11.50s: Searching files
  • 55.16s: Running modules
  • 35.53s: Compressing report data
    5196197376 peak memory footprint
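
For reference, the relative changes implied by the figures above can be worked out with a few lines of Python. The numbers are copied from the three runs; treating the peak memory footprint values as bytes is an assumption, since the unit is not stated:

```python
# Timings (seconds) and peak memory as reported in the three runs above.
runs = {
    "before": {"total_s": 115.36, "modules_s": 55.16, "peak_mem": 5196197376},
    "main": {"total_s": 103.14, "modules_s": 54.13, "peak_mem": 3628933760},
    "interactive-use-2": {"total_s": 77.98, "modules_s": 29.62, "peak_mem": 3638501952},
}


def reduction_pct(old: float, new: float) -> float:
    """Percentage reduction going from old to new."""
    return (old - new) / old * 100


branch = runs["interactive-use-2"]
total_vs_main = reduction_pct(runs["main"]["total_s"], branch["total_s"])
modules_vs_main = reduction_pct(runs["main"]["modules_s"], branch["modules_s"])
total_vs_before = reduction_pct(runs["before"]["total_s"], branch["total_s"])

# Assumption: the peak memory footprint is given in bytes.
peak_gib = {name: run["peak_mem"] / 2**30 for name, run in runs.items()}

print(f"total run time vs main:   -{total_vs_main:.1f}%")
print(f"running modules vs main:  -{modules_vs_main:.1f}%")
print(f"total run time vs before: -{total_vs_before:.1f}%")
for name, gib in peak_gib.items():
    print(f"peak memory, {name}: {gib:.2f} GiB")
```

Most of the gain over main comes from the "Running modules" stage, while peak memory is essentially unchanged between main and the branch.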

@vladsavelyev vladsavelyev force-pushed the interactive-use-2 branch 2 times, most recently from 467c43b to 0a13d36 Compare March 19, 2024 16:38
@vladsavelyev vladsavelyev changed the base branch from main to refactor-module-3 March 19, 2024 16:49
@vladsavelyev vladsavelyev changed the base branch from refactor-module-3 to split-up-main March 19, 2024 17:11
@vladsavelyev vladsavelyev added the core: refactoring Code refactoring label Mar 20, 2024
@vladsavelyev vladsavelyev changed the title Functions for interactive use Pydantic + interactive use Mar 20, 2024
@vladsavelyev vladsavelyev added this to the MultiQC v1.22: Pydantic milestone Mar 20, 2024
@vladsavelyev vladsavelyev marked this pull request as ready for review May 2, 2024 19:44
@vladsavelyev vladsavelyev changed the title Pydantic + interactive use Interactive use + Pydantic models for plots May 4, 2024
@vladsavelyev vladsavelyev merged commit ccbb5c5 into main May 4, 2024
7 checks passed
@vladsavelyev vladsavelyev deleted the interactive-use-2 branch May 4, 2024 11:54