Release v0.1.8 · zktuong/dandelion

Much-needed speed upgrades so that the following recently added functions are fully usable:
- Refactored filter_contigs/FilterContigs/FilterContigLite. Solves #92
  - Reworked into a tree format: it first iterates over the rows to form cells (~1k iter/s), then iterates over the cells (~150 iter/s), compared with the previous 3-4 iter/s.
  - Fixed a small bug where the duplicate_counts were not adding up.
- Refactored Query.
  - The __init__ method now preloads the required fields as a tree, and a separate retrieve method accesses the dictionaries much faster. Same approach as above.
- read_10x_vdj
  - Refactored parse_annotation, which was slowing everything down.
  - Similar approach to the above.
- sanitize_data
  - Uses airr.RearrangementSchema to match the dtypes and deal with missing values. Also sped up some steps to make the validation faster.
  - Also fixed a bug that caused float columns to be unintentionally converted to integers, e.g. mu_freq columns should hopefully now be returned properly.
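
The tree approach mentioned for filter_contigs, Query and read_10x_vdj can be illustrated roughly as follows. This is only a minimal sketch, not the actual dandelion code: the example rows, the nesting order (cell, then locus, then sequence_id) and the helper names `build_tree` and `cells_with_multiple_heavy` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical contig rows; field names follow AIRR conventions, values are made up.
rows = [
    {"sequence_id": "cell1_contig_1", "cell_id": "cell1", "locus": "IGH", "duplicate_count": 10},
    {"sequence_id": "cell1_contig_2", "cell_id": "cell1", "locus": "IGK", "duplicate_count": 4},
    {"sequence_id": "cell2_contig_1", "cell_id": "cell2", "locus": "IGH", "duplicate_count": 7},
    {"sequence_id": "cell2_contig_2", "cell_id": "cell2", "locus": "IGH", "duplicate_count": 2},
]


def build_tree(rows):
    """Pass 1: iterate over the rows once, nesting them as cell -> locus -> sequence_id -> row."""
    tree = defaultdict(lambda: defaultdict(dict))
    for row in rows:
        tree[row["cell_id"]][row["locus"]][row["sequence_id"]] = row
    return tree


def cells_with_multiple_heavy(tree):
    """Pass 2: iterate over the cells, using dictionary lookups instead of re-filtering a table."""
    flagged = []
    for cell_id, loci in tree.items():
        # .get avoids creating an empty "IGH" entry for cells without a heavy chain.
        if len(loci.get("IGH", {})) > 1:
            flagged.append(cell_id)
    return flagged


tree = build_tree(rows)
flagged = cells_with_multiple_heavy(tree)
```

The point of the design is that the per-row pass is cheap and happens once, so the per-cell pass only does dictionary lookups rather than subsetting the full table for every cell.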
Added write_airr function, which basically calls airr.create_rearrangement.
Adjusted quantify_mutations to hopefully return the right dtypes now.
Added option to filter_contigs to run without anndata.
Bug fix to dandelion_preprocessing.py to let it run quantify_mutations based on the args.file_prefix.
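
The dtype handling touched by sanitize_data and quantify_mutations can be sketched in plain Python. This is an illustrative assumption, not the library's implementation: the `SCHEMA` dict is a hypothetical stand-in for the types that would come from airr.RearrangementSchema, and `coerce` and `MISSING` are made-up names.

```python
# Hypothetical schema: field name -> Python type. In dandelion the types would be
# derived from airr.RearrangementSchema; this dict is a stand-in so the sketch
# runs without the airr package.
SCHEMA = {"sequence_id": str, "duplicate_count": int, "mu_freq": float}

# Strings treated as missing values in this sketch (an assumption).
MISSING = {"", "NA", "nan", "None"}


def coerce(record, schema=SCHEMA):
    """Cast each field to its schema type, keeping missing values as None."""
    out = {}
    for field, value in record.items():
        target = schema.get(field, str)
        if value is None or str(value) in MISSING:
            out[field] = None
        elif target is int:
            # Go through float so that strings such as "3.0" still parse.
            out[field] = int(float(value))
        else:
            out[field] = target(value)
    return out


raw = [
    {"sequence_id": "contig_1", "duplicate_count": "3.0", "mu_freq": "0.0"},
    {"sequence_id": "contig_2", "duplicate_count": "", "mu_freq": "0.0125"},
]
clean = [coerce(r) for r in raw]
```

Casting by schema rather than by inspecting the values is what keeps a column such as mu_freq as a float even when every observed value happens to look like a whole number.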
