v0.1.8

@zktuong zktuong released this 16 Aug 08:59
bbeb7fc
  • Much-needed speed upgrade so that the following newly added functions are fully usable:
    • Refactored filter_contigs/FilterContigs/FilterContigLite. Solves #92
      • Reworked into a tree format: it first iterates over the rows to form cells (~1k iter/s), then iterates through the cells (~150 iter/s), compared to the previous 3-4 iter/s.
      • Fixed a small bug where the duplicate_counts were not adding up.
    • Refactored Query.
      • The __init__ method now preloads the required fields as a tree, and a separate retrieve method accesses the dictionaries much faster. Same approach as above.
    • read_10x_vdj
      • Refactored parse_annotation which was slowing everything down.
      • Similar approach to the above.
    • sanitize_data
      • Uses airr.RearrangementSchema to match the dtype so that missing values are handled. Also sped up some steps to make the validation faster.
      • Also fixed a bug that caused float columns to be unintentionally converted to integers, e.g. mu_freq columns should hopefully now return properly.
  • Added a write_airr function that basically calls airr.create_rearrangement.
  • Adjusted quantify_mutations to hopefully return the right dtypes now.
  • Added option to filter_contigs to run without anndata.
  • Bug fix to dandelion_preprocessing.py to let it run quantify_mutations based on the args.file_prefix.
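
The "tree format" mentioned for filter_contigs and Query can be pictured roughly as below. This is only a minimal sketch of the general idea (one pass over the rows to build nested dictionaries, then a second pass over the cells), not the actual dandelion code. The field names, the sample rows and the helper names build_tree and cells_with_multiple_heavy_chains are all hypothetical.

```python
from collections import defaultdict

# Hypothetical contig rows; the field names mimic AIRR-style columns but are
# assumptions made for illustration only.
rows = [
    {"sequence_id": "cell1_contig_1", "cell_id": "cell1", "locus": "IGH", "duplicate_count": 10},
    {"sequence_id": "cell1_contig_2", "cell_id": "cell1", "locus": "IGK", "duplicate_count": 4},
    {"sequence_id": "cell2_contig_1", "cell_id": "cell2", "locus": "IGH", "duplicate_count": 7},
    {"sequence_id": "cell2_contig_2", "cell_id": "cell2", "locus": "IGH", "duplicate_count": 2},
]


def build_tree(rows):
    """Pass 1: iterate the rows once, grouping them as cell -> locus -> contig."""
    tree = defaultdict(lambda: defaultdict(dict))
    for row in rows:
        tree[row["cell_id"]][row["locus"]][row["sequence_id"]] = row["duplicate_count"]
    return tree


def cells_with_multiple_heavy_chains(tree):
    """Pass 2: iterate the cells, needing only dictionary lookups per cell."""
    flagged = []
    for cell_id, loci in tree.items():
        # .get avoids inserting an empty IGH entry into the defaultdict
        if len(loci.get("IGH", {})) > 1:
            flagged.append(cell_id)
    return flagged


tree = build_tree(rows)
flagged = cells_with_multiple_heavy_chains(tree)
print(flagged)
```

Because each cell's contigs are already grouped after the first pass, the per-cell step avoids re-scanning the whole table for every cell, which is the kind of change that would plausibly account for the speed-up reported above.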