
Releases: zktuong/dandelion

v0.3.5

01 Feb 04:54

What's Changed

  • fix container script bug by @zktuong in #350
  • reorder content table on docs by @zktuong in #352
  • fix entry of anndata with NaN 'sequence_id' values by @amoschoomy in #351
  • add dependabot dependency review for PR by @zktuong in #353
  • add else statement to check contigs when there's no sequence by @MeganS92 in #354
  • pip prod(deps): update pandas requirement from <=2.1.4,>=1.0.3 to >=1.0.3,<=2.2.0 by @dependabot in #355
  • pip dev(deps-dev): update sphinx-autodoc-typehints requirement from <=1.25.2 to <=1.25.3 by @dependabot in #356
  • pip dev(deps-dev): update scirpy requirement from <=0.14 to <=0.15.0 by @dependabot in #357
  • convert to use umi_count by @zktuong in #358

Full Changelog: v0.3.4...v0.3.5

v0.3.4

10 Jan 00:17
97dc56a

Summary

  • Speed up network generation in generate_network
  • Add soft filtering and normalisation to the vdj_pseudobulk functions - @ktpolanski
  • Created a new column in .data (extra) to flag whether a contig is considered extra.
  • New clone ID definition that inserts the VDJ and VJ gene calls into the ID to reduce ambiguity - need to check whether it handles cells with no clone IDs properly. This also means that clone IDs can now be created for orphan chains.
  • New to_scirpy/from_scirpy functions that convert to the new scverse AIRR formats - @amoschoomy
  • Container build is now simplified and uses mamba to manage all the dependencies.
  • New option to run preprocessing with ogrdb references in both the base package and the container.
  • New reference download function in the container folder to ensure the latest references are pulled for every new iteration of the container.
  • Deprecate support for Python 3.7 tests.
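The new clone ID scheme above isn't spelled out here; as a rough illustration of the idea only (the helper name and ID format are hypothetical, not dandelion's actual implementation), embedding both the VDJ and VJ gene calls in the ID could look like:

```python
def make_clone_id(prefix, vdj_genes, vj_genes, junction_group):
    """Hypothetical sketch: insert both VDJ and VJ gene calls into the clone ID.

    Including both chains disambiguates clones that share only one chain,
    and lets orphan chains (missing one side) still receive an ID.
    """
    vdj = "|".join(vdj_genes) if vdj_genes else "None"
    vj = "|".join(vj_genes) if vj_genes else "None"
    return f"{prefix}_{vdj}_{vj}_{junction_group}"

# A fully paired B cell clone, and an orphan VJ-only clone.
paired = make_clone_id("B", ["IGHV1-2", "IGHJ4"], ["IGKV1-5", "IGKJ1"], 1)
orphan = make_clone_id("B", [], ["IGKV1-5", "IGKJ1"], 2)
```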

What's Changed

dependabot updates

  • pip prod(deps): update pandas requirement from <=2.1.0,>=1.0.3 to >=1.0.3,<=2.1.1 by @dependabot in #314
  • pip dev(deps-dev): update readthedocs-sphinx-ext requirement from <=2.2.2 to <=2.2.3 by @dependabot in #318
  • pip prod(deps): update pandas requirement from <=2.1.1,>=1.0.3 to >=1.0.3,<=2.1.2 by @dependabot in #324
  • pip dev(deps-dev): update sphinx-autodoc-typehints requirement from <=1.24.0 to <=1.24.1 by @dependabot in #331
  • pip prod(deps): update pandas requirement from <=2.1.2,>=1.0.3 to >=1.0.3,<=2.1.3 by @dependabot in #332
  • pip dev(deps-dev): update sphinx-autodoc-typehints requirement from <=1.24.1 to <=1.25.2 by @dependabot in #333
  • pip dev(deps-dev): update sphinx-rtd-theme requirement from <=1.2.2 to <=2.0.0 by @dependabot in #338
  • pip prod(deps): update pandas requirement from <=2.1.3,>=1.0.3 to >=1.0.3,<=2.1.4 by @dependabot in #339
  • pip dev(deps-dev): update readthedocs-sphinx-ext requirement from <=2.2.3 to <=2.2.4 by @dependabot in #340
  • pip dev(deps-dev): update readthedocs-sphinx-ext requirement from <=2.2.4 to <=2.2.5 by @dependabot in #345
  • Bump tj-actions/changed-files from 35 to 41 in /.github/workflows by @dependabot in #347

Full Changelog: v0.3.3...v0.3.4

v0.3.3

12 Sep 10:37
b0a094c

What's Changed

  • Mainly updates and bug fixes to tl.clone_overlap and pl.clone_overlap.
  • Simplified pre-processing functions to call command-line tools instead of running them within the code.

Detailed notes:

  • Update docs for clone overlap by @zktuong in #276
  • Allow additional arguments in define_clones by @zktuong in #280
  • add an if statement to check if actor is dependabot by @zktuong in #289
  • pip dev(deps-dev): update sphinx-autodoc-typehints requirement from <=1.23.0 to <=1.23.3 by @dependabot in #284
  • pip dev(deps-dev): update sphinx-rtd-theme requirement from <=1.2.0 to <=1.2.2 by @dependabot in #285
  • pip dev(deps-dev): update readthedocs-sphinx-ext requirement from <=2.2.0 to <=2.2.2 by @dependabot in #286
  • pip dev(deps-dev): update nbsphinx requirement from <=0.9.1 to <=0.9.2 by @dependabot in #287
  • enable auto-merge for dependabot by @zktuong in #290
  • refactoring how external scripts and locations are called by @zktuong in #288
  • fix reassign_alleles by @zktuong in #293
  • remove deprecated function from docs by @zktuong in #297
  • pip dev(deps-dev): update sphinx-autodoc-typehints requirement from <=1.23.3 to <=1.24.0 by @dependabot in #296
  • fix weekly tests by @zktuong in #301
  • pip prod(deps): update mizani requirement from <0.10.0 to <0.11.0 by @dependabot in #302
  • add options to plotting clone overlap by @zktuong in #307
  • add requirements.txt by @zktuong in #309
  • should be cartesian product instead of combination by @zktuong in #312

Full Changelog: v0.3.2...v0.3.3

v0.3.2

29 May 00:10
19eaa63

What's Changed

Mainly to fix compatibility with dependencies.

Full Changelog: v0.3.1...v0.3.2

v0.3.1

06 Feb 11:29
878c1a0

What's Changed

Just to update PyPI - some bug fixes to accompany the revision.
Doesn't affect the container image (but I should add a tag on Sylabs to also call it 0.3.1, just to be consistent).

Full Changelog: v0.3.0...v0.3.1

v0.3.0

09 Nov 14:07

What's Changed

This release adds a number of new features and minor restructuring to accompany Dandelion's manuscript (uploading soon). Kudos to @suochenqu and @ktpolanski

  1. data strategy to handle non-productive contigs, partial contigs and 'J multi-mappers'
  2. new V(D)J pseudotime trajectory inference!
  3. revamped tutorials and documents

Full Changelog: v0.2.4...v0.3.0

v0.2.4

07 Jul 15:32
69ae2e2

What's Changed

New features

slicing functionality

  • the Dandelion object can now be sliced like an AnnData or pandas DataFrame!
    vdj[vdj.data['productive'] == 'T']
    Dandelion class object with n_obs = 38 and n_contigs = 94
        data: 'sequence_id', 'sequence', 'rev_comp', 'productive', 'v_call', 'd_call', 'j_call', 'sequence_alignment', 'germline_alignment', 'junction', 'junction_aa', 'v_cigar', 'd_cigar', 'j_cigar', 'stop_codon', 'vj_in_frame', 'locus', 'junction_length', 'np1_length', 'np2_length', 'cell_id', 'c_call', 'consensus_count', 'duplicate_count', 'rearrangement_status'
        metadata: 'locus_VDJ', 'locus_VJ', 'productive_VDJ', 'productive_VJ', 'v_call_VDJ', 'd_call_VDJ', 'j_call_VDJ', 'v_call_VJ', 'j_call_VJ', 'c_call_VDJ', 'c_call_VJ', 'junction_VDJ', 'junction_VJ', 'junction_aa_VDJ', 'junction_aa_VJ', 'v_call_B_VDJ', 'd_call_B_VDJ', 'j_call_B_VDJ', 'v_call_B_VJ', 'j_call_B_VJ', 'productive_B_VDJ', 'productive_B_VJ', 'v_call_abT_VDJ', 'd_call_abT_VDJ', 'j_call_abT_VDJ', 'v_call_abT_VJ', 'j_call_abT_VJ', 'productive_abT_VDJ', 'productive_abT_VJ', 'v_call_gdT_VDJ', 'd_call_gdT_VDJ', 'j_call_gdT_VDJ', 'v_call_gdT_VJ', 'j_call_gdT_VJ', 'productive_gdT_VDJ', 'productive_gdT_VJ', 'duplicate_count_B_VDJ', 'duplicate_count_B_VJ', 'duplicate_count_abT_VDJ', 'duplicate_count_abT_VJ', 'duplicate_count_gdT_VDJ', 'duplicate_count_gdT_VJ', 'isotype', 'isotype_status', 'locus_status', 'chain_status', 'rearrangement_status_VDJ', 'rearrangement_status_VJ'
    vdj[vdj.metadata['productive_VDJ'] == 'T']
    Dandelion class object with n_obs = 17 and n_contigs = 36
        data: (same columns as in the first example above)
        metadata: (same columns as in the first example above)
    vdj[vdj.metadata_names.isin(['cell1', 'cell2', 'cell3', 'cell4', 'cell5'])]
    Dandelion class object with n_obs = 5 and n_contigs = 20
    data: (same columns as in the first example above)
    metadata: (same columns as in the first example above)
    vdj[vdj.data_names.isin(['contig1','contig2','contig3','contig4','contig5'])]
    Dandelion class object with n_obs = 2 and n_contigs = 5
    data: (same columns as in the first example above)
    metadata: (same columns as in the first example above)
    • not sure implementing it like adata[:, adata.var.something] makes sense, as the columns in the .data slot aren't really per-observation information
    • also, the base slot in Dandelion is .data, so it doesn't make sense for .metadata to act as the 'row' axis
    • maybe scverse/scirpy#327 will come up with a better strategy that can be adopted later on
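Conceptually, each slice above filters the contig-level .data and then keeps only the cells that still have contigs. A minimal pandas sketch of that semantic (toy columns, not dandelion's actual implementation):

```python
import pandas as pd

# Toy contig table standing in for the .data slot (columns are illustrative).
data = pd.DataFrame(
    {
        "sequence_id": ["c1", "c2", "c3", "c4"],
        "cell_id": ["cellA", "cellA", "cellB", "cellC"],
        "productive": ["T", "F", "T", "F"],
    }
).set_index("sequence_id")

# Boolean slicing on .data keeps the matching contigs; the surviving
# cell-level .metadata is then derived from the remaining cell_ids.
sliced = data[data["productive"] == "T"]
surviving_cells = sliced["cell_id"].unique().tolist()
```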

ddl.pp.check_contigs

  • created a new function ddl.pp.check_contigs as a way to just check whether contigs are ambiguous, rather than outright removing them. I envisage that this will eventually replace simple mode in ddl.pp.filter_contigs.
    • new column in .data: ambiguous, T/F to indicate whether contig is considered ambiguous or not (different from cell level ambiguous).
    • the .metadata slot and several other functions ignore any contigs marked as T, to maintain the same behaviour
    • The largest difference between ddl.pp.check_contigs and ddl.pp.filter_contigs is that the onus is on the user to remove any 'bad' cells from the GEX data (illustrated in the tutorial) with check_contigs whereas this happens semi-automatically with filter_contigs.
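As a rough pandas sketch of the flag-don't-drop idea (toy data and simplified logic, not ddl.pp.check_contigs itself), one could mark all but the top-UMI contig per cell and locus as ambiguous while leaving every row in place:

```python
import pandas as pd

contigs = pd.DataFrame(
    {
        "cell_id": ["cellA", "cellA", "cellB"],
        "locus": ["IGH", "IGH", "IGH"],
        "umi_count": [10, 3, 7],
    }
)

# Flag rather than remove: any contig that is not the top-UMI contig
# for its (cell, locus) group gets ambiguous = "T".
top = contigs.groupby(["cell_id", "locus"])["umi_count"].transform("max")
contigs["ambiguous"] = (contigs["umi_count"] < top).map({True: "T", False: "F"})
```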

ddl.update_metadata now comes with a 'by_celltype' option

  • This brings a new feature - B cell, alpha-beta T cell and gamma-delta T cell associated columns for V,D,J,C and productive columns!
    • this is achieved through a new .retrieve_celltype subfunction in the Query class, which breaks up the retrieval into the 3 major groups if by_celltype = True.
    • No more need to guess which chain belongs to which cell type, and it allows for easy slicing! This does cause a bit of .obs bloating, though.
    • This also leads to the removal of constant_status_VDJ, constant_status_VJ, productive_status_VDJ and productive_status_VJ, as the metadata was getting bloated with the rework of the Dandelion metadata slot to account for the new B/abT/gdT columns
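A minimal sketch of how per-celltype columns could be derived from the locus (the mapping and pivot are illustrative, not the Query class internals):

```python
import pandas as pd

# Hypothetical locus -> major celltype grouping.
LOCUS_GROUP = {"IGH": "B", "IGK": "B", "IGL": "B",
               "TRA": "abT", "TRB": "abT",
               "TRG": "gdT", "TRD": "gdT"}

contigs = pd.DataFrame(
    {"cell_id": ["c1", "c1", "c2"],
     "locus": ["IGH", "IGK", "TRB"],
     "v_call": ["IGHV1-2", "IGKV1-5", "TRBV20-1"]}
)
contigs["group"] = contigs["locus"].map(LOCUS_GROUP)

# Pivot to per-celltype columns such as v_call_B / v_call_abT.
wide = contigs.pivot_table(index="cell_id", columns="group",
                           values="v_call", aggfunc="|".join)
wide.columns = [f"v_call_{g}" for g in wide.columns]
```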

tl.productive_ratio

  • Calculates a cell-level representation of productive vs non-productive contigs.
    • Plotting is achieved through pl.productive_ratio
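The calculation amounts to a per-group fraction of cells with productive contigs; a simplified pandas version of that idea (toy data, not the actual tl.productive_ratio implementation):

```python
import pandas as pd

cells = pd.DataFrame(
    {"group": ["naive", "naive", "memory", "memory"],
     "productive_VDJ": ["T", "F", "T", "T"]}
)

# Fraction of cells per group whose VDJ chain is productive.
ratio = (cells.assign(is_prod=cells["productive_VDJ"].eq("T"))
              .groupby("group")["is_prod"].mean())
```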

tl.vj_usage_pca

  • Computes PCA on a cell-level representation of V/J gene usage across designated groupings
    • uses scanpy.pp.pca internally
    • Plotting can be achieved through scanpy.pl.pca
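In spirit: build a group-by-V/J-gene usage matrix, then run PCA on it. A numpy sketch via SVD on the mean-centred matrix (the function itself calls scanpy.pp.pca; the data and names here are illustrative):

```python
import numpy as np

# Toy usage matrix: rows = cell groupings, columns = V/J gene frequencies.
usage = np.array([[0.6, 0.3, 0.1],
                  [0.5, 0.4, 0.1],
                  [0.1, 0.2, 0.7]])

# PCA via SVD on the mean-centred matrix (what PCA does under the hood,
# modulo scaling and solver choices).
centred = usage - usage.mean(axis=0)
u, s, vt = np.linalg.svd(centred, full_matrices=False)
pcs = u * s  # principal-component coordinates for each group
```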

bug fixes

  • fix cell ordering issue scverse/scirpy#347
  • small refactor of ddl.pp.filter_contigs
    • moved some of the repetitive loops into callable functions
    • deprecate the filter_vj_chains argument, replaced with filter_extra_vdj_chains and filter_extra_vj_chains to hopefully enable more interpretable behaviour. Fixes #158
    • the UMI adjustment step was buggy, but the behaviour is now consistent with how it functions in ddl.pp.check_contigs
  • rearrangement_status_VDJ and rearrangement_status_VJ (renamed from rearrangement_VDJ_status and rearrangement_VJ_status) now give a single value indicating whether a chimeric rearrangement occurred, e.g. TRDV pairing with TRAJ and TRAC, as in this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4267242/
  • fixed issues with progress bars getting out of hand
  • fixed issue with ddl.tl.find_clones crashing if more than one type of locus is found in the data.
    • now a B, abT and gdT prefix will be appended to BCR/TR-ab/TR-gd clones.
  • check_contigs, find_clones and define_clones were removing non-productive contigs even though there's no need to. This may cause issues with filter_contigs... but that's a problem for next time.
  • fix issue with min_size in the network not behaving as intended; switched to using connected components to find which nodes to trim
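The min_size fix uses connected components to decide which nodes to drop; a minimal pure-Python sketch of that trimming (hypothetical helper, not the generate_network code):

```python
def trim_small_components(edges, nodes, min_size):
    """Keep only nodes belonging to connected components with >= min_size members."""
    # Build an adjacency map, then BFS each unvisited node to find its component.
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    kept, seen = set(), set()
    for start in nodes:
        if start in seen:
            continue
        comp, queue = {start}, [start]
        seen.add(start)
        while queue:
            for nb in adj[queue.pop()]:
                if nb not in seen:
                    seen.add(nb)
                    comp.add(nb)
                    queue.append(nb)
        if len(comp) >= min_size:
            kept |= comp
    return kept

# "d" is a singleton component, so it is trimmed when min_size = 2.
kept = trim_small_components([("a", "b"), ("b", "c")], ["a", "b", "c", "d"], 2)
```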

other changes

  • new column chain_status, to summarise the reworked locus_status column.
    • Should contain values like ambiguous, Orphan VDJ, Single pair etc, similar to chain_pairing in scirpy.
  • Also fixed the ordering of metadata to make it more presentable, instead of just randomly slotting into the...

v0.2.3

27 Jun 13:52
801f5bd

Same as v0.2.2, but I seem to have messed up the upload to PyPI, so trying again.

What's Changed

Bug fixes and Improvements

  • Speed up generate_network
    • pair-wise hamming distance is now calculated per clone/clonotype, and only if more than one cell is assigned to the clone/clonotype
    • .distance slot is removed and is now directly stored/converted from the .graph slot.
    • new options:
      • compute_layout: bool = True. If the dataset is too large, compute_layout can be switched to False, in which case only the networkx graph is returned. The data can still be visualised later with scirpy's plotting method (see below).
      • layout_method: Literal['sfdp', 'mod_fr'] = 'sfdp'. New default uses the ultra-fast C++ implemented sfdp_layout algorithm in graph-tools to generate final layout. sfdp stands for Scalable Force Directed Placement.
        • Minor caveat is that the repulsion is not as good - when there are a lot of singleton nodes, they don't separate well unless you somehow work out which parameters in sfdp_layout to tweak to produce effective separation. Changing gamma alone doesn't really seem to do much.
        • The original layout can still be generated by specifying layout_method = 'mod_fr'. Requires a separate installation of graph-tool via conda (not managed by pip) as it has several C++ dependencies.
        • pytest on macOS may also stall because of a different backend being called - this is solved by changing tests that call generate_network to run last.
    • added steps to reduce memory hogging.
    • min_size was doing the opposite previously and this is now fixed. #155
  • Speed up transfer
    • Found a faster way to create the connectivity matrix.
    • this also now transfers a dictionary that scirpy can use to generate the plots scverse/scirpy#286
    • Fix #153
      • rename productive to productive_status.
  • Fix #154
    • reorder the if-else statements.
  • Speed up filter_contigs
    • tree construction is simplified and replaced for-loops with dictionary updates.
  • Speed up initialise_metadata. Dandelion should now initialise and read faster.
    • Removed an unnecessary data sanitization step when loading data.
    • Now load_data will rename umi_count to duplicate_count
    • Speed up Query
      • tree construction is simplified and replaced for-loops with dictionary updates.
      • didn't need to use an airr validator as that slows things down.
  • data initialised by Dandelion will be ordered by productive status first, followed by UMI count (largest to smallest).
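The generate_network speed-up above comes from restricting pairwise Hamming distances to clones that actually contain more than one cell; a pure-Python sketch of that restriction (toy data and a hypothetical helper, not the actual implementation):

```python
from itertools import combinations

def hamming(a, b):
    """Hamming distance between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def per_clone_distances(clones):
    """clones: dict mapping clone_id -> list of equal-length junction sequences.

    Distances are computed only within clones that have more than one
    member; singleton clones are skipped entirely, which is where the
    speed-up comes from.
    """
    out = {}
    for cid, seqs in clones.items():
        if len(seqs) > 1:
            out[cid] = [hamming(a, b) for a, b in combinations(seqs, 2)]
    return out

# "c2" is a singleton, so no distance is computed for it.
dists = per_clone_distances({"c1": ["ATG", "ATC"], "c2": ["GGG"]})
```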

Breaking Changes

  • initialise_metadata/update_metadata/Dandelion
    • For-loops to initialise the object have been vectorised, resulting in a minor speed upgrade
    • This results in the reduction of some columns in the .metadata which were probably bloated and not used.
      • vdj_status and vdj_status_summary removed and replaced with rearrangement_VDJ_status and rearrangement_VJ_status
      • constant_status and constant_summary removed and replaced with constant_VDJ_status and constant_VJ_status.
      • productive and productive_summary combined and replaced with productive_status.
      • locus_status and locus_status_summary combined and replaced with locus_status.
      • isotype_summary replaced with isotype_status.
  • where there was previously unassigned or '', the value has been changed to the string 'None' in .metadata.
    • Not changed to NoneType as there's quite a bit of internal text processing that gets messed up if swapped.
    • No_contig will still be populated after transfer to AnnData to reflect cells with no TCR/BCR info.
  • deprecate use of nxviz<0.7.4

Minor changes

  • Rename and deprecate read_h5/write_h5. Use of read_h5ddl/write_h5ddl will be enforced in the next update.

Full Changelog: v0.2.1...v0.2.2

v0.2.2

27 Jun 11:57
801f5bd

What's Changed

Release notes are identical to v0.2.3 above (both releases point to commit 801f5bd).
Full Changelog: v0.2.1...v0.2.2

v0.2.1

19 May 16:50
85d4fa0

What's Changed

Full Changelog: v0.2.0...v0.2.1