Releases · kundajelab/tfmodisco

28 Jan 22:03

v0.5.16.4.1

178db48

Actually fix corresponding to 0.5.16.4

Pre-release

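The fix in https://github.com/kundajelab/tfmodisco/releases/tag/v0.5.16.4 was reported not to work. It should have used min(perplexity, matrix.shape[0]-1) rather than min(perplexity, matrix.shape[0]), because the perplexity must be strictly less than the number of samples. This release applies the corrected fix, in light of the message on #112.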
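As a minimal sketch of the corrected check (the helper name clamp_perplexity is hypothetical and not part of tfmodisco; n_samples plays the role of matrix.shape[0]):

```python
def clamp_perplexity(perplexity, n_samples):
    """Return a perplexity that is strictly smaller than n_samples.

    Hypothetical helper for illustration only. t-SNE requires
    perplexity < n_samples, so the upper bound is n_samples - 1,
    not n_samples.
    """
    if n_samples < 2:
        raise ValueError("need at least 2 samples to pick a valid perplexity")
    return min(perplexity, n_samples - 1)


# A motif with 10 seqlets and a default perplexity of 30 gets perplexity 9.
print(clamp_perplexity(30, 10))
```

With the earlier min(perplexity, n_samples) version, a motif with exactly 10 seqlets would still have been given perplexity 10, which is not less than the number of samples.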

27 Jan 01:45

AvantiShri

v0.5.16.4

6cfa0dd

Fixing "perplexity must be less than n_samples"

Pre-release

Reported in issue #112

The bug was introduced by a feature added in a later release (subclustering within motifs, with visualization of the subclusters using t-SNE). The fix adds a check that lowers the perplexity below the default when the number of seqlets in a motif is less than the default perplexity.

18 May 22:19

AvantiShri

v0.5.16.3

01a92d0

Fix tsne sparse input matrix error

Pre-release

Corresponds to PR #108 by @akmorrow13

Contributors

akmorrow13

27 Jan 05:43

AvantiShri

v0.5.16.2

c1cbf7c

Bringing down Leiden memory use - patch 1

Pre-release

Corresponds to PR #99

Removed some tolist() commands that might have been contributing to memory explosion
More detailed printouts of memory usage
Made it possible to specify a different number of parallel runs for the main clustering step via the n_cores_mainclustering argument to TfModiscoSeqletsToPatternsFactory
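To illustrate why tolist() calls can inflate memory, here is a rough sketch using NumPy and sys.getsizeof. The array size is arbitrary, the function name is made up for this example, and the byte counts depend on the Python build, so treat it as an illustration rather than a measurement of tfmodisco itself.

```python
import sys

import numpy as np


def array_vs_list_bytes(n=100_000):
    """Compare the footprint of an int64 array with its tolist() copy.

    Illustrative only: the list estimate adds the list's own pointer
    storage to the size of each boxed Python int it refers to.
    """
    arr = np.arange(n, dtype=np.int64)
    as_list = arr.tolist()
    array_bytes = arr.nbytes  # 8 bytes per element, stored contiguously
    list_bytes = sys.getsizeof(as_list) + sum(sys.getsizeof(x) for x in as_list)
    return array_bytes, list_bytes


array_bytes, list_bytes = array_vs_list_bytes()
print(f"array: {array_bytes} bytes, list: ~{list_bytes} bytes")
```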

29 Nov 04:30

AvantiShri

v0.5.16.0

b136c20

Lower mem for agkm embeddings, pynnd option for coarse affmat

Pre-release

Added pynnd=True option to use PyNNDescent for coarse-grained affinity matrix computation (caveat: runs into a weird pickling error on Colab: lmcinnes/pynndescent#133)
Noticed that storing the agkm embeddings as [(agkm_string_representation, value), ...] seemed to take up a lot of space (possibly because representing the agkms as strings is space-consuming?). So now they get converted to [(agkm_idx, value), ...] before being stored. This seems to bring down the memory consumption.
Other minor changes pertaining to reporting some internal hit-scoring-related metrics (exclude_self excludes the self when benchmarking how well the fann_perclass (finegrained-affinity nearest-neighbors) method works for recovering the true class for motif hits, since the fine-grained affinity to the self is always 1; also added benchmarking of how well simply using aggregate similarity works)
Also did some reorganization of example notebooks that I mainly use to test out stuff - put some of the more experimental notebooks under "examples/simulated_TAL_GATA_deeplearning/other"
Updating Leiden version to avoid the segfault bug (vtraag/leidenalg#68)
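The string-to-index conversion of the agkm embeddings described above can be sketched as follows. This is an assumption-laden illustration, not the tfmodisco implementation: the function name index_embeddings, the vocabulary dict, and the example agkm strings are all hypothetical.

```python
def index_embeddings(embeddings):
    """Replace agkm strings with integer indices.

    embeddings: list of per-seqlet lists [(agkm_string, value), ...].
    Returns the re-encoded embeddings plus the string -> index mapping,
    so each distinct string is stored once instead of once per entry.
    """
    agkm_to_idx = {}
    indexed = []
    for embedding in embeddings:
        row = []
        for agkm, value in embedding:
            # Assign the next free index the first time a string is seen.
            idx = agkm_to_idx.setdefault(agkm, len(agkm_to_idx))
            row.append((idx, value))
        indexed.append(row)
    return indexed, agkm_to_idx


# Hypothetical agkm strings, for illustration only.
example = [[("A.C", 1.0), ("G..T", 0.5)], [("G..T", 2.0)]]
indexed, vocab = index_embeddings(example)
print(indexed)
print(vocab)
```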

29 Nov 07:05

AvantiShri

v0.5.15.1

cb2ec8e

Added CircleCI continuous integration

Pre-release

Corresponds to PR #98

Added circleci continuous integration
Removed the .travis.yml
Bumped the version from 0.5.16.0->0.5.16.1
Added a badge to the github readme
No tfmodisco code changes

20 Aug 21:22

AvantiShri

v0.5.15.0

7de50c1

Improvements to hit scoring

Pre-release

Corresponds to PR #94

Emphasis is now given to the core seqlet region when figuring out which motif the seqlet aligns to, such that the presence of alternative motifs in the flanks can't change the motif assignment. Also revamped how the fine-grained affinities are calculated (there is no coarse-grained calculation step; I just first align the core seqlet to the aggregate pattern, and then use that alignment to compute the fine-grained similarities to the constituent seqlets in the pattern; it seems substantially faster)

Also improved the seqlet identification method:

I switched to FixedWindow seqlet identification method, which is the same one used during the main modisco run, because the VariableWindow method was resulting in a lot of windows that were "tied" at an FDR of 0, even though some windows were much more high-scoring than others.
I made a small tweak to improve the overlap exclusion (need to exclude core_window_size-0.5 on either side...the reason is just a really involved detail to do with indexing math)
Put in a feature to allow for only returning positive-scoring seqlets; it is on by default.

Also, with the Leiden bugfix, I'm back to using movenodes in the refine partition

Also, the hits now return trim_start and trim_end, which are the start/end of the trimmed pattern (the user can specify the IC; default 0.3)
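One common way to trim a pattern by information content (IC) is sketched below. This is not taken from the tfmodisco source: the function names, the uniform background, the pseudocount, and the half-open (start, end) convention are assumptions made for illustration. Only the 0.3 default threshold comes from the note above.

```python
import numpy as np


def per_position_ic(ppm, pseudocount=0.001):
    """Per-position information content (bits) against a uniform background."""
    ppm = np.asarray(ppm, dtype=float) + pseudocount
    ppm = ppm / ppm.sum(axis=1, keepdims=True)
    background = 1.0 / ppm.shape[1]
    return (ppm * np.log2(ppm / background)).sum(axis=1)


def trim_by_ic(ppm, ic_threshold=0.3):
    """Return (trim_start, trim_end) spanning the first to last position
    whose IC meets the threshold; trim_end is exclusive (an assumption).
    Returns None if no position passes."""
    passing = np.nonzero(per_position_ic(ppm) >= ic_threshold)[0]
    if passing.size == 0:
        return None
    return int(passing[0]), int(passing[-1]) + 1


# Toy pattern: two uninformative flank positions on each side of a 3-bp core.
flank = [0.25, 0.25, 0.25, 0.25]
core = [0.97, 0.01, 0.01, 0.01]
toy_ppm = [flank, flank, core, core, core, flank, flank]
print(trim_by_ic(toy_ppm))
```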

12 Jul 13:15

AvantiShri

v0.5.14.2

77e89d0

Added final flank expansion functionality back in

Pre-release

Corresponds to PR #93; version 0.5.14.0 accidentally removed the final motif flank expansion that was controlled by the parameter "final_flank_to_add", such that the flank expansion was effectively 0 (note: this only affected the flank expansion that was done at the very end of the tfmodisco pipeline; there is still flank expansion controlled by the parameter "initial_flank_to_add"). I added the functionality for final flank expansion back in, and for backwards compatibility with version 0.5.14.0 I have set the default value of final_flank_to_add to be 0 (previously, it was 10; a default of 0 is actually probably better from a user perspective, because sometimes users run tf-modisco on very short sequences, and having a large final_flank_to_add can cause many seqlets to get discarded when the expansion extends beyond the end of the sequence). I also cleaned up some of the notebooks.
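The effect of a large final flank on short sequences can be sketched as below. This is a simplified illustration, not the tfmodisco code: the function expand_seqlets and the (start, end) tuple representation are hypothetical, and only the idea that an expanded seqlet is discarded when it runs past the sequence end comes from the note above.

```python
def expand_seqlets(seqlets, flank, seq_len):
    """Expand each (start, end) seqlet by `flank` on both sides and keep
    only those that still fit inside a sequence of length `seq_len`."""
    kept = []
    for start, end in seqlets:
        new_start, new_end = start - flank, end + flank
        if new_start >= 0 and new_end <= seq_len:
            kept.append((new_start, new_end))
    return kept


# Three seqlets on a 100-bp sequence.
seqlets = [(5, 25), (40, 60), (80, 100)]
print(expand_seqlets(seqlets, flank=10, seq_len=100))  # only the middle one survives
print(expand_seqlets(seqlets, flank=0, seq_len=100))   # all three are kept
```

In this toy case a flank of 10 discards two of the three seqlets, while the new default of 0 keeps them all.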
