Releases: msmbuilder/msmbuilder
MSMBuilder 3.8.0
We're pleased to announce the release of MSMBuilder 3.8. This release features updates and improvements to contact featurizers, kernel tICA, HMMs, and preprocessing. There are also some bugfixes and API hygiene improvements. We recommend all users upgrade to MSMBuilder 3.8.
New Features
- `ContactFeaturizer` now lets you use a `soft_min` option for closest contact distances.
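To illustrate what a soft minimum does for closest-contact distances, here is a minimal sketch. It assumes the log-sum-exp form `beta / log(sum_i exp(beta / d_i))`; that formula, the function name `soft_min`, and the default `beta=20.0` are assumptions made for this example and are not taken from these release notes.

```python
import math

def soft_min(distances, beta=20.0):
    """Smooth approximation to min(distances): beta / log(sum_i exp(beta / d_i)).

    The formula and the default beta are assumptions for illustration only.
    """
    # Shift by the largest exponent (log-sum-exp trick) so that very small
    # distances do not overflow math.exp().
    scaled = [beta / d for d in distances]
    shift = max(scaled)
    log_sum = shift + math.log(sum(math.exp(s - shift) for s in scaled))
    return beta / log_sum

atom_pair_distances = [0.30, 0.50, 1.00]  # hypothetical distances in nm
print(min(atom_pair_distances), soft_min(atom_pair_distances))
```

Unlike a hard minimum, this quantity changes smoothly as the identity of the closest atom pair changes, and it never exceeds the true minimum.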
Improvements
- The `stride` parameter in `KernelTICA` now works as intended to automatically generate a set of landmark points (gh-972).
- The `contacts` parameter in `CommonContactFeaturizer` now behaves like the `contacts` parameter of the regular `ContactFeaturizer`, except that all contacts are validated first.
- `GaussianHMM` and `VonMisesHMM` are now compatible with `sklearn.pipeline.Pipeline` workflows (gh-980).
- `msmbuilder.preprocessing` is now compatible with `sklearn.pipeline.Pipeline` workflows (gh-987).
- Fixed error in pickling HMMs (gh-996).
MSMBuilder 3.7.0
We're pleased to announce the release of MSMBuilder 3.7. This release introduces several new featurizers that can handle multiple sequences or multiple chains within a topology file. There are also some bugfixes and API hygiene improvements. We recommend all users upgrade to MSMBuilder 3.7.
API Changes
- `TrajFeatureUnion` and `SubsetFeatureUnion` have been removed due to incompatibilities with the `scikit-learn` API.
New Features
- `KSparseTICA` lets you specify the number of non-zero entries, `k`, rather than a regularization strength (gh-916).
- `BootStrapMarkovStateModel` optionally saves all the models that it generates (gh-919).
- `tICA` supports commute mapping (see 10.1021/acs.jctc.6b00762) (gh-925).
- `CommonContactFeaturizer` featurizes different trajectories with different topologies using a common set of inter-residue contacts (gh-876).
- `msmbuilder.tpt.mfpt.mfpts` can now compute distributions of MFPTs, accounting for the model error due to finite sampling.
- Three new featurization schemes for protein-ligand trajectories are now available: `LigandContactFeaturizer`, `BinaryLigandContactFeaturizer`, and `LigandRMSDFeaturizer` (gh-883).
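As a rough illustration of how commute mapping differs from the kinetic mapping option mentioned elsewhere in these notes, the sketch below computes the per-component scale factors in their textbook form: kinetic mapping scales each tIC by its eigenvalue, while commute mapping scales it by `sqrt(t_i / 2)`, where `t_i` is the implied timescale. The function names and eigenvalues are hypothetical, and the library's implementation may differ in detail (for example by regularizing the timescales).

```python
import math

def implied_timescales(eigenvalues, lag_time):
    """t_i = -lag_time / ln|lambda_i| for each non-stationary tICA eigenvalue."""
    return [-lag_time / math.log(abs(lam)) for lam in eigenvalues]

def kinetic_map_scales(eigenvalues):
    """Kinetic mapping: scale the i-th tIC by its eigenvalue."""
    return list(eigenvalues)

def commute_map_scales(eigenvalues, lag_time):
    """Commute mapping: scale the i-th tIC by sqrt(t_i / 2)."""
    return [math.sqrt(t / 2.0) for t in implied_timescales(eigenvalues, lag_time)]

eigenvalues = [0.9, 0.5, 0.1]  # hypothetical tICA eigenvalues, 0 < lambda < 1
print(kinetic_map_scales(eigenvalues))
print(commute_map_scales(eigenvalues, lag_time=1.0))
```

Both schemes weight slow components more heavily than fast ones; commute mapping does so through the timescales rather than the raw eigenvalues.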
Improvements
MSMBuilder 3.6.1
MSMBuilder 3.6
We're pleased to announce the release of MSMBuilder 3.6. This release
introduces project templating and a whole host of new sklearn
estimators.
There are also some bugfixes and API hygiene improvements. We recommend all
users upgrade to MSMBuilder 3.6.
API Changes
- `version.short_version` is now 3.y instead of 3.y.z (gh-829).
- `weighted_transform` is no longer supported in tICA methods (gh-807). Please use `kinetic_mapping`.
- The cached filenames and formats for DoubleWell, QuadWell, and MullerPotential example datasets have changed. The API through `msmbuilder.example_datasets` is still the same, but the data may be re-generated instead of using a cached version from a previous installation of MSMBuilder (gh-854).
- The alias for Ward clustering has been removed. Modelers should now use `LandmarkAgglomerative(linkage='ward')` (gh-874). Ward clustering is also available in `AgglomerativeClustering`, but without a prediction algorithm.
New Features
- `Butterworth`, `DoubleEWMA`, `StandardScaler`, `RobustScaler` are available via the command line (gh-895).
- `BinaryContactFeaturizer` featurizes a trajectory into a boolean array corresponding to whether each residue-residue distance is below a cutoff (gh-798).
- `LogisticContactFeaturizer` produces a logistic transform of residue-residue distances about a center distance (gh-798).
- `FactorAnalysis`, `FastICA`, and `KernelPCA` are available in the `decomposition` module (gh-807).
- `Butterworth`, `EWMA`, and `DoubleEWMA` are available in the `preprocessing` module (gh-818).
- We encourage users to download the `msmb_data` conda package to easily install example data. The data can be loaded through existing methods in `msmbuilder.example_datasets` (gh-854, gh-867).
- An example dataset `MinimalFsPeptide` is available. This is a strided version of the existing `FsPeptide` dataset. We use it for testing, when a fully-converged dataset is not required (gh-867).
- Project templates! Read the new tutorial or the `io` page for details (gh-768).
- `LandmarkAgglomerative` clustering now features the `ward` linkage option. An algorithm for predicting cluster assignments with the `ward` objective function has been developed and implemented (gh-874).
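The difference between the binary and logistic contact featurizations described above can be sketched in a few lines. This is only an illustration of the idea: the function names, the `steepness` parameter, and the sign convention of the logistic (close to 1 for short distances) are assumptions, not the library's confirmed behavior.

```python
import math

def binary_contacts(distances, cutoff):
    """True where a residue-residue distance is below the cutoff."""
    return [d < cutoff for d in distances]

def logistic_contacts(distances, center, steepness):
    """Smooth contact score: near 1 well below `center`, near 0 well above it.

    The sign convention and the `steepness` parameter are assumptions.
    """
    return [1.0 / (1.0 + math.exp(steepness * (d - center))) for d in distances]

frame = [0.35, 0.80, 1.50]  # hypothetical residue-residue distances in nm
print(binary_contacts(frame, cutoff=0.8))
print(logistic_contacts(frame, center=0.8, steepness=20.0))
```

The logistic form keeps information about how close a pair is to the threshold, which a hard cutoff discards.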
Improvements
- Removed a unicode character from `ktica.py` (gh-833).
- `msmbuilder.decomposition.KernelTICA` now includes all parameters in its `__init__`, making it compatible with Osprey (gh-823).
- `msmbuilder.tpt` methods can now handle `BayesianMarkovStateModel` objects as input. Please note that we still do not recommend using this module with `BootStrapMarkovStateModel`.
MSMBuilder 3.5
We're pleased to announce the release of MSMBuilder 3.5. This release wraps more relevant sklearn
estimators and transformers. There are also some bugfixes and API hygiene improvements. We recommend all users upgrade to MSMBuilder 3.5.
API Changes
- `msmbuilder.featurizer.FeatureUnion` is now deprecated. Please use `msmbuilder.feature_selection.FeatureSelector` instead (#799).
- `msmbuilder.feature_extraction` has been added to conform to the `scikit-learn` API. This is essentially an alias of `msmbuilder.featurizer` (#799).
New Features
- `KernelTICA`, `Nystroem`, and `LandmarkNystroem` are available in the `decomposition` module (#807).
- `FeatureSelector` and `VarianceThreshold` are available in the `feature_selection` module (#799).
- `SparsePCA` and `MiniBatchSparsePCA` are available in the `decomposition` module (#791).
- `Binarizer`, `FunctionTransformer`, `Imputer`, `KernelCenterer`, `LabelBinarizer`, `MultiLabelBinarizer`, `MinMaxScaler`, `MaxAbsScaler`, `Normalizer`, `RobustScaler`, `StandardScaler`, and `PolynomialFeatures` are available in the `preprocessing` module (#796).
Improvements
MSMBuilder 3.4
We're pleased to announce MSMBuilder 3.4. It contains a plethora of new
features, bug fixes, and improvements.
API Changes
- Range-based slicing on dataset objects is no longer allowed. Keys in the dataset object don't have to be continuous. The empty slice, e.g. `ds[:]`, loads all trajectories in a list (#610).
- Ward clustering has been renamed AgglomerativeClustering in scikit-learn. Please use the new msmbuilder wrapper class `AgglomerativeClustering`. An alias for Ward has been made available (#685).
- `PCCA.trimmed_microstates_to_macrostates` has been removed. This dictionary was actually keyed by untrimmed microstate labels. `PCCA.transform` would throw an exception when operating on a system with trimming because it was using this misleading dictionary. Please use `pcca.microstate_mapping_` for this functionality (#709).
- `UnionDataset` has been removed after deprecation in 3.3. Please use `FeatureUnion` instead (#671).
- `SubsetFeaturizer` and its ilk have been removed from the `msmbuilder.featurizer` namespace. Please import them from `msmbuilder.featurizer.subset` (#738).
- `FirstSlicer` has been removed. Use `Slicer(first=x)` for the same functionality (#738).
- `msmbuilder.featurizer.load` has been removed. `Featurizer.save` has been removed. Please use `utils.load` and `utils.dump` (#738).
New Features
- Dataset objects can call `fit_transform_with()` to simplify the common pattern of applying an estimator to a dataset object to produce a new dataset object (#610).
- `kinetic_mapping` is a new option to `tICA`. It's similar to `weighted_transform`, but based on a better theoretical framework. `weighted_transform` is deprecated (#766).
- `VonMisesFeaturizer` uses soft bins around the unit-circle to give an alternate representation of dihedral angles (#744).
- `MarkovStateModel` has a `partial_transform()` method (#707).
- `KappaAngleFeaturizer` is available via the command line (#681).
- `MarkovStateModel` has a new attribute, `percent_retained_`, for ergodic trimming (#689).
- `AlphaAngleFeaturizer` computes the dihedral angles between alpha carbons (#691).
- `FunctionFeaturizer` computes features based on an arbitrary Python function or callable (#717).
- Automatic State Partitioning (APM) uses kinetic information to cluster conformations (#748).
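For readers unfamiliar with alpha-carbon dihedrals, the following sketch shows the underlying geometry: one signed dihedral angle for each run of four consecutive alpha-carbon positions. The helper names are hypothetical, the sign convention is one of several in use, and this is not the library's implementation.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in radians defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    b2_len = math.sqrt(_dot(b2, b2))
    # The atan2 form avoids the precision loss of acos near 0 and pi.
    return math.atan2(b2_len * _dot(b1, n2), _dot(n1, n2))

def alpha_carbon_dihedrals(ca_positions):
    """One dihedral per run of four consecutive alpha-carbon positions."""
    return [dihedral(*ca_positions[i:i + 4]) for i in range(len(ca_positions) - 3)]

# Hypothetical coordinates for five alpha carbons.
chain = [(1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)]
print(alpha_carbon_dihedrals(chain))
```

A chain of N alpha carbons yields N - 3 such angles per frame.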
Improvements
- Consistent counts setup and ergodic cutoff across various flavors of Markov models (#718, #729, #701, #705).
- Tests no longer depend on `sklearn.hmm`, which has been removed (#690).
- Improvements to `RMSDFeaturizer` (#695, #764).
- `SparseTICA` is completely re-written with large performance improvements when dealing with large numbers of features (#704).
- Links for downloading example data are un-broken after figshare changed URLs (#751).
MSMBuilder v3.3.1
This point release is for compatibility with scikit-learn version 0.17.
- Ward clustering has been renamed AgglomerativeClustering in scikit-learn.
Please use the new msmbuilder wrapper class AgglomerativeClustering. An
alias for Ward has been made available.
MSMBuilder v3.3.0
We're pleased to announce the release of MSMBuilder v3.3.0. The focus of this
release is a completely re-written module for constructing HMMs as well as bug
fixes and incremental improvements.
API Changes
- `FeatureUnion` is an estimator that deprecates the functionality of `UnionDataset`. Passing a list of paths to `dataset()` will no longer automatically yield a `UnionDataset`. This behavior is still available by specifying `fmt="dir-npy-union"`, but is deprecated (#611).
- The command line flag for featurizers `--out` (deprecated in 3.2) now saves the featurizer as a pickle file (#546). Please use `--transformed` for the old behavior. This is consistent with other command-line commands.
- The default number of timescales in `MarkovStateModel` is now one less than the number of states (was 10). This addresses some bugs with `implied_timescales` and PCCA(+) (#603).
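The reason an n-state model has at most n - 1 implied timescales can be seen in the smallest case. The sketch below uses a hypothetical two-state transition matrix: one eigenvalue is always 1 (the stationary process) and has no finite timescale, leaving a single relaxation timescale `-lag_time / ln|lambda|`. The function name and probabilities are made up for the example.

```python
import math

def two_state_timescales(p_ab, p_ba, lag_time=1.0):
    """Implied timescales of the matrix [[1 - p_ab, p_ab], [p_ba, 1 - p_ba]]."""
    # A 2x2 row-stochastic matrix has eigenvalues 1 and 1 - p_ab - p_ba.
    eigenvalues = [1.0, 1.0 - p_ab - p_ba]
    # Only eigenvalues with 0 < |lambda| < 1 give a finite, defined timescale.
    return [-lag_time / math.log(abs(lam))
            for lam in eigenvalues if 0.0 < abs(lam) < 1.0]

print(two_state_timescales(0.1, 0.2))  # one timescale for two states
```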
New Features
- `GaussianHMM` and `VonMisesHMM` are rewritten to feature higher code reuse and code quality (#583, #582, #584, #572, #570).
- `KDTree` can find n nearest points to e.g. a cluster center (#599).
- `Slicer` featurizer can slice feature arrays as part of a pipeline (#567).
Improvements
- `PCCAPlus` is compatible with scipy 0.16 (#620).
- Documentation improvements (#618, #608, #604, #602)
- Test improvements, especially for Windows (#593, #590, #588, #579, #578, #577, #576)
- Bug fix: `MarkovStateModel.sample()` produced trajectories of incorrect length. This function is still deprecated (#556).
- Bug fix: The muller example dataset did not respect users' specifications for initial coordinates (#631).
- `MarkovStateModel.draw_samples` failed if discrete trajectories did not contain every possible state (#638). Function can now accept a single trajectory, as well as a list of them.
- `SuperposeFeaturizer` now respects the topology argument when loading the reference trajectory (#555).
MSMBuilder v3.2.0
We're pleased to announce the release of MSMBuilder v3.2.0. With this release:
- `tICA` ignores too-short trajectories during fitting instead of raising an exception
- New methods for sampling from MSM models have been added
- Datasets can be opened in "append" mode
- Compatibility with sklearn 0.16 was addressed
- `utils.dump` saves using the pickle protocol. `utils.load` is backwards compatible.
- The command line flag for featurizers `--out` is deprecated. Use `--transformed` instead. This is consistent with other command-line commands.
- Various bug fixes were addressed
MSMBuilder 3.1.0
v3.1 (Feb 27, 2015)
- Numerous improvements to `ContinuousTimeMSM` optimization
- Switch `ContinuousTimeMSM.score` to transmat-style GMRQ
- New example dataset with Muller potential
- Assorted bug fixes in the command line layer