0.17rc (#317)
* [nightly] Increase version to 0.15.0.dev64

* Updates for contrastive model saving.

* [nightly] Increase version to 0.15.0.dev65

* [nightly] Increase version to 0.15.0.dev66

* [nightly] Increase version to 0.15.0.dev67

* [nightly] Increase version to 0.15.0.dev68

* [nightly] Increase version to 0.15.0.dev69

* [nightly] Increase version to 0.15.0.dev70

* [nightly] Increase version to 0.15.0.dev71

* Update losses to use Loss reduction.

Losses previously computed the mean loss over the examples within the call() method, which can cause issues with multi-GPU training. The call() method now returns the per-example loss, and the final loss is computed using the losses.Loss reduction method.

We also updated the from_config() method to include the parent class's reduction and name args.
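
For context, a minimal sketch of this pattern (an illustrative loss, not one of the library's actual classes): call() returns a per-example tensor and the parent Loss applies the configured reduction, which keeps the result correct under multi-GPU distribution strategies.

    import tensorflow as tf

    class ExampleSimilarityLoss(tf.keras.losses.Loss):
        """Illustrative loss whose call() returns per-example values."""

        def __init__(self, margin=0.1, reduction=tf.keras.losses.Reduction.AUTO, name=None):
            super().__init__(reduction=reduction, name=name)
            self.margin = margin

        def call(self, y_true, y_pred):
            # Return a [batch_size] tensor; the parent class applies the
            # configured reduction (e.g. SUM_OVER_BATCH_SIZE) afterwards.
            per_example = tf.maximum(y_pred - self.margin, 0.0)
            return tf.reduce_mean(per_example, axis=-1)

        def get_config(self):
            # super().get_config() contributes the parent's reduction and
            # name args, matching the from_config() update described above.
            config = super().get_config()
            config.update({"margin": self.margin})
            return config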

* Resnet18 now returns a SimilarityModel.

We may want Resnet18 as a regular model, but we keep the output type as
SimilarityModel to avoid mixed output types.

* Fix various mypy and linter errors.

* Add support for contrastive_model save and load.

* Update unsupervised notebook with save and load.

* Update the save and load.

Add updated example and docs for save and load in the supervised hello world.

* Updates to visualization notebook.

* [nightly] Increase version to 0.15.0.dev72

* Unsupervised notebook update.

* [nightly] Increase version to 0.15.0.dev73

* [nightly] Increase version to 0.15.0.dev74

* [nightly] Increase version to 0.15.0.dev75

* [nightly] Increase version to 0.15.0.dev76

* Notes on the unsupervised notebook draft.

* [nightly] Increase version to 0.15.0.dev77

* [nightly] Increase version to 0.15.0.dev78

* [nightly] Increase version to 0.15.0.dev79

* Remove get_backbone() method and just have users access the backbone attribute directly.

* Add new diagrams and updated copy to the unsupervised notebook.

* [nightly] Increase version to 0.15.0.dev80

* [nightly] Increase version to 0.15.0.dev81

* First finished draft of unsupervised_hello_world notebook

* Updates to the README file. Add self-supervised info.

* [nightly] Increase version to 0.15.0.dev82

* [nightly] Increase version to 0.15.0.dev83

* Update README.md

* Remove augmentation arg from architectures.

Architectures previously took a callable stack of augmentation layers
that would be added after the input of the model. This could cause
issues with saving and training on TPU. Users are now expected to add
augmentation to either the data samplers / datasets or manually add it
to the model.
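
A hedged sketch of the two alternatives (shapes and layer choices are illustrative):

    import tensorflow as tf

    # Option 1: apply augmentation in the data pipeline.
    def augment(img, label):
        img = tf.image.random_flip_left_right(img)
        return img, label

    train_ds = tf.data.Dataset.from_tensor_slices(
        (tf.zeros([8, 32, 32, 3]), tf.zeros([8], tf.int32))
    ).map(augment).batch(4)

    # Option 2: manually prepend augmentation layers to the model.
    backbone = tf.keras.Sequential([
        tf.keras.layers.Conv2D(8, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
    ])
    inputs = tf.keras.Input(shape=(32, 32, 3))
    x = tf.keras.layers.RandomFlip("horizontal")(inputs)
    model = tf.keras.Model(inputs, backbone(x))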

* Clean up example dir.

* Fix flake8 errors in architectures.

* Update API docs.

* Bump version to 0.15.0

* Bump minor version to 0.16.0.dev0

* [nightly] Increase version to 0.16.0.dev1

* [nightly] Increase version to 0.16.0.dev2

* [nightly] Increase version to 0.16.0.dev3

* Distance and losses refactor (#222)

* refactor distances call signature and add appropriate tests

* refactor metrics for new distance call signature

* make similarity losses compatible with asymmetric and non-square distance matrices

* adapt and add test

* remove newline

* [nightly] Increase version to 0.16.0.dev4

* [nightly] Increase version to 0.16.0.dev5

* [nightly] Increase version to 0.16.0.dev6

* [nightly] Increase version to 0.16.0.dev7

* [nightly] Increase version to 0.16.0.dev8

* Cross-batch memory (XBM) (#225)

* initiate XBM loss

* add todo

* add XBM tests

* WIP: XBM serialization

* XBM serialization

* class docstring

* remove todo

* improve docstring

* remove comment

* [nightly] Increase version to 0.16.0.dev9

* [nightly] Increase version to 0.16.0.dev10

* [nightly] Increase version to 0.16.0.dev11

* [nightly] Increase version to 0.16.0.dev12

* [nightly] Increase version to 0.16.0.dev13

* [nightly] Increase version to 0.16.0.dev14

* [nightly] Increase version to 0.16.0.dev15

* [nightly] Increase version to 0.16.0.dev16

* [nightly] Increase version to 0.16.0.dev17

* [nightly] Increase version to 0.16.0.dev18

* [nightly] Increase version to 0.16.0.dev19

* [nightly] Increase version to 0.16.0.dev20

* [nightly] Increase version to 0.16.0.dev21

* [nightly] Increase version to 0.16.0.dev22

* Augmentor for Barlow Twins (#229)

* Use list(range()) instead of comprehension as it is more pythonic.

* Create barlow.py

* Bump three in /tensorflow_similarity/visualization/projector_v2 (#228)

Bumps [three](https://github.com/mrdoob/three.js) from 0.132.2 to 0.137.0.
- [Release notes](https://github.com/mrdoob/three.js/releases)
- [Commits](https://github.com/mrdoob/three.js/commits)

---
updated-dependencies:
- dependency-name: three
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Restructure class to be like Augmenter

* Minor fixing of dead links (#230)

* Fixed dead links

* augmenter main to master

* Spelling changes Auto Augment

* MixupAndCutmix main to master

* RandAugment main to master

* RandomErasing main to master

* Update SimCLRAugmenter.md

* Update ClassificationMatch.md

* Update ClassificationMetric.md

* Update Evaluator.md

* Update MemoryEvaluator.md

* Update SimilarityModel.md

* Update BinaryAccuracy.md

* Update F1Score.md

* Update FalsePositiveRate.md

* Update NegativePredictiveValue.md

* Update Precision.md

* Update Recall.md

Co-authored-by: Owen Vallis <owensvallis@gmail.com>

* Fix minor typos (#226)

Co-authored-by: Owen Vallis <owensvallis@gmail.com>

* Update barlow.py

* Update barlow.py

* Update setup.py

* Update barlow.py

* Update barlow.py

* Update barlow.py

* Update barlow.py

* Update barlow.py

* revisions

* Update __init__.py

* Update __init__.py

* Update color_jitter.py

* Update barlow.py

* Update barlow.py

* Update barlow.py

* Update setup.py

Co-authored-by: Owen S Vallis <ovallis@google.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Owen Vallis <owensvallis@gmail.com>
Co-authored-by: Genrry Hernandez <genrryhernandez@gmail.com>

* Fixed some bugs in augmenter. (#232)

* Create barlow.py

* Restructure class to be like Augmenter

* Update barlow.py

* Update barlow.py

* Update setup.py

* Update barlow.py

* Update barlow.py

* Update barlow.py

* Update barlow.py

* Update barlow.py

* revisions

* Update __init__.py

* Update __init__.py

* Update color_jitter.py

* Update barlow.py

* Update barlow.py

* Update barlow.py

* Update setup.py

* fixed some bugs

* Remove seed instance variable

Co-authored-by: Owen Vallis <owensvallis@gmail.com>

* [nightly] Increase version to 0.16.0.dev23

* [nightly] Increase version to 0.16.0.dev24

* [nightly] Increase version to 0.16.0.dev25

* [nightly] Increase version to 0.16.0.dev26

* [nightly] Increase version to 0.16.0.dev27

* [nightly] Increase version to 0.16.0.dev28

* [nightly] Increase version to 0.16.0.dev29

* [nightly] Increase version to 0.16.0.dev30

* [nightly] Increase version to 0.16.0.dev31

* [nightly] Increase version to 0.16.0.dev32

* [nightly] Increase version to 0.16.0.dev33

* [nightly] Increase version to 0.16.0.dev34

* [nightly] Increase version to 0.16.0.dev35

* [nightly] Increase version to 0.16.0.dev36

* [nightly] Increase version to 0.16.0.dev37

* [nightly] Increase version to 0.16.0.dev38

* [nightly] Increase version to 0.16.0.dev39

* [nightly] Increase version to 0.16.0.dev40

* [nightly] Increase version to 0.16.0.dev41

* [nightly] Increase version to 0.16.0.dev42

* [nightly] Increase version to 0.16.0.dev43

* [nightly] Increase version to 0.16.0.dev44

* [nightly] Increase version to 0.16.0.dev45

* [nightly] Increase version to 0.16.0.dev46

* Added test coverage for augmentation functions + barlow, simCLR augmenter  (#235)

* Create test_blur.py

* Create test_color_jitter.py

* Create test_crop.py

* Create test_flip.py

* Update test_crop.py

* Update test_color_jitter.py

* Create test_solarize.py

* Create test_augmenters.py

* Update test_flip.py

* Update test_flip.py

* Update test_flip.py

* Update blur.py

* Update blur.py

* [nightly] Increase version to 0.16.0.dev47

* Change augmenters to use augmentation_utils (#238)

* Fix corrupted JSON formatting in unsupervised notebook.

* Added features of SplitValidationLoss callback to EvalCallback (#242)

* Added features of SplitValidationLoss callback to EvalCallback

Merged SplitValidationLoss into EvalCallback

* Refactored EvalCallback using utils.unpack_results

* [nightly] Increase version to 0.16.0.dev48

* [nightly] Increase version to 0.16.0.dev49

* [nightly] Increase version to 0.16.0.dev50

* VicReg Loss - Improvement of Barlow Twins (#243)

* VicReg Loss

* Update vicreg.py

* Update vicreg.py

* Update vicreg.py

* fix big bug

* Update vicreg.py

* Update vicreg.py

* fixes

* Update vicreg.py

* [nightly] Increase version to 0.16.0.dev51

* [nightly] Increase version to 0.16.0.dev52

* Update tests for algebra.py

* Coverage now at 100%
* Convert tests to use tf.test.TestCase

* [nightly] Increase version to 0.16.0.dev53

* [nightly] Increase version to 0.16.0.dev54

* Fix corrupted formatting in visualization notebook.

* [bug] Fix multisim loss offsets.

The tfsim version of multisim uses distances instead of the inner
product. However, multisim requires that we "center" the pairwise
distances around 0. Here we add a new center param, which we set to 1.0
for cosine distance. Additionally, we flip the lambda (lmda) param
to add the threshold to the values instead of subtracting it. These
changes help improve the pos and neg weighting in the log1psumexp.
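
Roughly, the centered terms follow the multi-similarity paper with s = center - d (a per-anchor sketch; the library's exact handling of lmda and its stable log1p-sum-exp differ in detail):

    import tensorflow as tf

    def multisim_terms(pos_dists, neg_dists, alpha=2.0, beta=40.0, lmda=0.5, center=1.0):
        # Map distances to similarity-like values; center=1.0 for cosine
        # distance, whose range is [0, 2].
        pos_sim = center - pos_dists
        neg_sim = center - neg_dists
        # Positives are weighted by how far they fall below the threshold,
        # negatives by how far they rise above it.
        pos_term = tf.math.log1p(tf.reduce_sum(tf.exp(-alpha * (pos_sim - lmda)))) / alpha
        neg_term = tf.math.log1p(tf.reduce_sum(tf.exp(beta * (neg_sim - lmda)))) / beta
        return pos_term + neg_term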

* [nightly] Increase version to 0.16.0.dev55

* [bug] In losses.utils.logsumexp() tf.math.log(1 + x) should be
tf.math.log(tf.math.exp(-my_max) + x). This is needed to properly
account for removing the rowwise max before computing the logsumexp.
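
A minimal sketch of the corrected identity, assuming the utility computes log(1 + sum(exp(x))) with the rowwise max factored out:

    import tensorflow as tf

    def log1p_sum_exp(x, axis=-1):
        # log(1 + sum(exp(x))) = my_max + log(exp(-my_max) + sum(exp(x - my_max)))
        # Factoring exp(my_max) out of the sum turns the implicit "+1" term
        # into exp(-my_max), hence the fix described above.
        my_max = tf.reduce_max(x, axis=axis, keepdims=True)
        sum_exp = tf.reduce_sum(tf.exp(x - my_max), axis=axis, keepdims=True)
        return tf.squeeze(my_max + tf.math.log(tf.exp(-my_max) + sum_exp), axis)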

* Make Augmentation Utilities More Customizable (reupload due to branch issues) (#255)

* modifications of benchmark

* test commit 123

* new changes to training

* testo changes

* works in colab... kind of

* code is neat now

* working on sampler problem

* Update barlow.py

* Update blur.py

* Update color_jitter.py

* Update color_jitter.py

* Update barlow.py

* Update barlow.py

* Added vicreg for sync

* Update vicreg.py

* Update vicreg.py

* Update vicreg.py

* Update barlow.py

* randomresizedcrop edits

* Update barlow.py

* allow to customize loss reduction

* Update __init__.py

* Delete sampler_test.py

* Delete benchmark/supervised directory

* Update barlow.py

* added docstring on random_resized_crop

* Allow user to set normalization

* Update barlow.py

* Update barlow.py

* Update setup.py

* remove pipfile

* Delete Pipfile

* Delete Pipfile.lock

* Update cropping.py

* Update cropping.py

* Additive multiplicative changes

* Update simclr.py

* change additive, multiplicative

* Update barlow.py

* Update solarize.py

* Update barlow.py

* Update solarize.py

* Update barlow.py

* Update test_solarize.py

* Update test_solarize.py

* Update test_solarize.py

Co-authored-by: Owen Vallis <ovallis@google.com>

* Refactor test_basic to use TestCase to improve flaky test results.

* Fix Flake8 warnings.

* Freeze all batchnorm architecture layers.

We now freeze all BN layers when loading pre-trained weights in the
effnet and resnet50 architectures. Previously, we only froze the BN
layers if trainable was partial or frozen. When trainable was full, the
BN layers would be trainable as well and this led to suboptimal training
losses.
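
A sketch of the freezing logic (a keras.applications backbone stands in for the tfsim architectures here):

    import tensorflow as tf

    def freeze_batchnorm_layers(backbone: tf.keras.Model) -> None:
        # Keep BN layers frozen regardless of the overall trainable setting,
        # preserving the pre-trained running statistics.
        for layer in backbone.layers:
            if isinstance(layer, tf.keras.layers.BatchNormalization):
                layer.trainable = False

    backbone = tf.keras.applications.ResNet50(include_top=False, weights="imagenet")
    backbone.trainable = True           # "full" fine-tuning...
    freeze_batchnorm_layers(backbone)   # ...but BN stays frozen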

* Improve EfficientNetSim docstring and type hints (#254)

* Fix typos in docstring

* Remove reference to image augmentation

Image augmentation was previously removed, so purge it from the comment and docstring.

* Correct input image type annotation

* Fix #251. Check for model._index before calling Indexer methods.

The Indexer is core to a number of the Similarity model methods. Add
support for checking if the index exists and return a more informative
AttributeError if the index hasn't been created yet.
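
Conceptually, the guard looks something like this sketch (method and attribute names follow the commit description and may differ from the library):

    import tensorflow as tf

    class SimilarityModelSketch(tf.keras.Model):
        def _check_index(self, method_name: str) -> None:
            # Surface a clear error instead of an opaque attribute failure
            # when index methods are called before an index exists.
            if getattr(self, "_index", None) is None:
                raise AttributeError(
                    f"{method_name}() requires an index. Build one with "
                    "index() or load_index() first."
                )

        def lookup(self, x, k=5):
            self._check_index("lookup")
            return self._index.batch_lookup(x, k=k)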

* Set random seeds for tfrecord samplers test.

* All augmenters use the Tensor type from tensorflow_similarity.types.

* [nightly] Increase version to 0.16.0.dev56

* Fix Tensor type error in callbacks.

Unpacking the Lookup objects converts the python types to Tensors. This
can lead to Tensor type errors. This commit adds support for taking the
expected dtype of the model Tensors where possible.

We also fix a bug where the EvalCallback was not logging the split
metric names in the history.

* Update doc strings in color_jitter.

* Update the create index AttributeError text

* [nightly] Increase version to 0.16.0.dev57

* Update Notebook examples.

* Remove unneeded tf.function and register_keras_serializable decorators.

Subclasses of tf.keras.losses.Loss will trace all child functions and we
only need to register the subclassed loss to support deserialization.

* Simplify MetricEmbedding layer.

* Fix mypy type error in simsiam.

Convert all constants to tf.constant.

* Simplify the MetricEmbedding layer.

Subclass layers.Dense directly. This simplifies the layer and also fixes
function tracing during model save.
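
For instance (a sketch; the L2 normalization step is our assumption about the layer's behavior, not stated in the commit):

    import tensorflow as tf

    @tf.keras.utils.register_keras_serializable(package="Similarity")
    class MetricEmbeddingSketch(tf.keras.layers.Dense):
        # Subclassing Dense directly, instead of wrapping an inner Dense,
        # keeps serialization and function tracing on the standard path.
        def call(self, inputs):
            x = super().call(inputs)
            # L2-normalize the embedding (assumption for this sketch).
            return tf.math.l2_normalize(x, axis=-1)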

* Fix test_tfrecord_samplers tests.

* Update api documentation.

TODO: This just generated the new docs. We still need to go through and
clean up the documentation.

* Update doc string and api for MetricEmbedding layer.

* Bump to version 0.16

* Fix static type check error in memory store.

The np.savez functions expect array_like values but we were passing
List. Casting as np array should solve the issue.
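
For example, a minimal sketch:

    import numpy as np

    embeddings = [[0.1, 0.2], [0.3, 0.4]]  # python lists from the memory store
    np.savez("index.npz", embeddings=np.asarray(embeddings))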

* Fix effnet test for TF 2.9

* Fix TFRecordDatasetSampler so it returns the correct number of examples per batch.

* Bump dev version to 0.17.0.dev0.

* [nightly] Increase version to 0.17.0.dev1

* [nightly] Increase version to 0.17.0.dev2

* [nightly] Increase version to 0.17.0.dev3

* [nightly] Increase version to 0.17.0.dev4

* [nightly] Increase version to 0.17.0.dev5

* [nightly] Increase version to 0.17.0.dev6

* [nightly] Increase version to 0.17.0.dev7

* [nightly] Increase version to 0.17.0.dev8

* Add support for configuring and running benchmarks for supervised losses.

Add support for passing the same examples for both the query and indexed
set when calling retrieval_metrics. Added a new param to each
retrieval_metric that enables dropping the nearest neighbor. This is
useful when the query examples also exist in the indexed set, as each
query's nearest neighbor is then itself.

* Update benchmark README and max line length in .flake8

* Updates to the benchmark code

- Add clean_dir func to train file.

- Add support for creating precision@k and map@k eval metrics

- Fix typing issue in map@k. We now take the class counts type from the
query label dtype.

- Remove 1 count from class counts if we are dropping the first result.

- Refactor the make functions in train to use a Dict for all the
parameterized modules.

* [nightly] Increase version to 0.17.0.dev9

* Fixed typo in slice id

* black formatting

* black formatting

* Fixed typo to resolve #284

The function should be tf.concat instead of tf.constant, according to the description given above. This also resolves issue #284

* [nightly] Increase version to 0.17.0.dev10

* Update to match the API of the latest keras_cv version

Check out keras-team/keras-cv#738 for more information.  Once this is merged we're breaking backwards compatibility to have a much nicer API name.

* Add clip_at_r to support computing MAP@R from map_at_k module.

* Refactor benchmark components into separate modules.

* Update benchmark configs to use smaller 1e-6 learning rates.

Update train.py main to loop through the various embedding sizes in the
architectures.

* Fix tests for clip_at_r in map_at_k retrieval metric.

Refactor the clip at r changes to use map_fn.

* [nightly] Increase version to 0.17.0.dev11

* Update to benchmark configs and experiments with adding LR Schedule.

* Update benchmark README

* Black formatting for map_at_k

* Add requirements file for benchmarks

* Refactor benchmark code

- Support filtering the set of experiments using a regex pattern passed
in the args.
- Add typing
- Refactor the config parsing into a separate dataclass
- Refactor the cross product of all params to use itertools product
- Update requirements to add open-cv. This is needed for the caltech
birds dataset.
- Refactor the config to have a consistent use of the dict keys for
object creation and add a separate name field for identifying the
specific set of params associated with the object.

* Add user prompt to continue/exit benchmark run after run_grps are listed.

Update README to include example of filter pattern.

* make_eval_data now returns a new list of augmented examples instead of
updating the original list.

Remove return when user input == Y

* Set soft_margin default to True.

The default value was previously set to False but the doc string stated
the default value as True.

* Set nmslib to brute force search and remove agg results.

- Brute force search removes any noise introduced by an approximate NN search.
- Removing the agg results as we will provide a utility for aggregating
the result folders from each experiment.

* Update loss ids in the losses component.

- Removed the '_loss' suffix from the loss ids as it was redundant.
- Add xbm, triplet loss, and soft nn loss to the losses config section.

* Google Summer of Code (#286)

* Added multiple negatives ranking loss (a sketch follows this PR's notes)

* Added multimodal example

* Added support for multiple distances in mnrl loss

Added support for different distances in multiple negatives ranking loss

* Added link to multimodal example notebook

* black formatting

* Using numerically stable implementation of logsumexp

* Delete pyproject.toml

* Updated pyproject.toml

* Black formatting in multinegrank_loss

* Updated pip install url to dev branch

Co-authored-by: Owen Vallis <ovallis@google.com>
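
The multiple negatives ranking loss can be sketched as follows (an illustrative cosine-similarity version with in-batch negatives, not the library's exact signature):

    import tensorflow as tf

    def multiple_negatives_ranking_loss(anchor_emb, positive_emb, scale=20.0):
        # Similarity between every anchor and every candidate; the matching
        # pair sits on the diagonal, all other in-batch pairs act as negatives.
        a = tf.math.l2_normalize(anchor_emb, axis=1)
        p = tf.math.l2_normalize(positive_emb, axis=1)
        logits = scale * tf.matmul(a, p, transpose_b=True)
        labels = tf.range(tf.shape(logits)[0])
        return tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
        )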

* resolve #299 Fix WarmupCosineDecay.

* Previous version scaled the cosine decay by a linear warmup value. So the max value was max_lr*0.5*(1+cos(warmup_steps/total_steps*pi))
* New version has a linear warmup and then begins the cosine decay from cos(0.0) so the max value is now max_lr.
* Previous version accepted a tensor of values; this is not needed. Simplified to accept a single scalar step value.
* Updated tests to be consistent with the keras LearningRateSchedule tests.
* Renamed class from WarmUpCosine to WarmupCosineDecay. This is more consistent with the Keras LearningRateSchedules.
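
A sketch of the corrected schedule (argument names illustrative): linear warmup to max_lr, then cosine decay starting from cos(0.0) so the peak is exactly max_lr.

    import math
    import tensorflow as tf

    class WarmupCosineDecaySketch(tf.keras.optimizers.schedules.LearningRateSchedule):
        def __init__(self, max_lr, total_steps, warmup_steps):
            super().__init__()
            self.max_lr = max_lr
            self.total_steps = total_steps
            self.warmup_steps = warmup_steps

        def __call__(self, step):
            step = tf.cast(step, tf.float32)
            warmup = self.max_lr * step / self.warmup_steps
            progress = (step - self.warmup_steps) / (self.total_steps - self.warmup_steps)
            decay = self.max_lr * 0.5 * (1.0 + tf.cos(math.pi * progress))
            # Linear ramp up to max_lr, then cosine decay from cos(0.0).
            return tf.where(step < self.warmup_steps, warmup, decay)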

* [nightly] Increase version to 0.17.0.dev12

* Update pn_loss default params and doc string formatting.

* Make soft_margin the default. The doc string stated this was the default but the param was set to False.
* Make the default margin 0.1. The previous value was 1.0 which produced sub-optimal results when using cosine distance.
* Reformat the doc strings to align with the google docstring style.
* Add support for the pep585 annotations. Removed Callable and Union.

* Update triplet_loss default params and doc string formatting.
* Make soft_margin the default. The doc string stated this was the default but the param was set to False.
* Make the default margin 0.1. The previous value was 1.0 which produced sub-optimal results when using cosine distance.
* Reformat the doc strings to align with the google docstring style.
* Add support for the pep585 annotations. Removed Callable and Union.

* Update train to use new WarmupCosineDecay.

* Updates to config params for both prod and single configs
* Updates to component/losses to use the new defaults

* Benchmark updates and bug fixes

* calls to model.predict() now convert input to tensor using the CPU context. This avoids a mem leak when calling predict in a loop.
* expose all NMSLib params in NMSLibSearch. This enables custom parametrization of the nmslib indexes.
* Update indexer to save and load Search objects from configs. Saving now works when passing a Search object to model.compile()
* Update the benchmark configs to use the unique name as the key and provide a 'component' field for identifying the component to build.
* Manually delete and clear all objects at the end of each benchmark loop to try and avoid memory leaks. However, we still leak mem in tf frameworks/ops.py
* Make flake8 ignore E203 (whitespace before ":"). This was in conflict with the black formatter.

* Enable manual trigger for testing workflow.

* Refactor benchmark datasets to be a class.
* Now supports creating custom Dataset objects to load other sources.  Currently supports loading TFDS data.
* Add support for defining hyper_parameter versions of parameters to support KerasTuner search
* Split the single train script into create_dataset, hyper_parameter_search and train
* Update configuration files to remove the benchmark prefix.

* Add support for retrieval metrics in callback.

* Add support for R_precision and refactor map@k

* map_at_k is now a subclass of precision_at_k. Reduces code duplication.
* update names for precision_at_k and map_at_k when clip_at_r is set to
  true. Names no longer include an @k suffix but instead return either
  R_Precision or map@R.

* distances now return their params when get_config() is called.

* Fix info string for mem samplers.

* Memory samplers now correctly report the number of augmentations in the sampler object.

* Fix mypy errors from newest mypy version.

* Migrate to support pep585 where possible
* Fix several small typing errors
* Provide typing for 100% of modules

* [nightly] Increase version to 0.17.0.dev13

* GEM layers create a general pooling layer in the init, but we didn't
pass the kwargs. This means the general pooling layer didn't have the
dtype policy. This caused the GEM layers to fail when using a
mixed_float dtype policy as the general pooling layer returns float32
and the GEM dtype policy is float16.

The fix is to pass all kwargs onto the general pooling layer.
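
A sketch of the fix (GeMPooling2D is an illustrative stand-in; the point is forwarding the dtype policy so inner and outer layers compute in the same precision):

    import tensorflow as tf

    class GeMPooling2D(tf.keras.layers.Layer):
        """Illustrative GEM pooling: mean of x**p, then the 1/p root."""

        def __init__(self, p=3.0, **kwargs):
            super().__init__(**kwargs)
            self.p = p
            # Forward the dtype policy so the inner pooling layer returns
            # float16 under a mixed_float16 policy instead of float32.
            self.pool = tf.keras.layers.GlobalAveragePooling2D(dtype=self.dtype_policy)

        def call(self, x):
            x = tf.pow(tf.maximum(x, 1e-6), self.p)
            return tf.pow(self.pool(x), 1.0 / self.p)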

* Patch bump

* Cap the TF version at 2.9 for the current master branch.

* Resolves Error while using projector (#301)

* Resolves Error while using projector 

Since the new major release of Bokeh version 3.0, the `plot_width` (and `plot_height`) properties have been removed. These have been replaced by standard `width` and `height` for all layout-able models according to [Migration Guide](https://github.com/bokeh/bokeh/wiki/Migration-Guides#300). The update fixes the error generated by the `projector`.

* backward compatible

This update makes `projector` backward compatible with `bokeh`
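
The version check can be sketched as (assuming the projector builds bokeh figures):

    import bokeh

    def figure_size_kwargs(width, height):
        # Bokeh 3.0 renamed plot_width/plot_height to width/height.
        major = int(bokeh.__version__.split(".")[0])
        if major >= 3:
            return {"width": width, "height": height}
        return {"plot_width": width, "plot_height": height}

    # fig = bokeh.plotting.figure(**figure_size_kwargs(600, 400))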

* Apply formatting to projector changes to fix warnings from Black and ISort.

* [nightly] Increase version to 0.17.0.dev14

* Model Save Updates (#305)

* Update github workflow test matrix to include py 3.10 and tf 2.7 and 2.10

* Update github workflow to use py 3.11 and tensorflow 2.11.

* Fix testing error in test_schedules. import from keras should now be import from tensorflow.keras.

* The optimizer serialize and deserialize are under schedules in TF instead of the learning_rate_schedules module from keras.

* Turns out the workflow Python version must be < 3.11

* Python 3.10 requires TF >= 2.8.

* Fix and simplify Contrastive Model save and load.

* The old save and load manually loaded each inner model. This was
  required because we didn't build the outer model graph.
* The new solution uses a factory function to infer the input shape and
  then connect all the inner models and pass the input and output to the
  contrastive model. This is enough for the standard model save to work.
* Also silenced the INFO logs from nmslib.
* Small formatting and cleanup in other files.
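
Schematically, the factory approach looks like this (hypothetical names; in the library the returned object is the ContrastiveModel subclass rather than a plain tf.keras.Model):

    import tensorflow as tf

    def create_contrastive_model_sketch(backbone, projector, predictor, input_shape):
        # Building the outer functional graph lets the standard Keras
        # model.save()/load_model() path serialize the composite model
        # without manually saving each inner model.
        inputs = tf.keras.layers.Input(shape=input_shape)
        embedding = backbone(inputs)
        projection = projector(embedding)
        prediction = predictor(projection)
        return tf.keras.Model(inputs=inputs, outputs=[projection, prediction])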

* Remove extra print in EvalCallback

* Fix order of contrastive model metrics so that losses come first.

* Update unsupervised notebook to use new save and create functions.

* Fix formatting in contrastive model module

* [nightly] Increase version to 0.17.0.dev15

* Add MultiShotFileSampler (#307)

* Add MultiShotFileSampler

* Refactor MultiShotMemorySampler to use load_example_fn

* Update MultiShotFileSampler to inherit from MultiShotMemorySampler

* Fix typing errors in file_samplers

* Loss tests refactor (#308)

* Refactor the tests for the losses.

* Use the tf.test.TestCase utilities
* Create utils for perfect and bad embedding examples.
* Refactor Triplet and PN Loss to have a single margin param now with
  float | None. None will now set the soft_margin.
* Replace basic tf.logsumexp with the TF sim stable logsumexp in the
  soft margin.
* Fix bug in semi-hard mining when there are no valid negatives >
  max positive. Previously this defaulted to selecting the example at
  idx 0. We now take the negative that is closest to the maximal
  positive without going over, i.e., max(d(a,n)) <= max(d(a,p)); see
  the sketch after this list.
* Refactor the triplet loss tests.
* Create a losses dir under tests.
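
A sketch of the semi-hard fallback (illustrative helper; dists is an [n, n] pairwise distance matrix and the boolean masks mark positive/negative pairs per anchor row):

    import tensorflow as tf

    def semi_hard_negative(dists, pos_mask, neg_mask):
        big = tf.constant(1e9, dtype=dists.dtype)
        max_pos = tf.reduce_max(tf.where(pos_mask, dists, -big), axis=1, keepdims=True)
        # Semi-hard: the smallest negative distance greater than the max positive.
        semi = tf.where(neg_mask & (dists > max_pos), dists, big)
        semi_min = tf.reduce_min(semi, axis=1)
        has_semi = semi_min < big
        # Fallback: the negative closest to the maximal positive without
        # going over, rather than defaulting to idx 0.
        fallback = tf.reduce_max(tf.where(neg_mask & (dists <= max_pos), dists, -big), axis=1)
        return tf.where(has_semi, semi_min, fallback)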

* Type empty_mask as BoolTensor to fix mypy error.

We create a matrix of zeros as dtype bool and vector of ones as dtype
bool, but mypy doesn't see these as BoolTensor type. This commit adds an
explicit BoolTensor type to the empty_mask to fix this.

* Fix formatting errors

* fix formatting errors

* [nightly] Increase version to 0.17.0.dev16

* Add the benchmark datasets component that was ignored due to the datasets/ filter in .gitignore

* Float16 (#310)

* Fix casting to use the default floatx where possible to avoid type
errors when training using mixed precision or float16.

* Update tests for supporting float16

* Remove float dtype parameterization of readme tests. They were too slow.

* Fix casting error when passing constant scalar. Set policy in multihead test to ensure we reset the policy to a good state.

* Remove duplicate long running test. This should speed up test by ~3min.

* [nightly] Increase version to 0.17.0.dev17

* Remove references to outputs in contrastive model. (#311)

* Remove references to outputs in contrastive model.

We use the inputs and outputs to support saving the contrastive model
using the Keras API; however, we override train and test steps as well as
predict. This means we don't currently support multiple output heads on
the embedding output. This PR removes all references to multi-headed
outputs and explicitly sets the indexer to use the predictor output.

* Provide default contrastive projector and predictor.

Users had to provide their own MLP models for the projector and
predictor. This required understanding more about the underlying
algorithms. This change now adds default projector and predictor models
based on the original papers.
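
The defaults follow the spirit of the SimSiam-style MLP heads; a rough sketch with illustrative dimensions:

    import tensorflow as tf

    def default_projector(input_dim, dim=2048):
        # Projection MLP with batch norm, as in the original papers.
        return tf.keras.Sequential([
            tf.keras.layers.Dense(dim, use_bias=False, input_shape=(input_dim,)),
            tf.keras.layers.BatchNormalization(),
            tf.keras.layers.Activation("relu"),
            tf.keras.layers.Dense(dim, use_bias=False),
            tf.keras.layers.BatchNormalization(),
        ])

    def default_predictor(dim=2048, hidden_dim=512):
        # Bottleneck prediction MLP, as in SimSiam's prediction head.
        return tf.keras.Sequential([
            tf.keras.layers.Dense(hidden_dim, use_bias=False, input_shape=(dim,)),
            tf.keras.layers.BatchNormalization(),
            tf.keras.layers.Activation("relu"),
            tf.keras.layers.Dense(dim),
        ])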

* Update unsupervised colab.

* Comment out projector and predictor create model functions. We now
  automatically create the MLP models for users, but the commented code
  is left in case the user wants to customize them.
* Verify that the model trains and reloads.
* Loss and performance is slightly better than before.
* Update the create_contrastive_model function to pass a list of outputs
  to better track the outputs. The model still overrides the predict
  function though as we need to apply the L2 Norm at the output.

* Fix mypy error.

* Update output var name and use epsilon constant.

* [nightly] Increase version to 0.17.0.dev18

* Update release notes for 0.17.x

* Update example notebooks.

* Add patch to support passing custom NMSLibSearch objects.

* Add temporary fix that passes the search object config to the
  make_search function in order to support resetting the search index.
* NOTE: This is only temporary and a more general solution will be added
  in the new backend updates to search and store.
* Updated the supervised visualization notebook to demo using the custom
  NMSLibSearch object.
* Added warnings about the reset issues with custom objects in indexer.

* Remove the old benchmark dataset file.

* [nightly] Increase version to 0.17.0.dev19

* Update CLIP notebook and include search example.

* Remove benchmark code from release

* Set release version to 0.17.0

---------

Co-authored-by: Github Actions Bot <>
Co-authored-by: Christoffer Hjort <Christoffer.Hjort1995@gmail.com>
Co-authored-by: dewball345 <abhiraamkumar@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Genrry Hernandez <genrryhernandez@gmail.com>
Co-authored-by: Abhishar Sinha <24841841+abhisharsinha@users.noreply.github.com>
Co-authored-by: Emil Larsson <emla2805@users.noreply.github.com>
Co-authored-by: Abhishar Sinha <abhisharsinha@gmail.com>
Co-authored-by: Luke Wood <LukeWood@users.noreply.github.com>
Co-authored-by: Zoheb Abai <zohebabai@gmail.com>
Co-authored-by: Mohammad Amin Haghpanah <mdan.hagh@gmail.com>
11 people committed Mar 19, 2023
1 parent ddd60ec commit 17ec76d
Showing 110 changed files with 5,103 additions and 3,163 deletions.
4 changes: 3 additions & 1 deletion .flake8
@@ -1,2 +1,4 @@
 [flake8]
-exclude = tmp.py, tests/
+ignore = E203
+exclude = tmp.py, tests/
+max-line-length = 120
19 changes: 16 additions & 3 deletions .github/workflows/test.yml
@@ -13,7 +13,16 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: [3.7, 3.8, 3.9]
+        include:
+          - python-version: '3.7'
+            tf-version: '2.7'
+          - python-version: '3.7'
+            tf-version: '2.11'
+          - python-version: '3.10'
+            # Python 3.10 only supports TF >= 2.8
+            tf-version: '2.8'
+          - python-version: '3.10'
+            tf-version: '2.11'

     steps:
       - uses: actions/checkout@v2
@@ -26,9 +35,13 @@
           python -m pip install --upgrade pip
           pip install coveralls
-      - name: Install package
+      - name: Install dev packages
         run: |
-          pip install ".[tensorflow,dev]"
+          pip install ".[dev]"
+      - name: Install TF package
+        run: |
+          pip install tensorflow==${{ matrix.tf-version }}
       - name: Lint with flake8
         run: |
1 change: 0 additions & 1 deletion .gitignore
@@ -11,7 +11,6 @@ release.sh
 benchmark/supervised/datasets/
 benchmark/supervised/models/
-datasets/
 models/

 # Byte-compiled / optimized / DLL files
 __pycache__/
3 changes: 2 additions & 1 deletion examples/README.md
@@ -6,4 +6,5 @@
 | [Hello World](./supervised_hello_world.ipynb) | Supervised | Train and use an image similarity model to find similar looking MNIST digits |
 | [Self-Supervised Learning](./unsupervised_hello_world.ipynb) | Unsupervised | Train an image model using the SimSiam based self-supervised contrastive learning. |
 | [visualization](./supervised/visualization.ipynb) | Supervised | Train an image similarity model on the Stanford Dogs dataset using Evaluation Callbacks and the interactive visualizer |
-| [Sampler IO Cookbook](./sampler_io_cookbook.ipynb) | Utils | Examples demonstrating how to use the various in memory batch samplers.
+| [Sampler IO Cookbook](./sampler_io_cookbook.ipynb) | Utils | Examples demonstrating how to use the various in memory batch samplers. |
+| [CLIP finetuning](./multimodal_example.ipynb) | Supervised | Finetune CLIP on atric-dataset using multiple negatives ranking loss.
