Add KL divergence terms for Latent SDEs #402

lockwo · 2024-04-17T21:05:20Z

Addresses #401. Revives #104. Based on that PR, I made the minimal changes needed to bring it up to the current version (e.g. taking callables instead of ODE terms, since we can't make these .vf because _broadcast_and_upcast requires that aug_y and drift(aug_y) have the same shape, which they don't).

lockwo · 2024-04-17T21:06:31Z

Before going further (there is a lot I am going to improve/polish), I wanted to check your thoughts on the general approach of making the KL divergence a set of terms and exposing a function that converts the user's problem. An alternative could be something like torchsde, where it's part of the integration method, i.e. the user flags it at integration time.

patrick-kidger

On making these terms: I don't have super strong feelings, but compared to the original PR we have now more clearly defined what a term is in Diffrax, and I think there are other points in the design space.

To be precise: given a diffeq of the form

dy = f(y, z) da + g(y, z) db
dz = h(y, z) da + k(y, z) db

then this would be represented in Diffrax as

terms = (
    MultiTerm(f, g),
    MultiTerm(h, k),
)

In general: everything inside a MultiTerm(...) is all applied to the same dfoo. For example the SDE-specific solvers consume a MultiTerm[ODETerm, AbstractTerm], for the drift and diffusion.
Meanwhile the PyTree structure of terms themselves corresponds to different dfoo and dbar. For example semi-implicit Euler takes a pair of (AbstractTerm, AbstractTerm), corresponding to the two components that are being evolved.
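To make the "pair of terms, pair of components" idea concrete, here is a minimal plain-Python sketch of a semi-implicit (symplectic) Euler step. It is an analogy only, not Diffrax code: the function `semi_implicit_euler` and the vector fields `velocity` and `force` are hypothetical names chosen for illustration. The point is that the tuple of terms lines up one-to-one with the tuple of state components being evolved.

```python
import math


def semi_implicit_euler(terms, y0, t0, t1, n_steps):
    """Symplectic Euler for a pair of vector fields (illustrative sketch).

    `terms` is a pair (f, g) and `y0` is a pair (x0, v0), describing
        dx = f(t, v) dt
        dv = g(t, x) dt
    so each entry of `terms` drives exactly one entry of the state.
    """
    f, g = terms
    x, v = y0
    dt = (t1 - t0) / n_steps
    t = t0
    for _ in range(n_steps):
        x = x + f(t, v) * dt  # first component, using the old v
        v = v + g(t, x) * dt  # second component, using the updated x
        t += dt
    return x, v


def velocity(t, v):
    # dx/dt = v
    return v


def force(t, x):
    # dv/dt = -x (unit harmonic oscillator)
    return -x


x1, v1 = semi_implicit_euler((velocity, force), (1.0, 0.0), 0.0, 10.0, 1000)
energy = 0.5 * (x1**2 + v1**2)  # stays close to the initial value 0.5
x_exact = math.cos(10.0)        # exact solution for x(0)=1, v(0)=0
```

Swapping the order of the two entries in `terms` would change which component each vector field drives, which is why the structure of the tuple carries meaning.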

In this case there is an argument that the extra KL-divergence term should really correspond to a new dfoo, and that as such the correct thing to do is to instead replace terms with (terms, kl_term), and then provide a wrapper solver which understands this alternative term structure.
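As a rough illustration of treating the KL divergence as an extra evolved component, the sketch below runs Euler-Maruyama on the pair (y, kl) for a scalar SDE in plain Python. It assumes the usual latent-SDE setup in which the posterior and prior share a diffusion and differ only in drift, so the per-path KL integrand is 0.5 * ((posterior drift - prior drift) / diffusion)^2. All names here (`euler_maruyama_with_kl`, `post_drift`, `prior_drift`, `diffusion`) are hypothetical, and this is not the implementation in this PR.

```python
import math
import random


def euler_maruyama_with_kl(post_drift, prior_drift, diffusion,
                           y0, t0, t1, n_steps, seed=0):
    """Integrate the posterior SDE and accumulate a per-path KL estimate.

    The state is the pair (y, kl):
        dy  = post_drift(t, y) dt + diffusion(t, y) dW
        dkl = 0.5 * ((post_drift - prior_drift) / diffusion)**2 dt
    The kl component has no noise of its own; it is driven by dt only.
    """
    rng = random.Random(seed)
    dt = (t1 - t0) / n_steps
    t, y, kl = t0, y0, 0.0
    for _ in range(n_steps):
        f = post_drift(t, y)
        h = prior_drift(t, y)
        g = diffusion(t, y)
        u = (f - h) / g                      # drift mismatch in noise units
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment
        kl += 0.5 * u * u * dt               # evaluated at the start of the step
        y += f * dt + g * dw
        t += dt
    return y, kl


def ou_drift(t, y):
    # Hypothetical posterior drift: mean-reverting towards zero.
    return -y


def zero_drift(t, y):
    # Hypothetical prior drift: none.
    return 0.0


def const_diffusion(t, y):
    # Shared diffusion for posterior and prior.
    return 0.5


y1, kl = euler_maruyama_with_kl(ou_drift, zero_drift, const_diffusion,
                                y0=1.0, t0=0.0, t1=1.0, n_steps=1000)
# With identical drifts the integrand vanishes, so the KL estimate is zero.
_, kl_same = euler_maruyama_with_kl(ou_drift, ou_drift, const_diffusion,
                                    y0=1.0, t0=0.0, t1=1.0, n_steps=1000)
```

Averaging `kl` over many sampled paths would give a Monte Carlo estimate of the path-space KL term; a wrapper solver can play the same role by carrying the extra component alongside whatever step the wrapped solver takes for `y`.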

diffrax/_kl_term.py

patrick-kidger · 2024-04-21T20:25:32Z

On the topic of Lineax: indeed, this should definitely make handling PyTrees much easier.

lockwo · 2024-04-24T01:10:04Z

I think your idea makes a lot of sense, and I made a fair amount of progress on the solver wrapper approach.

lockwo · 2024-04-27T06:30:44Z

Ok, I polished things up. I went with a sort of hybrid approach where the user specifies the SDEs as you described, then just wraps a solver and everything works smoothly. However, I did create internal terms so that an arbitrary solver can integrate through the KL computation. That was the best way I could think of to do it, and they are completely hidden from the user. I also added the example (it can be modified to add more text or to remove pmap, although I do like having an example with distribution, especially since it's painfully slow without it), added a test, and updated the docs. Taking it off draft now since it's a real PR.

frankschae · 2024-05-08T21:57:34Z

This is a very cool feature/example! It looks like one needs to specify

levy_area=diffrax.BrownianIncrement

in diffrax.UnsafeBrownianPath

lockwo · 2024-05-08T23:54:06Z

Thanks @frankschae, good catch!

lockwo · 2024-05-10T18:30:32Z

The test failures are all just the safe map 0.4.27 stuff

kl

98c6922

lockwo mentioned this pull request Apr 17, 2024

Best way to approach KL divergence #401

Open

add more training lengths

3b4e07c

patrick-kidger reviewed Apr 21, 2024


lockwo changed the base branch from main to dev April 23, 2024 19:02

Merge branch 'patrick-kidger:main' into main

6200412

lockwo added 6 commits April 24, 2024 00:22

KL solver wrapper draft

085b68d

add saves

f603f38

minor fixes, more to come

97c1fc4

intermediate work

c9e1573

finalization for review

f21f26d

forgot saveat

2b56424

lockwo marked this pull request as ready for review April 27, 2024 06:26

_control term isn't recognized for some reason

633afbd

lockwo requested a review from patrick-kidger April 27, 2024 06:30

lockwo added 4 commits April 27, 2024 12:35

Merge branch 'dev'

6e34acd

fix test

7ea3127

3.9 fix

35af3e9

3.9 fix2

f9e40e1

lockwo force-pushed the main branch from 7e3a53f to f9e40e1 Compare May 8, 2024 23:56

a

1865057
