
Add cholesky and gradients #1492

Open · wants to merge 2 commits into master

Conversation


@ethanluoyc commented Jan 10, 2019


@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.



@DirkToewe (Contributor) commented Jan 10, 2019

@ethanluoyc Have you looked at PR#1356? It already contains a Cholesky decomposition. Your implementation is a lot cleaner and uses TFJS methods as building blocks, which is a big upside, but as of yet PR#1356 seems to be significantly faster. The benchmark was run on the CPU backend, since PR#1492 runs faster there than on WebGL on my machine. Feel free to double-check the benchmark, since I'm known to make mistakes and mix things up.

P.S.: I hope it's okay that I included your code in the benchmark. I specifically mentioned the source and exempted it from the license.
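
For concreteness, here is a minimal sketch of what a Cholesky built from TFJS ops can look like: an unblocked, column-by-column variant, illustrative only and not the code from either PR.

```ts
import * as tf from '@tensorflow/tfjs';

// Unblocked, column-by-column Cholesky built from TFJS ops.
// Assumes `a` is a symmetric positive-definite matrix; returns the
// lower-triangular factor L with A = L * L^T.
function cholesky(a: tf.Tensor2D): tf.Tensor2D {
  return tf.tidy(() => {
    const n = a.shape[0];
    const cols: tf.Tensor2D[] = [];
    for (let j = 0; j < n; ++j) {
      // Start from the j-th column of A at and below the diagonal: a[j:, j].
      let col = a.slice([j, j], [n - j, 1]);
      if (j > 0) {
        // Subtract the contribution of the columns computed so far:
        // col -= L[j:, :j] @ L[j, :j]^T
        const lSoFar = tf.concat(cols, 1);                // [n, j]
        const lBlock = lSoFar.slice([j, 0], [n - j, j]);  // L[j:, :j]
        const lRow = lSoFar.slice([j, 0], [1, j]);        // L[j, :j]
        col = col.sub(lBlock.matMul(lRow, false, true));
      }
      // Dividing the whole column by sqrt(pivot) puts sqrt(pivot) on the
      // diagonal and scales the entries below it.
      const pivot = col.slice([0, 0], [1, 1]).sqrt();
      col = col.div(pivot);
      // Zero-pad above the diagonal to form a full [n, 1] column of L.
      cols.push(tf.pad(col, [[j, 0], [0, 0]]));
    }
    return tf.concat(cols, 1) as tf.Tensor2D;
  });
}
```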

@ethanluoyc (Author) commented Jan 10, 2019

Hi @DirkToewe,

Thanks for letting me know about your PR! I checked the issues page but didn't see any mention of Cholesky, so I thought no one had tackled this yet.

The version I currently have here is a WIP. One downside is indeed memory usage; mine has not been optimized for that yet. Maybe we can work together on implementing Cholesky and merge it into the codebase.

BTW, do you have the gradients implemented? I have a preliminary version which I think is correct (though more testing is still needed). Another thing I haven't implemented is the blocked version of the algorithm, and the same goes for the gradients. I need the differentiable version because I use it as part of my models (Gaussian processes).

Updates
@DirkToewe I took a closer look at your benchmarks. You appear to be using the algorithm from Wikipedia? I am using the one described in https://homepages.inf.ed.ac.uk/imurray2/pub/16choldiff/choldiff.pdf, which in theory can be made faster by blocking and in-place updates. I currently do not know the best way to implement these idiomatically in tfjs, but maybe someone from the tfjs team can advise?
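
For reference, the reverse-mode rule from that note can be written as follows (my transcription, so double-check against the paper; Φ keeps the lower triangle and halves the diagonal):

```latex
% Reverse-mode rule for L = chol(A), with upstream gradient \bar{L}:
\bar{A} \;=\; \tfrac{1}{2}\bigl(S + S^{\mathsf{T}}\bigr),
\qquad
S \;=\; L^{-\mathsf{T}}\,\Phi\!\bigl(L^{\mathsf{T}}\bar{L}\bigr)\,L^{-1},
\qquad
\Phi(M)_{ij} \;=\;
\begin{cases}
  M_{ij},             & i > j,\\
  \tfrac{1}{2}M_{ii}, & i = j,\\
  0,                  & i < j.
\end{cases}
```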

@googlebot

CLAs look good, thanks!

@DirkToewe (Contributor)

Yes, gradients are implemented symbolically, see here. But it requires quite a few new operations (bandPart, triangularSolve, setDiag, ...) to be introduced, which is why the PR became so massive.
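
Roughly, in terms of those ops the gradient looks like the sketch below. This is written against an assumed `triangularSolve(a, b, lower, adjoint)` signature (solve a·x = b, optionally with aᵀ), which is not in tfjs core, so treat it as pseudocode rather than the PR's actual code.

```ts
import * as tf from '@tensorflow/tfjs';

// Assumed op, not part of tfjs core: solves a.x = b for a triangular `a`,
// using a^T instead of a when `adjoint` is true.
declare function triangularSolve(
    a: tf.Tensor2D, b: tf.Tensor2D, lower: boolean,
    adjoint: boolean): tf.Tensor2D;

// dA for L = chol(A), given the upstream gradient dL.
function cholGrad(l: tf.Tensor2D, dl: tf.Tensor2D): tf.Tensor2D {
  return tf.tidy(() => {
    const n = l.shape[0];
    // Phi(L^T dL): keep the lower triangle, halve the diagonal.
    const m = l.matMul(dl, true, false);
    const lowerM = tf.linalg.bandPart(m, -1, 0);
    const phi = lowerM.sub(lowerM.mul(tf.eye(n)).mul(0.5));
    // S = L^{-T} . Phi . L^{-1} via two triangular solves.
    const y = triangularSolve(l, phi, true, true);           // L^{-T} Phi
    const s = triangularSolve(l, y.transpose(), true, true)  // (Y L^{-1})^T
                  .transpose();
    // Symmetrize: dA = (S + S^T) / 2.
    return s.add(s.transpose()).mul(0.5) as tf.Tensor2D;
  });
}
```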

@DirkToewe (Contributor)

> Updates
> @DirkToewe I took a closer look at your benchmarks. You appear to be using the algorithm from Wikipedia? I am using the one described in https://homepages.inf.ed.ac.uk/imurray2/pub/16choldiff/choldiff.pdf, which in theory can be made faster by blocking and in-place updates. I currently do not know the best way to implement these idiomatically in tfjs, but maybe someone from the tfjs team can advise?

That's correct. It was the implementation of choice for me because:

  • it is simple, and
  • it calculates elements in 64-bit before rounding to float32
    (in NDJS, Kahan summation was also used to improve float64 precision; see the sketch below).
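
For reference, compensated (Kahan) summation looks like this; an illustrative sketch, not the NDJS code:

```ts
// Kahan (compensated) summation: accumulates a running compensation term
// to recover low-order bits lost when adding small values to a large sum.
function kahanSum(values: Float64Array): number {
  let sum = 0;
  let c = 0;  // running compensation for lost low-order bits
  for (const v of values) {
    const y = v - c;    // corrected addend
    const t = sum + y;  // low-order bits of y may be lost here
    c = (t - sum) - y;  // recover what was lost
    sum = t;
  }
  return sum;
}
```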

But there's no doubt there's room for performance improvements. If you want to do blocked operations, you're going to have to work with nested loops. As an example of that, take a look at the CPU MatMul implementation.
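
The loop structure there is roughly the following tiled pattern (a sketch; the block size of 32 is illustrative, not the value tfjs uses):

```ts
// Blocked (tiled) matrix multiply over flat row-major buffers. The outer
// three loops walk 32x32 tiles so that each tile stays cache-resident
// while the inner loops do the actual multiply-accumulate.
const blockSize = 32;

function blockedMatMul(
    a: Float32Array, b: Float32Array, n: number): Float32Array {
  const out = new Float32Array(n * n);
  for (let i0 = 0; i0 < n; i0 += blockSize) {
    for (let k0 = 0; k0 < n; k0 += blockSize) {
      for (let j0 = 0; j0 < n; j0 += blockSize) {
        const iMax = Math.min(i0 + blockSize, n);
        const kMax = Math.min(k0 + blockSize, n);
        const jMax = Math.min(j0 + blockSize, n);
        for (let i = i0; i < iMax; ++i) {
          for (let k = k0; k < kMax; ++k) {
            const aik = a[i * n + k];
            for (let j = j0; j < jMax; ++j) {
              out[i * n + j] += aik * b[k * n + j];
            }
          }
        }
      }
    }
  }
  return out;
}
```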

But a slight word of warning: in my experience, micro-optimizations will not yield as much performance as you may hope for.
