
Implement CP Decomposition for tensorflow.sparse.SparseTensor #168

Open

amanj120 wants to merge 10 commits into main
Conversation

amanj120

Aims to solve issue #167

…ly, running the command tensorly.contrib.sparse.tests.test_decomposition.test_tf_sparse_cpd() should perform a quick test of the cp decomposition function
commit 2

fixed the testing framework to be faster; now, after importing tensorly, running the command tensorly.contrib.sparse.tests.test_decomposition.test_tf_sparse_cpd() should perform a quick test of the cp decomposition function

All the necessary functions for CP decomposition on tf.sparse.SparseTensor have been added. I need some help with rebasing this code to fit the tensorly guidelines.
@amanj120
Author

I am not sure how to refactor the code to work with the tensorly programming guidelines/model. Any help would be appreciated.

Running

from tensorly.contrib.sparse.tests.test_decomposition import test_tf_sparse_cpd
test_tf_sparse_cpd()

produces the correct output

@JeanKossaifi
Member

Hi @amanj120,

Thanks for the PR, I will try to go through it as soon as I have some time. For reference, you want to define a new backend for the TensorFlow Sparse tensors, as is done for the PyData Sparse backend.

The functions from the backend are dynamically dispatched: every time you call a function from the backend, an intermediate decorator checks the current backend and calls that backend's implementation. When you use the sparse context, the functions you define in the sparse backend are called instead of the ones from the dense backend.
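A minimal sketch of the dispatch mechanism being described, with made-up names (tensorly's actual implementation lives in its backend module and differs in the details):

_BACKENDS = {}                 # hypothetical registry, e.g. {'numpy': NumpyBackend(), ...}
_current_backend = ['numpy']   # hypothetical mutable holder for the active backend name

def dispatch(name):
    # Thin wrapper that looks up the active backend at call time and forwards to it.
    def wrapped(*args, **kwargs):
        backend = _BACKENDS[_current_backend[0]]
        return getattr(backend, name)(*args, **kwargs)
    return wrapped

# tl.values / tl.indices resolve to whichever backend is active when they are called,
# so entering the sparse context swaps in the sparse implementations.
values = dispatch('values')
indices = dispatch('indices')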



def sparse_mttkrp(tensor, factors, n, rank, dims):
    values = tl.values(tensor)
Author

I think all sparse backends should have a "values" and "indices" method
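A minimal sketch of what those two accessors could look like for tf.sparse.SparseTensor (an assumption about the eventual backend methods, not the PR's exact code):

import tensorflow as tf

def values(tensor):
    # Non-zero entries of a tf.sparse.SparseTensor, shape (nnz,)
    return tensor.values

def indices(tensor):
    # COO coordinates of the non-zero entries, shape (nnz, ndim), dtype int64
    return tensor.indices

# Example
st = tf.sparse.SparseTensor(indices=[[0, 1], [2, 3]], values=[1.0, 2.0], dense_shape=[3, 4])
print(values(st))   # -> [1. 2.]
print(indices(st))  # -> [[0 1], [2 3]]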

@coveralls

coveralls commented May 4, 2020

Coverage Status

Coverage decreased (-3.04%) to 87.531% when pulling 9781086 on amanj120:master into 26189bf on tensorly:master.

@JeanKossaifi
Member

Hi, sorry for the delay -- in contrib.sparse all the backends are sparse, so you only need tensorflow, not tensorflow.sparse. In other words, to use sparse tensors with TensorFlow, you just have to set the backend to tensorflow and use tensorly.contrib.sparse.

Ideally all should then work transparently.
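If I understand correctly, usage would then look roughly like this (a sketch assuming the sparse-wrapped parafac in tensorly.contrib.sparse.decomposition picks everything up once the backend is set; the exact behaviour depends on this PR):

import tensorflow as tf
import tensorly as tl
from tensorly.contrib.sparse.decomposition import parafac

tl.set_backend('tensorflow')

# A small sparse 3-way tensor, built directly as a tf.sparse.SparseTensor
st = tf.sparse.SparseTensor(indices=[[0, 0, 0], [1, 2, 3], [4, 1, 2]],
                            values=[1.0, 2.0, 3.0],
                            dense_shape=[5, 4, 6])

# The sparse-wrapped routine dispatches to the sparse implementations of the
# active (tensorflow) backend, so no separate 'tensorflow.sparse' name is needed.
factors = parafac(st, rank=2)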

@meyerzinn

What is the status of this PR? I'm hoping to use TensorLy for some large sparse PARAFAC decompositions and right now it's limited to CPU-only. Would TensorFlow support allow GPU acceleration?

@amanj120
Author

amanj120 commented Jul 7, 2020

Hello,

I've been really busy since school ended, so I haven't had time to look at this PR in a while. @JeanKossaifi I will try to look into the changes you mentioned this coming weekend. @20zinnm The changes I've made so far are CPU-only.

A GPU implementation for sparse PARAFAC might use Numba CUDA or some similar alternative, but that falls outside the scope of this project.

Author

@amanj120 left a comment

@JeanKossaifi can you take a look at the comments I left?

Also, the Travis builds are failing because of a unit test that I slightly altered. The documentation for sparse.random suggests that the shape of the random factor matrices should NOT be a tuple, so I changed the test code, but now I get a different error: ValueError: cannot reshape array of size 2862 into shape (2862,14036). Do you know why this might be?

def indices(tensor):
    """Returns all the indices of non zero values of the tensor in COO
    """
    raise NotImplementedError

Author

In order to have static indices and values methods for all sparse backends, I think I had to provide an implementation for the non-sparse backends too, which is why I am changing tensorly/backend/core.py.

@@ -43,11 +43,11 @@ def to_numpy(tensor):

     @staticmethod
     def ndim(tensor):
-        return len(tensor.get_shape()._dims)
+        return len(tl.to_numpy(tensor).shape)

Author

For some reason, the old implementation was failing for me. This might be a problem with the dynamic dispatching though. If it turns out my implementation of something else was wrong, I'll go back and fix this.
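A possible alternative, assuming both tf.Tensor and tf.sparse.SparseTensor expose .shape with a known rank, which would avoid the to_numpy round trip (just a sketch, not tested against this branch):

import tensorflow as tf

def ndim(tensor):
    # Both tf.Tensor and tf.sparse.SparseTensor expose .shape as a TensorShape,
    # so this handles dense and sparse inputs without converting to numpy.
    return len(tensor.shape)

# Example
dense = tf.zeros((2, 3, 4))
sparse = tf.sparse.SparseTensor(indices=[[0, 0]], values=[1.0], dense_shape=[5, 5])
print(ndim(dense), ndim(sparse))  # 3 2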



def sparse_mttkrp(tensor, factors, n, rank, dims):
    with sparse_context():
Author

I don't know if I need to have a with sparse_context(): in every method in this file. The reason I include this line is that I call the values and indices methods inside this method, and the implementations for those only exist when the backend is sparse.
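One hypothetical way to avoid repeating the with block, assuming the PR's sparse_context and the dispatched tl.values/tl.indices are already importable in this file (a sketch, not part of the PR):

from functools import wraps

def with_sparse_context(fn):
    # Hypothetical decorator: run the wrapped function inside sparse_context()
    # so each helper in the file does not need its own `with` statement.
    @wraps(fn)
    def inner(*args, **kwargs):
        with sparse_context():
            return fn(*args, **kwargs)
    return inner

@with_sparse_context
def sparse_mttkrp(tensor, factors, n, rank, dims):
    # tl.values / tl.indices now resolve against the sparse backend
    values = tl.values(tensor)
    indices = tl.indices(tensor)
    ...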



def test_tf_sparse_cpd():
    with sparse_context():
Author

Same question about with sparse_context(): here.



class TensorflowSparseBackend(Backend):
    backend_name = 'tensorflow'
Author

Should the backend_name be tensorflow or tensorflow.sparse? In numpy_backend the name is numpy.sparse, but if I set the name to tensorflow.sparse here, I get ValueError: Unknown backend name 'tensorflow.sparse', known backends are ['numpy', 'tensorflow'].

@amanj120 force-pushed the master branch 4 times, most recently from 6c80dbe to 99c6df2 (July 14, 2020 01:00)
def tensor(data, dtype=np.float32, device=None, device_id=None):
    if isinstance(data, tf.sparse.SparseTensor):
        return data
    elif isinstance(data, tuple):
Author

This elif branch is necessary in order to create tensors without having to import tensorflow. Importing tensorflow is what was causing the Travis builds to fail.
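For readers following along, a hedged guess at what that branch might look like, assuming the tuple is the usual (indices, values, dense_shape) triple (the actual PR code may differ):

import numpy as np
import tensorflow as tf

def tensor(data, dtype=np.float32, device=None, device_id=None):
    if isinstance(data, tf.sparse.SparseTensor):
        return data
    elif isinstance(data, tuple):
        # Hypothetical: let callers describe a sparse tensor as
        # (indices, values, dense_shape) so test code never has to construct
        # a tf.sparse.SparseTensor (and import tensorflow) itself.
        indices, values, dense_shape = data
        return tf.sparse.SparseTensor(indices=indices,
                                      values=tf.cast(values, dtype),
                                      dense_shape=dense_shape)
    # One possible fallback: densify whatever else was passed in, then sparsify.
    return tf.sparse.from_dense(tf.cast(data, dtype))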

@JeanKossaifi
Member

Hey @amanj120! Did you manage to make it work?

@JeanKossaifi
Member

Hi @amanj120 -- are you still working on this?
