some question #5

Open

edchengg opened this issue Sep 24, 2018 · 2 comments
Hi Dawen,

Sorry to bother you. I am not familiar with this specific task setting, but I am interested in applying variational autoencoders to it. It would be really nice if you could help me understand the task when you are free.

As far as I understand, the VAE tries to reconstruct its input (in this task, a binary user-item vector such as [0, 0, 0, 1, 0, 1], where 1 indicates the user watched that particular movie and 0 indicates they did not). I see the log-softmax layer at the end of the decoder, so the output of the VAE is a log probability distribution over all items for one user.

My question is: for evaluation, how can the VAE make a prediction (pred_val) if the decoder simply reconstructs the input from the latent space? I think I am misunderstanding vad_data_tr and vad_data_te.

for bnum, st_idx in enumerate(range(0, N_vad, batch_size_vad)):
    end_idx = min(st_idx + batch_size_vad, N_vad)
    X = vad_data_tr[idxlist_vad[st_idx:end_idx]]

    if sparse.isspmatrix(X):
        X = X.toarray()
    X = X.astype('float32')

    # predicted scores over all items for this batch of validation users
    pred_val = sess.run(logits_var, feed_dict={dae.input_ph: X})
    # exclude examples from training and validation (if any)
    pred_val[X.nonzero()] = -np.inf
    ndcg_dist.append(NDCG_binary_at_k_batch(
        pred_val, vad_data_te[idxlist_vad[st_idx:end_idx]]))
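
For reference, here is my rough understanding of the metric being computed: a simplified dense-numpy sketch of binary NDCG@k. I assume NDCG_binary_at_k_batch in the repo computes something equivalent (probably more efficiently):

import numpy as np

def ndcg_binary_at_k(pred_scores, heldout, k=100):
    # pred_scores: (n_users, n_items) predicted scores, seen items already -inf
    # heldout: (n_users, n_items) binary matrix of held-out interactions
    n_users = pred_scores.shape[0]
    topk = np.argsort(-pred_scores, axis=1)[:, :k]  # top-k items per user
    discounts = 1.0 / np.log2(np.arange(2, k + 2))  # rank-position discounts
    dcg = np.array([(heldout[u, topk[u]] * discounts).sum()
                    for u in range(n_users)])
    # ideal DCG: all held-out items ranked at the very top
    idcg = np.array([discounts[:min(int(heldout[u].sum()), k)].sum()
                     for u in range(n_users)])
    return dcg / np.maximum(idcg, 1e-10)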

Thanks for your time.


samlobel commented Oct 3, 2018

In this paper, the VAE also has a denoising component: first you apply dropout to the input, and then train the model to recreate the uncorrupted input. That way the model learns to put probability mass on items it does not see in the input.
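
Schematically, one training step looks something like this. This is just a minimal numpy sketch of the denoising objective; the encoder/decoder weights here are random made-up stand-ins, not the repo's TensorFlow model:

import numpy as np

rng = np.random.default_rng(0)

def log_softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

# one batch of binary click vectors (users x items)
x = (rng.random((4, 10)) < 0.3).astype('float32')

# 1) corrupt the input with dropout
keep_prob = 0.5
mask = (rng.random(x.shape) < keep_prob).astype('float32')
x_corrupted = x * mask / keep_prob

# 2) run the corrupted input through a (random, untrained) autoencoder
W_enc = rng.normal(size=(10, 3))
W_dec = rng.normal(size=(3, 10))
logits = np.tanh(x_corrupted @ W_enc) @ W_dec

# 3) the reconstruction target is the UN-corrupted x, so minimizing this
#    loss forces the model to put mass on items hidden by the dropout
neg_ll = -(x * log_softmax(logits)).sum(axis=1).mean()
print(neg_ll)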

dawenl (Owner) commented Oct 24, 2018

Dropout is important. But even without dropout, the decoder has to reconstruct every training input. Unless your model has infinite capacity to memorize every training data point (which is pretty hard for such sparse, high-dimensional data), it will learn to generalize, putting probability mass on items that are relevant to the input. This idea is not new to autoencoders or VAEs; the same old matrix factorization works in exactly the same way.
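
To see the same effect with matrix factorization, here is a toy numpy example (with a rank-2 truncated SVD standing in for the factorizer):

import numpy as np

# toy binary click matrix: users 0 and 1 have very similar histories
X = np.array([[1, 1, 1, 0, 0],
              [1, 1, 0, 0, 0],
              [0, 0, 0, 1, 1],
              [0, 0, 1, 1, 1]], dtype=float)

# rank-2 reconstruction via truncated SVD
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_hat = (U[:, :2] * s[:2]) @ Vt[:2]

# the low-rank model cannot memorize X exactly, so it spills score onto
# unobserved-but-relevant entries: item 2 for user 1 gets ~0.45 even
# though X[1, 2] == 0, because user 0 (a near-twin) watched it
print(X_hat[1, 2])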
