remove_first_principal_component for Smooth Inverse Frequency in Simple Sentence Similarity.ipynb #10

ziweiji · 2020-08-05T16:43:58Z

This concerns the Smooth Inverse Frequency section of Simple Sentence Similarity.ipynb.

In your code, the embeddings of sentences1 and sentences2 are merged, and remove_first_principal_component is applied to all of them together:

        embeddings.append(embedding1)
        embeddings.append(embedding2)
embeddings = remove_first_principal_component(np.array(embeddings))

However, in the original code for the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings" (https://github.com/PrincetonML/SIF/blob/master/src/sim_algo.py), the authors compute embedding1 and embedding2 separately, including the remove_first_principal_component step:

emb1 = SIF_embedding.SIF_embedding(We, x1, w1, params)
emb2 = SIF_embedding.SIF_embedding(We, x2, w2, params)

I wonder whether this difference influences the result considerably.
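To make the question concrete, here is a minimal sketch of how the two variants could be compared. It runs on synthetic random embeddings, not on the notebook's data, so it says nothing about the real results. The helper names first_pc, remove_pc and cos_rows are hypothetical, and I am assuming the component comes from an uncentered SVD.

```python
import numpy as np

def first_pc(X):
    """First right singular vector of X (unit length, no centering; an assumption)."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[0]

def remove_pc(X, pc):
    """Subtract from every row of X its projection onto the unit vector pc."""
    return X - np.outer(X @ pc, pc)

def cos_rows(A, B):
    """Row-wise cosine similarity between two matrices of equal shape."""
    num = np.sum(A * B, axis=1)
    den = np.linalg.norm(A, axis=1) * np.linalg.norm(B, axis=1)
    return num / den

# Synthetic stand-ins for the weighted-average sentence embeddings.
rng = np.random.default_rng(0)
emb1 = rng.normal(size=(50, 20)) + 1.0
emb2 = rng.normal(size=(50, 20)) + 1.0

# Variant A (notebook): one component estimated from both sets together.
stacked = np.vstack([emb1, emb2])
joint = remove_pc(stacked, first_pc(stacked))
joint1, joint2 = joint[:50], joint[50:]

# Variant B (reference code): one component per set.
sep1 = remove_pc(emb1, first_pc(emb1))
sep2 = remove_pc(emb2, first_pc(emb2))

# Per-pair difference in similarity between the two variants.
diff = np.abs(cos_rows(joint1, joint2) - cos_rows(sep1, sep2))
print("mean abs difference in cosine similarity:", diff.mean())
```

Running the same comparison on the actual embeddings and looking at the correlation with the gold scores would show whether the difference matters in practice.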

I am working on a query task, so sentences1 contains only one sentence (the query). Should I (1) merge the query and the answers and apply remove_first_principal_component to them together, (2) compute embedding1 for the query and embedding2 for the answers separately, or (3) save the SVD of the answers (sentences2) and then remove the first principal component of sentences2 from the weighted embedding of the query (sentences1)?
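For reference, option (3) could look roughly like the sketch below. The data is synthetic, and fit_first_pc and remove_pc are hypothetical helpers that assume an uncentered SVD. It also shows one thing I noticed about option (2): with a single query, the first component of a one-row matrix is the row itself (normalized), so removing it leaves a zero vector.

```python
import numpy as np

def fit_first_pc(X):
    """First right singular vector of X (unit length, no centering; an assumption)."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[0]

def remove_pc(X, pc):
    """Subtract from every row of X its projection onto the unit vector pc."""
    return X - np.outer(X @ pc, pc)

# Synthetic stand-ins for the weighted-average embeddings.
rng = np.random.default_rng(0)
answers = rng.normal(size=(200, 20)) + 1.0  # sentences2
query = rng.normal(size=(1, 20)) + 1.0      # sentences1, a single sentence

# Option (3): estimate the component once from the answers, then reuse it.
pc = fit_first_pc(answers)
answers_clean = remove_pc(answers, pc)
query_clean = remove_pc(query, pc)

# Cosine similarity of the query against every answer.
scores = (answers_clean @ query_clean[0]) / (
    np.linalg.norm(answers_clean, axis=1) * np.linalg.norm(query_clean[0])
)
best = int(np.argmax(scores))

# Option (2) with a single query: the result is numerically a zero vector.
query_alone = remove_pc(query, fit_first_pc(query))
print("norm of query after option (2):", np.linalg.norm(query_alone))
```

If this is right, option (2) cannot be used as-is when sentences1 holds one sentence, which leaves (1) and (3) as the candidates.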
