Commit
tweets to flits
veekaybee committed Jul 24, 2023
1 parent 45c1338 commit f10f188
Showing 1 changed file with 1 addition and 1 deletion.
embeddings.tex: 1 addition & 1 deletion
@@ -1038,7 +1038,7 @@ \subsubsection*{Embeddings as larger feature inputs}
\end{tikzpicture}


- The factorization of our feature matrix into these two matrices, where the rows in Q are embeddings \citep{levy2014neural} for users and the rows in matrix P are embeddings for Tweets, allows us to fill in values for flits that Flutter users have not explicitly liked, and then to search across the matrix for other flits they might be interested in. The end result is our set of generated recommendation candidates, which we then filter downstream and surface to the user.
+ The factorization of our feature matrix into these two matrices, where the rows in Q are embeddings \citep{levy2014neural} for users and the rows in matrix P are embeddings for flits, allows us to fill in values for flits that Flutter users have not explicitly liked, and then to search across the matrix for other flits they might be interested in. The end result is our set of generated recommendation candidates, which we then filter downstream and surface to the user.
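The factorization described in this paragraph can be sketched in a few lines of numpy. This is a minimal illustration, not Flutter's actual pipeline: the interaction matrix R, the rank `n_factors`, the learning rate, and the toy "likes" are all invented, and plain stochastic gradient descent over observed entries stands in for whatever solver a production system would use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy interaction matrix R: 4 Flutter users x 5 flits; 1 = explicit like,
# 0 = no interaction (the blanks we want to fill in).
R = np.array([
    [1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1],
], dtype=float)

n_factors = 2
Q = rng.normal(scale=0.3, size=(R.shape[0], n_factors))  # rows: user embeddings
P = rng.normal(scale=0.3, size=(R.shape[1], n_factors))  # rows: flit embeddings

lr, reg = 0.05, 0.01
observed = np.argwhere(R > 0)  # train only on the explicit likes
for _ in range(1000):
    for u, i in observed:
        err = R[u, i] - Q[u] @ P[i]          # residual on one observed like
        Q[u] += lr * (err * P[i] - reg * Q[u])
        P[i] += lr * (err * Q[u] - reg * P[i])

# Q @ P.T produces a score for every (user, flit) pair, including flits the
# user never interacted with; the top-scoring unseen flits become candidates.
scores = Q @ P.T
print(np.round(scores, 2))
```

Ranking each user's row of `scores` over the flits with `R[u, i] == 0` yields the generated candidates that the paragraph above describes being filtered downstream.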

In this base-case scenario, each column could be a single word in the entire vocabulary of every flit we have, and the vector we create, shown in the matrix frequency table, would be an enormous, sparse vector with a $0$ for every vocabulary word that does not occur in the given flit. We can build toward this representation by starting with a structure known as a \textbf{bag of words}, which is simply the frequency with which each word appears in a given document (in our case, each flit is a document). This matrix is the input data structure for many of the early approaches to embedding.
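A bag-of-words matrix like the one described here can be built with the standard library alone; the three example flits and the variable names below are invented for illustration:

```python
from collections import Counter

flits = [
    "birds fly south",
    "south winds rise",
    "birds sing",
]

# Vocabulary: every distinct word across all flits, one column per word.
vocab = sorted({word for flit in flits for word in flit.split()})

# One row per flit: the count of each vocabulary word, 0 when absent,
# which is exactly the sparse frequency matrix described above.
bow = [[Counter(flit.split())[word] for word in vocab] for flit in flits]

print(vocab)
for row in bow:
    print(row)
```

With a realistic vocabulary of tens of thousands of words, almost every entry in each row is $0$, which is why these vectors are described as large and sparse.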

