Add Features to Sentencevectors #57

Open
oborchers opened this issue Dec 3, 2021 · 0 comments

@oborchers
Owner

[ ] Sentencevectors:
Global:
[ ] Remove normalized vector files and replace with NN
ANN: --> (Annoy, with Option for Google ScANN?)
[ ] Only construct index when calling most_similar method
[ ] Logging of index speed
[ ] Save and load of index
[ ] Assert that index and vectors are of equal size
[ ] Parameters must be tunable afterwards
[ ] Method to reconstruct index
[ ] How does the index saving comply with SaveLoad?
[ ] Write unit tests?
Brute:
[ ] Keep access to default method
[ ] Make ANN Search the default?! --> Results?
[ ] Throw warning for large datasets for vector norm init
[ ] Maybe throw warning if exceeds RAM size of the embedding + normalization
Other:
[ ] L2 Distance
[ ] L1 Distance
[ ] Correlation (Power Score Correlation?)
[ ] Lookup-Functionality (via defaultdict)
[ ] Get vector: Not really memory friendly
[ ] Show which words are in vocabulary
[ ] Assess empty vectors (via EPS sum)
[ ] Z-Score Transformation from Power-Means Embedding? --> Benefit?
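
A rough sketch of how several of the items above could fit together: the index is only built on the first `most_similar` call, the build time is logged, the index can be rebuilt with a different metric afterwards, index and vector sizes are checked, and empty vectors are detected via an EPS sum. Everything here is hypothetical (`LazyBruteIndex`, `is_empty`, `METRICS`, the `EPS` value); it is not the fse API. The "index" is a plain brute-force copy standing in for Annoy or ScaNN, and it uses only the standard library so it runs without NumPy.

```python
import logging
import math
import time

logger = logging.getLogger(__name__)

EPS = 1e-8  # assumed threshold, not a value taken from this issue


def is_empty(vector, eps=EPS):
    """Treat a vector as empty when the sum of its absolute values is below eps."""
    return sum(abs(x) for x in vector) < eps


def _cosine_distance(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    if norm < EPS:
        return float("inf")  # undefined for zero vectors, so rank them last
    return 1.0 - sum(x * y for x, y in zip(a, b)) / norm


# Distance functions keyed by name; smaller means more similar.
METRICS = {
    "cosine": _cosine_distance,
    "l2": lambda a, b: math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))),
    "l1": lambda a, b: sum(abs(x - y) for x, y in zip(a, b)),
}


class LazyBruteIndex:
    """Hypothetical wrapper: builds its index lazily, on first query."""

    def __init__(self, vectors, metric="cosine"):
        self.vectors = vectors
        self.metric = metric
        self._index = None  # nothing is built at construction time

    def rebuild(self, metric=None):
        """(Re)construct the index, optionally with a new metric."""
        if metric is not None:
            self.metric = metric
        start = time.perf_counter()
        # Stand-in for an ANN build step: here just a copy of the vectors.
        self._index = [list(v) for v in self.vectors]
        logger.info(
            "built %s index over %d vectors in %.6fs",
            self.metric, len(self._index), time.perf_counter() - start,
        )

    def most_similar(self, query, topn=3):
        """Return up to topn (position, distance) pairs, closest first."""
        if self._index is None:
            self.rebuild()
        # Explicit check instead of `assert`, which is stripped under `python -O`.
        if len(self._index) != len(self.vectors):
            raise ValueError("index and vectors differ in size; call rebuild()")
        dist = METRICS[self.metric]
        scored = [
            (i, dist(query, v))
            for i, v in enumerate(self._index)
            if not is_empty(v)  # skip empty sentence vectors
        ]
        scored.sort(key=lambda pair: pair[1])
        return scored[:topn]
```

Usage would look roughly like `LazyBruteIndex(vectors).most_similar(query, topn=10)`, with `rebuild(metric="l2")` covering the "tunable afterwards" and "reconstruct index" items. Saving and loading the index is left out on purpose, since how that interacts with SaveLoad is still an open question above.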

@oborchers oborchers self-assigned this Dec 3, 2021
@oborchers oborchers added the enhancement New feature or request label Dec 3, 2021