Demo embeddings #218

Open: kwalcock wants to merge 4 commits into main

Conversation

kwalcock
Member

No description provided.

@kwalcock
Member Author

@RazvanDu, your embeddings file was serialized and sent to artifactory.clulab.org. This PR shows how it can be accessed, but does not yet implement a grounder. Generally processors does not do the grounding; that needs to go in a different project, so I just put it here for now.

@RazvanDu
Contributor

Cool, got it.

}

object HabitusGrounder {
// TODO: A simple "deberta-base" fails! With v3, the special character is incorrect.
@kwalcock
Member Author

@RazvanDu, for your embedding_generator/main.py in processors, did you literally use the "microsoft/deberta-base" that is in the comments, or something else, when producing your deberta-embd.tsv? The deberta-base tokenizer does not seem to be available at the Rust level.

@RazvanDu
Contributor

That's weird. Yes, I used the exact values in the comments/example commands, and they work just fine.

@kwalcock
Member Author

There are some tokenizers that are only directly available to Python, but there is some way to export them. I just have to remember how that is done. Thanks!
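
For reference, one possible export route is sketched below. It is not taken from this PR: it assumes the Hugging Face `transformers` package is installed, that the model has a "fast" tokenizer, and that the Rust-level loader can read a standalone `tokenizer.json`. The function name `export_tokenizer` and the output directory are hypothetical.

```python
from pathlib import Path


def export_tokenizer(model_name: str, out_dir: str) -> Path:
    """Write the fast tokenizer for model_name to out_dir/tokenizer.json.

    Hypothetical helper: a sketch of one export route, not the method
    used in this PR.
    """
    # Imported here so that the module itself loads without transformers.
    from transformers import AutoTokenizer

    # Ask for the fast (Rust-backed) tokenizer; for some models this is
    # built by converting the slow Python tokenizer on load.
    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=True)
    if not tokenizer.is_fast:
        raise ValueError(f"No fast tokenizer is available for {model_name}")

    out_path = Path(out_dir) / "tokenizer.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)

    # backend_tokenizer is the underlying tokenizers.Tokenizer object;
    # save() serializes it to a single JSON file.
    tokenizer.backend_tokenizer.save(str(out_path))
    return out_path
```

Calling `export_tokenizer("microsoft/deberta-base", "exported")` would download the tokenizer files and, if the conversion succeeds, leave `exported/tokenizer.json` behind. Whether that file resolves the special character problem noted in the TODO is untested here.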

@RazvanDu
Contributor

Ah that makes sense, gotcha.


2 participants