pos-tags, "muss" #2

Open
mxi-hug opened this issue Dec 6, 2018 · 1 comment

@mxi-hug
mxi-hug commented Dec 6, 2018

A small observation on the POS tags:

count("GERMAPARL", query = '"muss"', cqp = TRUE, breakdown = TRUE, p_attribute = "pos")

This returns roughly 30k hits tagged pos = "NN", which is about 20% of all hits.

As I'm not familiar with the POS tagger, I have no idea whether it is possible or feasible to improve the results.

@ablaette
Collaborator

This issue is old, but still relevant. I just inspected the NN-tagged "muss" matches using this snippet:

k <- kwic("GERMAPARL", query = '[word = "muss" & pos = "NN"]', cqp = TRUE)

I would hope that our POS tagger performed better than that. We have started to use StanfordNLP and need to inspect these POS tags.
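
To put a number on the mistagging, the NN hits could be compared with the hits carrying the expected verb tag. The following is only a sketch: it assumes the corpus uses the STTS tagset (which the NN tag suggests), where a finite modal verb such as "muss" would be tagged VMFIN, and that GERMAPARL is installed locally. The two queries are illustrative and not taken from the thread.

```r
library(polmineR)  # assumes the GERMAPARL corpus is installed and registered

# Assumption: STTS tags, i.e. NN = common noun, VMFIN = finite modal verb.
# CQP matches case-sensitively by default, so "muss" does not match
# the capitalised noun "Muss".
queries <- c(
  '[word = "muss" & pos = "NN"]',
  '[word = "muss" & pos = "VMFIN"]'
)

# Count the hits for each query so the two tags can be compared side by side
count("GERMAPARL", query = queries, cqp = TRUE)
```

If nearly all remaining hits turn out to be VMFIN, the NN share reported above would be a direct estimate of the tagger's error rate for this word form.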

ablaette reopened this Feb 22, 2021