Allow training models with non-CGN PoS tags #53

Open
proycon opened this issue Jun 27, 2018 · 2 comments

@proycon
Member

proycon commented Jun 27, 2018

Frog is currently quite tied to CGN (as we noticed in #52). I propose adding a style parameter for the tagger in frog.cfg to indicate whether a PoS tagset is CGN-like (tagger.style = cgn), i.e. it uses the HEAD(featurevalue,featurevalue) format, and to also allow training models that are not in that style (tagger.style = simple?). In that case the resulting PoS tags in the FoLiA would of course have no features at all and would just be considered blobs. Additionally, we could perhaps add another parameter value (tagger.style = cgn-full) for a more verbose CGN-style HEAD(subset=featurevalue,subset=featurevalue) format (suggested by @JessedeDoes), which would eliminate a lot of the disambiguation problems we currently face.
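
To make the three proposed styles concrete, here is a rough sketch in Python of how a tag string might be split under each one. This is not Frog code: `parse_tag`, `ParsedTag`, and the positional keys used for the plain cgn style are all hypothetical, and the subset names in the usage example are only illustrative.

```python
import re
from typing import NamedTuple


class ParsedTag(NamedTuple):
    head: str
    features: dict  # key (position or subset name) -> feature value


# Hypothetical style names, taken from the proposal above.
STYLES = {"simple", "cgn", "cgn-full"}

# HEAD(...) with anything (possibly nothing) between the brackets.
_TAG_RE = re.compile(r"^([^()]+)\((.*)\)$")


def parse_tag(tag: str, style: str = "cgn") -> ParsedTag:
    """Split a PoS tag into head and features according to the style."""
    if style not in STYLES:
        raise ValueError(f"unknown tagger style: {style!r}")

    # simple: the whole tag is an opaque blob without features.
    if style == "simple":
        return ParsedTag(tag, {})

    match = _TAG_RE.match(tag)
    if match is None:
        # No brackets at all: treat the tag as a bare head.
        return ParsedTag(tag, {})

    head, body = match.groups()
    values = [v.strip() for v in body.split(",") if v.strip()]

    if style == "cgn":
        # Feature values are anonymous, so key them by position
        # (an assumption of this sketch, not what Frog does).
        return ParsedTag(head, {str(i): v for i, v in enumerate(values)})

    # cgn-full: every feature carries its subset explicitly.
    features = {}
    for item in values:
        subset, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected subset=value, got {item!r}")
        features[subset.strip()] = value.strip()
    return ParsedTag(head, features)
```

For example, `parse_tag("N(soort,ev)", "cgn")` yields the head `N` with two anonymous feature values, whereas `parse_tag("N(ntype=soort,getal=ev)", "cgn-full")` names the subset of each value directly, so no lookup is needed to work out which subset `ev` belongs to. With `style="simple"`, the same string `N(soort,ev)` is kept whole as the head.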

@kosloot
Collaborator

kosloot commented Jul 5, 2018

Well...
Since #52 is solved for now, I see less urgency. I suggest waiting for a real use case.
Probably all that is needed is to make sure that Frog can handle 'unstructured' tags, i.e. tags without brackets and sub-information. It probably already can (or almost).
A more verbose format looks undesirable to me: more data, more parsing, etc.

@proycon
Member Author

proycon commented Jul 5, 2018

Yes, agreed.
