Parts of Speech tagging? #147

Open
giorgio79 opened this issue May 28, 2018 · 5 comments

Comments

@giorgio79

Would love to do POS tagging with this lib.
Maybe integrate with others?
https://github.com/FinNLP/en-pos

@Yomguithereal
Owner

Hello @giorgio79. There is an experimental version of the averaged perceptron used by spaCy here. It's undocumented, but it should work. On a side note, I am currently thinking of refocusing this library on fuzzy matching/clustering and dropping hard NLP tasks because I don't have much time. But I'd love to hear why you think you'd prefer to use this lib for POS tagging rather than the one you mention here.

@giorgio79
Author

Thx @Yomguithereal! JS NLP libs are maturing super fast. I am currently evaluating the options myself, such as

Joining forces would be a great way forward to avoid duplicated effort. Have you thought of combining with some of the others?
Otherwise, doing spaCy in JavaScript sounds fantastic but, as you say, it is a massive undertaking. At the moment, Natural seems to do a lot of what I need already, and I just thought I'd give others like Talisman a quick go.

@Yomguithereal
Owner

Yomguithereal commented May 28, 2018

As much as I'd love to contribute to JS's hard NLP libraries, I feel that my edge is much more in fuzzy matching/clustering, unfortunately: Google Refine-like stuff, for instance, and custom search engines.

@Yomguithereal
Owner

Basically, my strategy for the future will probably be to drop POS tagging and machine learning classifiers and focus on fuzzy clustering, distance metrics, keyers, phonetic algorithms, stemmers, and tokenizers. But I'd be willing to help other libraries scavenge whatever NLP pieces they could use from me, such as the POS tagger and the sentence tokenizer (Punkt notably).

@giorgio79
Author

Yeah, avoid reinventing the wheel where possible. E.g. NaturalNode already has tons of tokenizers here: https://github.com/NaturalNode/natural#tokenizers
