present participle identified as a gerund #19

Open
brandondrew opened this issue Jan 28, 2023 · 1 comment
@brandondrew

In the following example, "attaching" is not a gerund, but EngTagger identifies it as one:

irb(main):009:0> text = "I'm attaching a flyer with our information."
=> "I'm attaching a flyer with our information."
irb(main):010:0> tagged_text = tagger.add_tags(text)
=> "<prp>I</prp> <vbp>'m</vbp> <vbg>attaching</vbg> <det>a</det> <nn>flyer</nn> <...
irb(main):011:0> readable_tagged_text = tagger.get_readable(text)
=> "I/PRP 'm/VBP attaching/VBG a/DET flyer/NN with/IN our/PRPS information/NN ./PP"
irb(main):012:0>

"Attaching" is a present participle here: together with "'m" it forms the present progressive and serves as the main verb of the sentence. To be a gerund, it would need to act as a noun, as in the following sentence:

Attaching paper clips to small piles of papers is very boring.
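
As far as I know, the Penn Treebank tagset that this style of tag comes from defines VBG as "verb, gerund or present participle", so the tag alone cannot separate the two uses. Below is a minimal sketch of one possible post-processing heuristic, working only on the readable "word/TAG" string shown above: a VBG whose nearest preceding non-adverb token is a form of "be" is treated as a present participle in a progressive construction, and anything else is left as a gerund candidate. The names `BE_FORMS`, `parse_readable`, and `classify_vbg` are hypothetical and not part of EngTagger, and the tags in the second usage line are hand-written for illustration rather than actual tagger output.

```ruby
# Hypothetical helper, not part of EngTagger: forms of "be" that can
# introduce a progressive construction such as "I'm attaching".
BE_FORMS = %w[am 'm is 's are 're was were be been being].freeze

# Split EngTagger-style readable output ("word/TAG word/TAG ...") into
# [word, tag] pairs. rpartition keeps any "/" inside the word itself.
def parse_readable(readable)
  readable.split.map do |token|
    word, _, tag = token.rpartition("/")
    [word, tag]
  end
end

# For every VBG token, return [word, :participle] when the nearest preceding
# non-adverb token is a verb form of "be", and [word, :gerund_candidate]
# otherwise. This is a rough heuristic, not a grammatical analysis.
def classify_vbg(readable)
  pairs = parse_readable(readable)
  pairs.each_with_index.filter_map do |(word, tag), i|
    next unless tag == "VBG"

    # Skip adverbs (RB, RBR, RBS) so "I'm still attaching" is still caught.
    prev = pairs[0...i].reverse.find { |_, t| !t.start_with?("RB") }
    progressive = prev && prev[1].start_with?("VB") &&
                  BE_FORMS.include?(prev[0].downcase)
    [word, progressive ? :participle : :gerund_candidate]
  end
end

p classify_vbg("I/PRP 'm/VBP attaching/VBG a/DET flyer/NN with/IN our/PRPS information/NN ./PP")
# prints [["attaching", :participle]]

# Hand-written tags for the gerund example, for illustration only.
p classify_vbg("Attaching/VBG paper/NN clips/NNS to/TO small/JJ piles/NNS")
# prints [["attaching".capitalize, :gerund_candidate]]
```

The label is deliberately `:gerund_candidate` rather than `:gerund`: a participle in a reduced clause ("the person attaching the flyer") has no preceding "be" and would land in that bucket too, so the heuristic only separates out the progressive case.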

@yohasebe
Owner

You are right. engtagger's tagging is often inaccurate: it is a port of Perl's Lingua::EN::Tagger library, so problems in the statistical data provided by the original library are still present in engtagger.

For that reason, I have created another Ruby gem for better English sentence analysis. Try ruby-spacy. It requires Python's spaCy library to be installed on your system, but it is more accurate and richer in features.

That said, I think engtagger has its own merits: it lets you quickly check the part of speech of words without having to build a toolchain that involves Python.
