Handling Nested Entities #57

Open
jpcorb20 opened this issue Jul 13, 2022 · 3 comments

Comments

@jpcorb20

Hello,

First, thanks for the great lib!

I was wondering whether you could confirm that the metrics handle nested entities (i.e. entity spans inside one or more other entity spans) as well as flat ones. The examples only show flat entities. My guess is that the strict metric should hold in both cases.

Thanks in advance

@ivyleavedtoadflax
Collaborator

Hi @jpcorb20 thanks for raising an issue. Could you provide a quick example?

@jpcorb20
Author

jpcorb20 commented Jul 14, 2022

Hello @ivyleavedtoadflax,

thanks for the quick reply!

So, an example of nested entities would be:

Joe Smith, president of the United States, went to Toronto, Canada, in February 2009.
(19 tokens when each punctuation mark is counted as a separate token; token indices are zero-based)

( ( Joe )FIRSTNAME ( Smith )LASTNAME )PERSON, president of the ( ( United States )COUNTRY )LOCATION, went to ( ( Toronto )CITY, ( Canada )COUNTRY )LOCATION, in ( ( February )MONTH ( 2009 )YEAR )DATE.

List of entities that would be the reference:

entity_reference_list = [
    {"label": "PERSON", "start": 0, "end": 1},
    {"label": "FIRSTNAME", "start": 0, "end": 0},
    {"label": "LASTNAME", "start": 1, "end": 1},
    {"label": "LOCATION", "start": 6, "end": 7},
    {"label": "COUNTRY", "start": 6, "end": 7},
    {"label": "LOCATION", "start": 11, "end": 13},
    {"label": "CITY", "start": 11, "end": 11},
    {"label": "COUNTRY", "start": 13, "end": 13},
    {"label": "DATE", "start": 16, "end": 17},
    {"label": "MONTH", "start": 16, "end": 16},
    {"label": "YEAR", "start": 17, "end": 17},
]

As you can see in this example, you can have many entities on the same span as well as sub-spans nested inside larger ones.
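
To make the strict-match guess concrete, here is a minimal sketch of how strict scoring could work on nested spans. It is not this library's implementation: `to_span_set` and `strict_scores` are hypothetical helper names, and the sketch assumes an entity counts as correct only when its label, start and end all match exactly. Under that assumption nesting needs no special handling, because each (label, start, end) triple is compared on its own.

```python
from typing import Dict, List, Set, Tuple

# (label, start, end), with inclusive token indices as in the example above.
Span = Tuple[str, int, int]


def to_span_set(entities: List[Dict]) -> Set[Span]:
    """Hypothetical helper: turn entity dicts into hashable triples."""
    return {(e["label"], e["start"], e["end"]) for e in entities}


def strict_scores(reference: List[Dict], predicted: List[Dict]) -> Dict[str, float]:
    """Hypothetical helper: exact-match precision, recall and F1.

    An entity is correct only if label, start and end are all identical.
    Nested or overlapping spans are simply distinct triples in the sets.
    """
    ref, pred = to_span_set(reference), to_span_set(predicted)
    correct = len(ref & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"correct": correct, "precision": precision, "recall": recall, "f1": f1}


# The nested PERSON annotation from the example sentence.
reference = [
    {"label": "PERSON", "start": 0, "end": 1},
    {"label": "FIRSTNAME", "start": 0, "end": 0},
    {"label": "LASTNAME", "start": 1, "end": 1},
]

# An illustrative prediction that gets the LASTNAME boundary wrong.
predicted = [
    {"label": "PERSON", "start": 0, "end": 1},
    {"label": "FIRSTNAME", "start": 0, "end": 0},
    {"label": "LASTNAME", "start": 0, "end": 1},
]

scores = strict_scores(reference, predicted)
print(scores)
```

Here two of the three predicted spans match exactly, so precision and recall are both 2/3. Because the sketch uses sets, an identical triple listed twice is counted once. Whether the package's partial or type-based metrics behave sensibly on nested spans is a separate question that this sketch does not address.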

@ivyleavedtoadflax
Collaborator

Thanks @jpcorb20 for clarifying. So this isn't a use case we have tried before with the package. I'd really want to write some tests to confirm that it works as expected. We're not actively developing the package at the moment though, so I can't give you a timeline for when that might be. If you want to put in a PR yourself with some new tests, we can help you test and merge it.
