Index of start and end of entity #150

Open
grungert opened this issue May 3, 2023 · 3 comments
Labels
enhancement New feature or request

Comments


grungert commented May 3, 2023

Hi, this is an excellent tool for extraction. Is there a way for the output to include the start and end index of each entity in the sentence? It would be a great improvement, since it would make it possible to highlight the entity inside the sentence.

For example:
{'personal_info': [{'first_name': {'text': string, 'start': integer, 'end': integer}, 'last_name': {'text': 'Last name', 'start': integer, 'end': integer}}]}

Thanks
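
To make the request concrete, here is a minimal sketch of how such offsets could be used for highlighting. The `Span` type and the `highlight` helper are hypothetical names for illustration and are not part of the library. The sketch also assumes that `end` is exclusive, as in Python slices, which the request above does not specify.

```python
from typing import TypedDict


class Span(TypedDict):
    text: str   # the extracted value, verbatim
    start: int  # index of the first character in the source sentence
    end: int    # index one past the last character (assumed slice convention)


def highlight(sentence: str, span: Span, left: str = "[", right: str = "]") -> str:
    """Wrap the span's characters in markers, using only its offsets."""
    return (
        sentence[: span["start"]]
        + left
        + sentence[span["start"] : span["end"]]
        + right
        + sentence[span["end"] :]
    )


sentence = "My name is Ada Lovelace."
first_name: Span = {"text": "Ada", "start": 11, "end": 14}
print(highlight(sentence, first_name))  # My name is [Ada] Lovelace.
```

With an exclusive `end`, `sentence[start:end]` always equals `text`, which gives a cheap consistency check on any offsets that are produced.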

@eyurtsev
Owner

eyurtsev commented May 3, 2023

Hi @grungert, there's no way to do this currently, but it's something I'm planning to try implementing.

The solution I'm thinking of trying is to do a second pass with an LLM and ask it to locate the extracted results in the raw text.

Also, if you're interested in helping out with this feature, let me know! :)
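
One way the second-pass idea might look is sketched below. This is an assumption about a possible design, not the library's implementation: `build_prompt` and `localize` are hypothetical helpers, the prompt wording is invented, and `llm` stands for any callable that takes a prompt string and returns the model's reply. The sketch asks the model for a verbatim quote instead of numeric indices, on the assumption that counting characters is unreliable for a language model, and then computes the offsets locally.

```python
import json
from typing import Callable, Optional


def build_prompt(text: str, value: str) -> str:
    """Ask the model to quote the passage that supports an extracted value."""
    return (
        "Quote the exact substring of TEXT that corresponds to VALUE.\n"
        'Answer with JSON only, in the form {"quote": "..."}.\n'
        f"TEXT: {json.dumps(text)}\n"
        f"VALUE: {json.dumps(value)}"
    )


def localize(text: str, value: str, llm: Callable[[str], str]) -> Optional[dict]:
    """Second pass: get a quote from the model, then compute offsets locally."""
    try:
        quote = json.loads(llm(build_prompt(text, value)))["quote"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return None  # the model did not return the requested JSON
    if not isinstance(quote, str) or not quote:
        return None
    start = text.find(quote)
    if start == -1:
        return None  # reject quotes that are not verbatim in the text
    return {"text": quote, "start": start, "end": start + len(quote)}


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the sketch."""
    return '{"quote": "Ada"}'


print(localize("My name is Ada Lovelace.", "Ada", fake_llm))
```

Checking the quote against the raw text means a hallucinated answer yields `None` instead of wrong offsets.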

@eyurtsev eyurtsev self-assigned this May 3, 2023
@eyurtsev eyurtsev added the enhancement New feature or request label May 3, 2023
@smwitkowski
Contributor

@eyurtsev are you set on using an LLM call to get the start and end indices? Instead, you may be able to do a string search, assuming the extracted entity appears verbatim in the text.
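
A minimal sketch of the string-search approach, using only `str.find` from the standard library. The `find_spans` helper is a hypothetical name and not part of the library. It returns every verbatim occurrence, because the search alone cannot tell which occurrence the extraction refers to.

```python
from typing import Iterator, Tuple


def find_spans(text: str, entity: str) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) for every verbatim occurrence of entity in text."""
    if not entity:
        return  # an empty string would match at every position
    start = text.find(entity)
    while start != -1:
        yield start, start + len(entity)
        start = text.find(entity, start + 1)  # step by one to allow overlaps


text = "Ada met Ada's tutor."
print(list(find_spans(text, "Ada")))  # [(0, 3), (8, 11)]
```

When the list has more than one entry, the caller still has to choose among the candidates, for example by taking the first or by using surrounding context.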

@eyurtsev
Owner

@smwitkowski String search could be useful as well. It's less flexible than an LLM, but it's fast and basically free. If you want to work on this feature, let me know!
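
Some of that missing flexibility can be recovered cheaply. The sketch below assumes the extracted value may differ from the raw text only in letter case and in whitespace runs, and `find_span_loose` is a hypothetical helper name. It does not handle paraphrased or normalized values, such as a date that was rewritten into another format, which is where an LLM pass would presumably still be needed.

```python
import re
from typing import Optional, Tuple


def find_span_loose(text: str, entity: str) -> Optional[Tuple[int, int]]:
    """Find entity in text, ignoring case and differences in whitespace runs."""
    tokens = entity.split()
    if not tokens:
        return None
    # Escape each token so punctuation is matched literally, then allow any
    # run of whitespace between consecutive tokens.
    pattern = r"\s+".join(re.escape(tok) for tok in tokens)
    match = re.search(pattern, text, flags=re.IGNORECASE)
    return match.span() if match else None


print(find_span_loose("Contact: ada   LOVELACE, London", "Ada Lovelace"))  # (9, 23)
```

The returned offsets refer to the raw text, so the highlighted passage keeps the original casing and spacing.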
