Bulk Search & List Creator #7653

Open
4 tasks
Tracked by #3593
mekarpeles opened this issue Mar 15, 2023 · 8 comments · May be fixed by #9279
Assignees
Labels
Lead: @cdrini (Issues overseen by Drini; Staff: Team Lead & Solr, Library Explorer, i18n) [managed]
Needs: Breakdown (This big issue needs a checklist or subissues to describe a breakdown of work.) [managed]
Needs: Designs
Primary Focus (The active main quest of this contributor)
Priority: 2 (Important, as time permits.) [managed]
Theme: Design (Issues related to UI design, branding, etc.) [managed]
Theme: Lists (Issues related to reading Lists)
Type: Epic (A feature or refactor that is big enough to require subissues.) [managed]
Type: Feature Request (Issue describes a feature or enhancement we'd like to implement.) [managed]

Comments

@mekarpeles
Member

mekarpeles commented Mar 15, 2023

This issue calls for a new feature, Bulk Search, that lets a patron submit blocks of plaintext and see which books can be found inside. For instance:

  1. Patrick Collison has a website https://patrickcollison.com/bookshelf which references books:
[How Judges Think](http://amazon.com/dp/0674028201)
[Aftermath](http://amazon.com/dp/0830642838)
[The Outbreak of the Peloponnesian War](http://amazon.com/dp/0801405017)
[The Rise and Fall of American Growth](http://amazon.com/dp/0691147728): The U.S. Standard of Living since the Civil War (The Princeton Economic History of the Western World)
  2. The Book Extractor will parse the text and see if it can identify book titles, ISBNs, URLs, author names, or other useful book information.
  3. The tool then performs a search on Open Library using this data to find matching books.
  4. The patron then selects one or more matching Open Library books (choosing a desired edition, if relevant) and adds these books to one of their Open Library lists or reading log (e.g. Want to Read).
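
The parsing step above could be sketched roughly as follows. This is only an illustration and not the code behind the demo: it assumes the input contains Markdown-style links like the bookshelf sample, and that the 10-character ID after `/dp/` in an Amazon URL is an ISBN-10 when it passes the ISBN-10 checksum. The names `extract_candidates` and `is_valid_isbn10` are hypothetical.

```python
import re

# Markdown link of the form [Title](url)
MD_LINK = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")
# Amazon-style product path; assumption: the ID is an ISBN-10 only if it validates
AMAZON_DP = re.compile(r"/dp/([0-9]{9}[0-9Xx])(?![0-9A-Za-z])")


def is_valid_isbn10(candidate: str) -> bool:
    """ISBN-10 checksum: the digits weighted 10 down to 1 must sum to a multiple of 11."""
    if not re.fullmatch(r"[0-9]{9}[0-9Xx]", candidate):
        return False
    total = 0
    for position, char in enumerate(candidate):
        value = 10 if char in "Xx" else int(char)
        total += value * (10 - position)
    return total % 11 == 0


def extract_candidates(text: str) -> list[dict]:
    """Return one {"title", "isbn10"} dict per Markdown link found in the text."""
    candidates = []
    for title, url in MD_LINK.findall(text):
        match = AMAZON_DP.search(url)
        isbn = match.group(1).upper() if match else None
        if isbn is not None and not is_valid_isbn10(isbn):
            isbn = None  # keep the title, drop an ID that is not a valid ISBN-10
        candidates.append({"title": title.strip(), "isbn10": isbn})
    return candidates


sample = (
    "[How Judges Think](http://amazon.com/dp/0674028201)\n"
    "[Aftermath](http://amazon.com/dp/0830642838)\n"
)
for candidate in extract_candidates(sample):
    print(candidate)
```

Plain-text titles and author names with no link around them would need something beyond regular expressions, so this sketch only covers the easy, link-shaped case.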

Related to list improvements #7390 and Twitter borrowbot #3255

Demo by @cdrini:
https://codepen.io/cdrini/full/WNbjqRY

Tasks

  • Move demo into a page on Open Library, e.g. https://openlibrary.org/search/bulk
  • Formalize the code using, e.g., Vue 3
  • Add ability to pull text / find matches directly from a URL
  • Prototype a widget or Chrome extension which lets a patron check for books on any webpage they're on
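
For the "pull text directly from a URL" task, one possible starting point is to collect anchor text and link targets from a page's HTML and hand them to the same matching logic. The sketch below uses only Python's standard `html.parser`; it deliberately leaves out fetching the page, and the names `LinkTextExtractor` and `links_from_html` are hypothetical.

```python
from html.parser import HTMLParser


class LinkTextExtractor(HTMLParser):
    """Collect (anchor text, href) pairs from <a href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def links_from_html(html: str) -> list[tuple[str, str]]:
    parser = LinkTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


page = '<ul><li><a href="http://amazon.com/dp/0674028201">How Judges Think</a></li></ul>'
print(links_from_html(page))
```

Anchors without an `href` are skipped, and the surrounding page text is ignored, so this only covers pages that link their books the way the bookshelf example does.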
@mekarpeles mekarpeles added Type: Feature Request Issue describes a feature or enhancement we'd like to implement. [managed] Type: Epic A feature or refactor that is big enough to require subissues. [managed] Needs: Breakdown This big issue needs a checklist or subissues to describe a breakdown of work. [managed] Theme: Design Issues related to UI design, branding, etc. [managed] Priority: 2 Important, as time permits. [managed] Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] Lead: @DebbieSan Issues overseen by Debbie (Lists) [managed] Needs: Designs labels Mar 15, 2023
@mekarpeles
Member Author

@DebbieSan @cdrini :)

@tfmorris
Contributor

The technical terms for this (so people can find this issue) are Named Entity Recognition (for candidate identification) and Named Entity Linking (for resolving each candidate to a specific OL item).
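
To make the linking half concrete: once a candidate title or ISBN has been recognized, resolving it to an Open Library item could begin with a query against the public search endpoint. The sketch below only builds the request URL and makes no network call. It assumes that `https://openlibrary.org/search.json` accepts `q`, `title`, and `limit` parameters and that `isbn:` works as a field prefix inside `q`; the helper name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: Open Library's public search endpoint
SEARCH_ENDPOINT = "https://openlibrary.org/search.json"


def build_search_url(title=None, isbn=None, limit=5):
    """Prefer an exact ISBN lookup; fall back to a title search."""
    if isbn:
        params = {"q": f"isbn:{isbn}"}
    elif title:
        params = {"title": title}
    else:
        raise ValueError("need a title or an ISBN")
    params["limit"] = limit
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


print(build_search_url(isbn="0674028201"))
print(build_search_url(title="How Judges Think"))
```

An ISBN identifies a single edition, while a title search can return many works, which is why the patron still has to pick the right match by hand.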

Presumably this will only be done for a small set of languages to start with. How do users lobby for their favorite language(s) to be included in that set?

@JaydenTeoh
Collaborator

I am working on implementing NER for full-text search in my GSoC proposal as well. Perhaps we could collaborate on this issue! @cdrini @DebbieSan

@mekarpeles mekarpeles mentioned this issue Mar 15, 2023
1 task
@mekarpeles mekarpeles added the Theme: Lists Issues related to reading Lists label Mar 15, 2023
@DebbieSan
Contributor

@tfmorris thank you for this. I'm not sure how the implementation would go, but I was thinking of starting with the 10 languages currently available in Open Library. However, I'm not sure how doable that is. It is definitely something essential to think about and find a solution for.

@mekarpeles mekarpeles added this to the 2023 milestone Mar 20, 2023
@mekarpeles mekarpeles removed the Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] label Mar 20, 2023
@mekarpeles mekarpeles added Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] and removed Lead: @DebbieSan Issues overseen by Debbie (Lists) [managed] labels Sep 15, 2023
@mekarpeles mekarpeles changed the title Book Extractor Tool & List Creator Bulk Search & List Creator Oct 12, 2023
@mekarpeles mekarpeles modified the milestones: 2023, Sprint 2023-12 Nov 6, 2023
@lakshya-dhariwal
Contributor

How about a bookmarklet or an iframe for the POC of "check for books on any webpage"?

@Achorn
Contributor

Achorn commented Dec 4, 2023

comment for assignee

@cdrini cdrini added Lead: @mekarpeles Issues overseen by Mek (Staff: Program Lead) [managed] and removed Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] labels Mar 4, 2024
@cdrini cdrini assigned mekarpeles and unassigned cdrini Mar 4, 2024
@benbdeitch
Contributor

This seems like a somewhat daunting set of tasks, but I'd be thrilled to take it on. Could you assign me to it?

@mekarpeles mekarpeles assigned benbdeitch and unassigned mekarpeles Mar 18, 2024
@mekarpeles mekarpeles added Lead: @cdrini Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed] and removed Lead: @mekarpeles Issues overseen by Mek (Staff: Program Lead) [managed] labels Mar 18, 2024
@benbdeitch
Contributor

Hello! Could you assign me to this project, for my fellowship? I'd love to get this up and rolling.

@mekarpeles mekarpeles added the Primary Focus The active main quest of this contributor label May 10, 2024
@benbdeitch benbdeitch linked a pull request May 16, 2024 that will close this issue

8 participants