[Feature Request]: Exact Searches #658

Open
1 of 2 tasks
AdamGaffney96 opened this issue Jan 28, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@AdamGaffney96

I've read the documentation

Your Feature Request

Is your feature request related to a problem? Please describe.

The issue I'm having is related to the search functionality. As you can see from the below screenshot, I have a bunch of seasons of a YouTube show downloaded. Unfortunately due to errors I was getting with playlist downloads specifically (I think related to the removal of the About section I've seen mentioned here in other places), I had to download all these episodes individually. What this means is that they're not easily grouped for me to just move through them one-by-one.
What I wanted to do was search for "Game changer season 2" to get Season 2 of the show, since the video titles are very consistent. However, seemingly due to a lack of exact searching, other seasons show up as well. Episode 2 of every other season contains every keyword I'm searching for, so those episodes appear alongside the actual Season 2, as do some other episodes that just happen to have "2" somewhere in the title. I can't find anything in the docs that refers to exact searching, the way you would use quotation marks in Google to match exact word order and phrasing.
I'm clearly looking for just Season 2, however the top results are Episode 2 of other seasons

Describe the solution you'd like

I'm listing this as a feature request as I'm just not sure if this is actually already built-in or not and I just can't find it. So ideally it's already in there and I'd like someone to just point me in the direction of it as I can't find it in the docs. Otherwise, I would like a way to search for an exact ordered list, perhaps using quotes "season 2" or just using another keyword like exact: season 2 for example.
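To make the proposed quote syntax concrete, here is a minimal sketch of how a search box could separate quoted phrases from loose terms. It is not TubeArchivist code: `split_query` is a hypothetical helper, and handling of an `exact:` keyword is left out.

```python
import re


def split_query(raw: str) -> tuple[list[str], list[str]]:
    """Split a raw search string into quoted phrases and loose terms.

    Hypothetical helper for illustration only.
    """
    # Everything between a pair of double quotes is treated as one exact phrase.
    phrases = re.findall(r'"([^"]+)"', raw)
    # Remove the quoted parts, then split what is left on whitespace.
    remainder = re.sub(r'"[^"]*"', " ", raw)
    terms = remainder.split()
    return phrases, terms


phrases, terms = split_query('game changer "season 2"')
print(phrases)  # ['season 2']
print(terms)    # ['game', 'changer']
```

The phrases could then be sent to the search backend as ordered matches, while the loose terms keep the current order-independent behaviour.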

Additional context

I am currently on v0.4.2 of TubeArchivist. I updated my docker-compose to try to pull the latest version, but I was getting errors and TubeArchivist would no longer load, so I've reverted to my last working version until I get a chance to test that again. I'm hoping I'm just missing where the feature is listed in the docs, but if not, it'd be great if it were implemented (and if it is already implemented in newer versions, please do let me know).

Any help would be appreciated, thank you!

Your help is needed!

  • Yes I will work on this in the next few days or weeks.
@bbilly1
Member

bbilly1 commented Feb 7, 2024

The current search uses bool_prefix, so it doesn't matter in which order the terms appear. That's why you see the results you see.

We could tweak the score and rank nearer matches higher, or implement as you suggest some exact matching to use phrase search instead.

Search will need pagination as a first priority, so additional improvements are quite far out.
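As a rough illustration of the difference described above, the sketch below builds two Elasticsearch `multi_match` query bodies: one with `type: bool_prefix` (terms may appear in any order) and one with `type: phrase` (terms must appear adjacent and in order). The `build_query` helper and the `title` field name are assumptions made for the example, not the project's actual query or index mapping.

```python
def build_query(text: str, fields: list[str], exact: bool = False) -> dict:
    """Build an Elasticsearch multi_match request body.

    Hypothetical helper: `exact` switches from order-independent
    bool_prefix matching to ordered phrase matching.
    """
    return {
        "query": {
            "multi_match": {
                "query": text,
                # "phrase" requires the terms in order and next to each other;
                # "bool_prefix" matches the terms in any order and treats the
                # last term as a prefix.
                "type": "phrase" if exact else "bool_prefix",
                "fields": fields,
            }
        }
    }


# "title" is an assumed field name, used only for illustration.
loose = build_query("game changer season 2", ["title"])
strict = build_query("season 2", ["title"], exact=True)
print(loose["query"]["multi_match"]["type"])   # bool_prefix
print(strict["query"]["multi_match"]["type"])  # phrase
```

With the phrase variant, a title that contains "season 5" and "episode 2" would not match "season 2", because the two terms are not adjacent in that order.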

@bbilly1 bbilly1 added the enhancement New feature or request label Feb 7, 2024
@kureta

kureta commented May 19, 2024

I have a related but different problem. I can open a new issue if necessary. The problem is that when I search for the word "Laplace", the first 8 results are "replace", "replacement", "replacing"; the word "Laplace" comes as the 9th result, then another "replace", then the second match containing the word "Laplace". This doesn't make sense to me, but I don't know anything about Elasticsearch. I guess it is looking up using a word dictionary and fuzzy matches with the words in that dictionary are prioritized before exact matches with a non-existing word.

@bbilly1
Member

bbilly1 commented May 19, 2024

I guess it is looking up using a word dictionary and fuzzy matches with the words in that dictionary are prioritized before exact matches with a non-existing word.

You can already configure fuzzy matching. Take a look at the docs.

@kureta

kureta commented May 19, 2024

I see. Just checked the docs but this doesn't exactly solve the issue. I want fuzzy search. "Laplace" should also match "Laplacian" but as I increase fuzziness "replace" overtakes "laplace". I can understand "replace" overtaking "laplacian" but "laplace" is a perfect match, why does it fall below replace?
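For context, Elasticsearch fuzziness is based on edit distance and allows at most two edits. The sketch below computes the plain Levenshtein distance for the words mentioned here. It only shows which words fall inside the fuzzy range; it does not reproduce Elasticsearch's scoring, so it does not by itself explain the ranking. `edit_distance` is an illustrative helper, not part of any library.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


for word in ("laplace", "replace", "laplacian"):
    print(word, edit_distance("laplace", word))
# laplace 0
# replace 2
# laplacian 3
```

By this measure "replace" is two edits away from "laplace" and so falls within a fuzziness of 2, while "laplacian" is three edits away and would need something other than fuzziness (for example prefix or stemmed matching) to be found.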

@bbilly1
Member

bbilly1 commented May 19, 2024

This is a good starting point if you want to learn more about how ES handles search relevance: https://www.elastic.co/what-is/search-relevance

3 participants