Feature Request: Enhance Search #376

Open
ravensorb opened this issue Aug 2, 2015 · 17 comments

@ravensorb
Contributor

It would be very useful to support more complex search/query functionality. For example, the ability to search on an int range + a date range + a loose text search. Or even better, switch to something that is a little more of a standard (and that other tools/systems can understand) -- like OData (http://www.odata.org/). The query syntax for OData is rather robust and includes concepts for "like", "between", "greater than", "less than", "contains", etc.

Not sure, but Apache Olingo (https://olingo.apache.org/) might even be an option for a library to leverage?
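
To make the request concrete, here is a rough sketch of what a combined int range + date range + text query could look like with an OData-style `$filter`. Everything specific in it is an assumption for illustration: the `/odata/Events` endpoint and the `capacity`, `startDate` and `name` properties are hypothetical, and Structr does not expose such an endpoint today.

```python
from urllib.parse import urlencode, quote

def odata_filter_url(base_url, clauses):
    """Join filter clauses with 'and' and append them as an OData $filter."""
    filter_expr = " and ".join(clauses)
    # quote_via=quote encodes spaces as %20; safe="$" keeps the "$filter" key readable.
    return base_url + "?" + urlencode({"$filter": filter_expr}, safe="$", quote_via=quote)

# Hypothetical property names, used only to show the shape of the query.
clauses = [
    "capacity ge 10",                      # integer range, lower bound
    "capacity le 50",                      # integer range, upper bound
    "startDate ge 2015-08-01T00:00:00Z",   # date range, lower bound
    "startDate lt 2015-09-01T00:00:00Z",   # date range, upper bound
    "contains(tolower(name),'person')",    # case-insensitive text match
]

# Hypothetical endpoint; not an existing Structr URL.
url = odata_filter_url("https://localhost:8083/odata/Events", clauses)
print(url)
```

The point is that one parameter carries the whole condition, so ranges and text matches can be combined freely with `and`/`or`.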

@vigorouscoding
Member

Hi Shawn!

Have a look at https://docs.structr.org/REST-user-guide#Range%20Search

This example shows how to search for an integer range. The same concept works for dates (the dates need to be in ISO format, iirc).

Inexact search works like this: https://docs.structr.org/REST-user-guide#Inexact%20Search
The only catch is that you cannot mix exact and inexact search: the whole query is either exact or inexact.

Hope this helps!
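
As a rough sketch, the two kinds of request could be built like this. Note the assumptions: the `[min TO max]` range notation is my reading of the linked REST guide and should be checked against it, the `capacity` attribute is made up, and the host, port and `event` resource are placeholders. Only the `loose=1` parameter is taken from this thread.

```python
from urllib.parse import urlencode, quote

# Placeholder base URL; adjust host, port and resource to your installation.
BASE = "https://localhost:8083/structr/rest/event"

def search_url(**params):
    """Build a search URL, percent-encoding spaces and brackets."""
    return BASE + "?" + urlencode(params, quote_via=quote)

# Exact query with a range (assumed "[min TO max]" notation, hypothetical attribute).
range_url = search_url(capacity="[10 TO 50]")

# Inexact query: the whole request becomes loose once loose=1 is set.
loose_url = search_url(name="person", loose=1)

print(range_url)
print(loose_url)
```

Because the `loose` flag applies to the whole request, the two have to be sent as separate queries.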

@cmorgner
Member

cmorgner commented Aug 2, 2015

Have a look at the Cypher query language (if you haven't already). You can use Cypher queries in Structr as well; see https://docs.structr.org/#Data%20Binding

@ravensorb
Contributor Author

@cmorgner -- thanks, I have looked at using Cypher queries. They are not very friendly for REST clients (mobile applications, 3rd party integrations), and they go around the schema that is defined from a permissions perspective, which makes them less than useful.

@vigorouscoding I have indeed looked at the Range Search. The issue is that it is currently not possible to combine ALL of these into a single search. Right now, if I try to do a range search AND a loose search at the same time, it does not work.

Question -- is there a reason not to leverage one of the more established standards for REST-based URI syntax to make integration easier?

@ravensorb
Contributor Author

Another thing that I have noticed -- if I specify "loose" for the search, I will almost always get back all of the entities a user has access to.

Ex:
https://localhost:8083/structr/rest/event/default?name=Some%20Person%20at%20this%20Location&loose=1

Returns a result of all 8 records of type "Event" in no specific order, even when there is exactly 1 event with the specific name that I searched for. Wouldn't it make sense to return the exact match in this case? Or at least return the results in priority order (closest match first)?
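
To illustrate what "closest match first" could mean, here is a minimal client-side sketch that reorders a result list by string similarity using Python's standard `difflib`. The sample records are invented, and this is only a workaround on the client, not a description of how Structr orders or should order results.

```python
from difflib import SequenceMatcher

def rank_by_similarity(query, records, key="name"):
    """Return records sorted so the closest match to `query` comes first."""
    def score(record):
        # ratio() is 1.0 for identical strings and lower for weaker matches.
        return SequenceMatcher(None, query.lower(), record[key].lower()).ratio()
    return sorted(records, key=score, reverse=True)

# Invented sample data standing in for the entities returned by a loose search.
events = [
    {"name": "Another Person at that Location"},
    {"name": "Some Person at this Location"},
    {"name": "Team Lunch"},
]

ranked = rank_by_similarity("Some Person at this Location", events)
for event in ranked:
    print(event["name"])
```

With this ordering, an exact match always ends up at the top because it is the only record that scores 1.0.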

@cmorgner
Member

cmorgner commented Aug 2, 2015

There is a common mistake you can make when using curl from any kind of UNIX shell: you have to put the REST URL in quotes, otherwise the ampersand character is interpreted by the shell and the loose parameter will be ignored. Maybe that is the culprit here?
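
The effect can be sketched without running curl at all. The snippet below imitates what an unquoted `&` does (the shell ends the command there, so curl only receives the text before it) and compares the query parameters in both cases. The URL is shortened from the one posted above; the truncation step is a simulation of the shell's behavior, not actual shell output.

```python
import shlex
from urllib.parse import urlsplit, parse_qs

url = "https://localhost:8083/structr/rest/event/default?name=Some%20Person&loose=1"

# Unquoted: the shell treats '&' as a control operator, so the command
# handed to curl stops at the first '&'. Simulated here with a split.
seen_by_curl = url.split("&", 1)[0]
params_unquoted = parse_qs(urlsplit(seen_by_curl).query)

# Quoted: the full URL reaches curl, including the loose parameter.
params_quoted = parse_qs(urlsplit(url).query)

# shlex.quote produces a shell-safe, single-quoted form of the URL.
safe_command = "curl " + shlex.quote(url)

print(params_unquoted)
print(params_quoted)
print(safe_command)
```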

@ravensorb
Contributor Author

I am not sure I understand -- that was a URL from my browser not from curl.

@ravensorb
Contributor Author

Any chance of a unified search in the near future? Basically one syntax that supports exact and loose search queries by using Lucene for everything?

@dlaske
Contributor

dlaske commented Sep 7, 2015

That fix is in my pull request. The loose query accepts "not" and "range" queries, but I still have to write tests.

#332

@amorgner
Member

amorgner commented Sep 7, 2015

Thanks @ravensorb and @dlaske for your contribution, much appreciated!

We discussed the existing search functionality in the team, and we agree that there's much room for improvement, so you're pushing at an open door here. Not only is the loose keyword awkward and should be replaced by something better to express loose/fuzzy search, the existing notation is also too limited for range/not/and/or/etc. searches in general.

As we're currently implementing full-text search (based on Apache Tika), there are a lot of new functions and also new requirements. Because we would really like the new functionality to be available in the RESTful API as well, this should be another driver for unified URL semantics for search/query-by-URL.

The OData semantics are really appealing, and AFAICT we could implement them while still supporting the current search/query functionality.

That said, this issue is a very fundamental topic, and I'd really like to be sure that we implement "the right thing". So let's discuss this as deeply as necessary, and once we're sure about what we want and how to achieve it, let's just start. (Don't get me wrong: it's a matter of days/weeks for discussion, and I'm optimistic that we could implement an additional search/query API in the 2.0 timeframe, which is Sept/Oct '15.)

What do you think?

@ravensorb
Contributor Author

I think this sounds excellent 😄 Is the idea that Tika provides entity extraction from structured formats (pdf, docx, png, jpg, etc.), feeds the result to Lucene to support both structured and unstructured indexing, and that something like the OData URL $filter syntax is then used to search against the consolidated index?

@ravensorb
Contributor Author

Thought I would check how this is going. Any progress to report?

@amorgner changed the title from "Feature Request: Enchance Search" to "Feature Request: Enhance Search" on Sep 23, 2015
@amorgner
Member

No news here, it's currently on hold due to project work.

@ravensorb
Contributor Author

Just checking in on this request. The need for more robust search is becoming a limiting factor for me (I need and/or across multiple attributes, case-insensitive matching, and range + fuzzy search). Any chance this has made it back onto the road map?

@cmorgner self-assigned this on Aug 31, 2016
@cmorgner
Member

FYI: we're making good progress with the new Neo4j 3.0 Bolt driver, which will improve the search capabilities, especially for complex / combined search queries on multiple attributes.

@ravensorb
Contributor Author

Interesting -- what are your thoughts on where Bolt fits into search from a REST API perspective?

@cmorgner
Member

cmorgner commented Sep 6, 2016

Bolt will enable us to do complex combined queries, i.e. range queries plus inexact string search on multiple attributes etc. because we're translating the REST queries into Cypher. I hope that the new snapshot will perform much better on such queries.
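
To give an idea of what such a translation might produce, here is a minimal sketch that turns a range filter plus an inexact string filter into one parameterized Cypher query. It is not Structr's actual translation layer: the function name, the `Event` label and the `capacity`/`name` properties are hypothetical, and the query text uses present-day Cypher syntax (`$name` parameters, `toLower`, `CONTAINS`).

```python
def _ident(name):
    """Allow only plain identifiers as labels/properties to avoid injection."""
    if not name.isidentifier():
        raise ValueError(f"invalid identifier: {name!r}")
    return f"`{name}`"

def build_cypher(label, ranges=None, contains=None):
    """Return (query, params) combining range and substring conditions with AND."""
    clauses, params = [], {}
    for i, (prop, (low, high)) in enumerate((ranges or {}).items()):
        clauses.append(f"n.{_ident(prop)} >= $low{i} AND n.{_ident(prop)} <= $high{i}")
        params[f"low{i}"], params[f"high{i}"] = low, high
    for i, (prop, text) in enumerate((contains or {}).items()):
        # Lower-casing both sides gives a case-insensitive substring match.
        clauses.append(f"toLower(n.{_ident(prop)}) CONTAINS toLower($text{i})")
        params[f"text{i}"] = text
    where = " WHERE " + " AND ".join(clauses) if clauses else ""
    return f"MATCH (n:{_ident(label)}){where} RETURN n", params

# Hypothetical label and properties, for illustration only.
query, params = build_cypher(
    "Event",
    ranges={"capacity": (10, 50)},
    contains={"name": "person"},
)
print(query)
print(params)
```

Values travel as query parameters rather than being pasted into the query text, so only labels and property names need validating.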

@ravensorb
Contributor Author

Ahh, ok that makes sense -- I am all for improved performance :)
