Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected behavior: #union_cypher returns String object rather than Query object #308

Open
jorroll opened this issue Dec 15, 2017 · 9 comments

Comments

@jorroll
Copy link

jorroll commented Dec 15, 2017

While attempting to create a query involving the union of three sub-queries, I discovered that union_cypher expects a query object but returns a string rather than another query object. This prevents me from performing a union query on a union query. Given that all the other query methods return a query object, is there a reason for this behavior? I guess it kinda explains the name union_cypher

# @example
# # Generates cypher: MATCH (n:Person) UNION MATCH (o:Person) WHERE o.age = 10
# q = Neo4j::Core::Query.new.match(o: :Person).where(o: {age: 10})
# result = Neo4j::Core::Query.new.match(n: :Person).union_cypher(q)
#
# @param other [Query] Second half of UNION
# @param options [Hash] Specify {all: true} to use UNION ALL
# @return [String] Resulting UNION cypher query string
def union_cypher(other, options = {})
"#{to_cypher} UNION#{options[:all] ? ' ALL' : ''} #{other.to_cypher}"
end

How would you feel about a PR to either create a new query method union that returns another query object for further chaining?

Alternatively (and easier to implement), union_cypher could be updated so that it tests to see if its argument is already a string and, if so, simply inserts it without first calling to_cypher on the argument.

Thoughts?

@cheerfulstoic
Copy link
Contributor

I'd be fine with a union method that returns a Query object, though I have a lot of questions (which is part of why I didn't make it in the first place):

  • When you chain, do the new clauses go into the last query in the UNION or onto all queries in the UNION?
  • Would the method validate that the name number of columns exist in both queries (and maybe that they are the same names, I don't know what Neo4j's UNION requires)?

Generally, I think the main reason I didn't implement a UNION is because the underlying structure of Query doesn't feel like it jives with a Query representing a UNION

I think I'd be fine with union_cypher taking a string, but what about the idea of union_cypher taking multiple Query arguments (perhaps via a def union_cypher(*others, options = {}) definition)?

@jorroll
Copy link
Author

jorroll commented Dec 21, 2017

what about the idea of union_cypher taking multiple Query arguments (perhaps via a def union_cypher(*others, options = {}) definition)?

I like that 👍 .

Regarding a union query method, I see what you mean about confusing to implement. My main annoyance with union_cypher is that I need to put it into Neo4j::ActiveNode.query(). I can't simply call .exec or pluck() on it.

Maybe it would be best to leave a union() method out, but the implementation that comes to mind would be:

  1. Calling .union() on a query always adds a new union clause to the query. Once added, that union clause cannot be modified. Calling query_as(:n).match(n: {color: 'purple'}).union(query_as(:n).match(n: {color: 'blue'})) would be the same as query_as(:n).union(query_as(:n).match(n: {color: 'blue'})).match(n: {color: 'purple'}). I.e. additional query clause methods modify the original query, ignoring any union clauses.

  2. It looks like Neo4j expects all union returns to have the same columns (both name and number). So it would make sense to validate that and also: if you plucked the root query directly (.pluck()) to simply overwrite the return of the root and all unions so that they returned whatever was in the pluck.

Does this make sense? Thoughts?

@cheerfulstoic
Copy link
Contributor

Yeah, I would be fine with what you’re proposing, though I would warn that the clause logic can get a bit complicated if you were thinking about implementing this.

Another thought: union_pluck and union_return which do the same as union_cypher but pluck / return for you.

@jorroll
Copy link
Author

jorroll commented Dec 28, 2017

I see what you mean about the logic being confusing. I've got an implementation of .union and .union_all() query clause methods working in #309. Tests still need to be written and the code cleaned up, but I wanted to run it by you first.

I also tried to implement an improved union_cypher, before realizing that def union_cypher(*others, options = {}) wouldn't work, as options will never get a value. Not 100% sure if ruby would even let me put a splat before a regular argument like that. I decided not to pursue it further since I think .union & .union_all will provide better replacements.

@jorroll
Copy link
Author

jorroll commented Dec 29, 2017

PS: I've seen you mention a few times how you have mixed feelings about neo4j-core's automatic query clause ordering (occasionally necessitating the use of.break). I'll admit, upon first usage I was surprised by the feature and it took me a bit to figure out what was going on. This being said, once I figured out what was going on, I found the api to be easy to work with, and far more powerful than an api without query ordering. I've definitely made use of the feature in a few places, and it would be very tedious / annoying to replicate on my own (but it's very easy to work around with .break, on the occasions I don't want it). My two cents.

@cheerfulstoic
Copy link
Contributor

@thefliik That's fair, though I think there might be room for a hybrid implementation. Like certain clauses should combine, but not others. Like if you have multiple where method calls in order those should probably combine. But if there's a match in between, they shouldn't because WHERE is actually a modifier of MATCH (most people don't realize this and it causes strange issues because if you try do do MATCH ... WHERE ... MATCH ... WHERE ... and the second WHERE refers to a variable which was created in the first MATCH, you won't actually get filtering.

@cheerfulstoic
Copy link
Contributor

Something that I just thought of: It's all fine to be able to understand a tool once you've figured it out, but in order to increase adoption I think it's important to make sure that things are as easy as possible (without compromising the design, of course)

@jorroll
Copy link
Author

jorroll commented Jan 6, 2018

That's a great point. So as long as we're talking about "wouldn't it be great if." It'd probably be best if match clauses, by default, never re-arranged themselves, but there was an option to turn on re-arranging of some kind (something that more advanced users could discover).

@cheerfulstoic
Copy link
Contributor

Yeah, probably MATCH clauses shouldn't rearrange themselves. There's a trickier question, perhaps, in if sequential match calls should be combined into one MATCH clause, since there is actually a difference between specifying two MATCH clauses and putting the same clauses into a single MATCH and separating them with commas. See this SO question and it's answers for details:

https://stackoverflow.com/questions/34089692/what-does-a-comma-in-a-cypher-query-do

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants