New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(spanner/spannertest): Support generated columns #4742
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this change, I think it is a good step in the right direction, but there are a couple of things that need some looking into:
- I get two test failures when I run the tests locally. One seems to simply be a leftover
t.Fail()
call in one of your tests. The other is in an existing test. - I think it would be better to try to integrate the check which columns a generated column depends on with the existing expression evaluation. This reduces the chances of having differences between the two.
spanner/spannertest/db.go
Outdated
break | ||
} | ||
} | ||
// We skip this column if any of its dependencies are null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to do that?
spanner/spannertest/db.go
Outdated
row: row, | ||
} | ||
|
||
// TODO: We would need to do a topoligical sort on dependencies to ensure we can |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
// TODO: We would need to do a topoligical sort on dependencies to ensure we can | |
// TODO: We would need to do a topological sort on dependencies to ensure we can |
@@ -634,6 +683,10 @@ func (t *table) addColumn(cd spansql.ColumnDef, newTable bool) *status.Status { | |||
// TODO: what happens in this case? | |||
return status.Newf(codes.Unimplemented, "can't add NOT NULL columns to non-empty tables yet") | |||
} | |||
if cd.Generated != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: This check should be extended to check whether the table is empty or not. Adding a generated column to an empty tables should be safe in this case.
This PR should also add a check to dropColumn
to prevent a column that is a dependent of a generated column from being dropped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already checking the table is non empty, as for checking if its a dependency of a generated column do you still think its worth adding that as we talked about dropping all dependency calculations above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I missed that it was already checking it.
As for the dependency checking:
- I don't think you need any dependency checking here.
- We should preferably use the dependency checking in
dropColumn
to prevent columns that are used by a generated column from being dropped. But preferably that dependency checking can be done using the method that also does the evaluation, so that we don't have to duplicate that code. Would it for example be possible to use default values for the evaluation context when we only want to check the dependencies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that definitely sounds like the right approach, having a way to run the evaluator in with default values in some "complete" way (i.e. we'd have to check in an OR what is required to evaluate the RHS even if the LHS evaluates to true). It feels out of scope for me right now, given it will be a fair bit of extra work to get going well and I'm not overly confident with the evaluation code and typing. For now I'll just come up with the solution for adding columns as any other features should be able to be built on top without changes so doesn't feel wasteful
spanner/spannertest/db.go
Outdated
@@ -1090,3 +1147,40 @@ func parseAsDate(s string) (civil.Date, error) { return civil.ParseDate(s) } | |||
func parseAsTimestamp(s string) (time.Time, error) { | |||
return time.Parse("2006-01-02T15:04:05.999999999Z", s) | |||
} | |||
|
|||
// getIDsForGeneratedExpression returns a list of column names the expression depends on | |||
func getIDsForGeneratedExpression(e spansql.Expr) []spansql.ID { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is the best way to do this from a maintenance and completeness perspective. Instead I think it might be better to edit the evalContext.evalExpr
function to also return the id's that the expression needs to be evaluated. That would keep the logic for what is needed for a certain expression in one place, and reduce the chance of missing something, and also ensure that whenever we add support for new expressions, those are also automatically included in this array.
This implementation currently does not detect any dependencies on other columns if they are used in:
- Function calls
- Array expressions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I really didn't like this approach, it felt like it was prone to error and future failure but couldn't figure out something smarter. I think your suggestion sounds solid I can check into that, So adding a 3rd return argument to evalExpr, potentially returning and error and the list of dependencies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, something like that was my thought as well.
spanner/spannertest/db_test.go
Outdated
rows := slurp(t, iter) | ||
// 2 * A * B = 2 * 3 * 3 | ||
if rows[0][1].(int64) != 18 { | ||
t.Fatal("Generated value for C should have been 18") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use got/want format, so something like Generated value for C mismatch\n Got: %v\nWant: %v
spanner/spannertest/db_test.go
Outdated
t.Fatal("Generated value for C should be nil") | ||
} | ||
|
||
t.Fail() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this belongs here.
if rows[0][1] != nil { | ||
t.Fatal("Generated value for C should be nil") | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we also add tests for update and delete?
@@ -52,6 +52,14 @@ var functions = map[string]function{ | |||
return strings.HasPrefix(s, prefix), spansql.Type{Base: spansql.Bool}, nil | |||
}, | |||
}, | |||
"LOWER": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this change belongs in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can move it to a separate one, just smuggled it in here as I was testing with it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, ok. I would prefer to have it in a separate PR, unless it is actually used in one of the test cases (I don't think it is at this moment, right?)
@@ -361,6 +362,83 @@ func TestConcurrentReadInsert(t *testing.T) { | |||
} | |||
} | |||
|
|||
func TestGeneratedColumn(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we also add tests for:
- Adding a generated column to an existing table.
- Altering a generated column.
- Dropping a column that is a dependant of a generated column.
777d73d
to
1469ccd
Compare
Thanks for the quick review, I think that suggestion is great, I was hoping someone could offer a better suggestion! |
@olavloite I did a rewrite dropping all the column dependency stuff, pretty happy with it now, I don't see why that can't be added on top later when we want to support dropping columns that may be dependencies or resolving co-dependent generated columns in a reliable way. It felt like a tricky problem to solve though, needing default values and ensuring complete execution etc, a good rewrite of the db_eval stuff. |
spanner/spannertest/db.go
Outdated
if col.Generated != nil { | ||
res, err := ec.evalExpr(col.Generated) | ||
if err != nil { | ||
// We assume that if the expression ended up with a type error a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into the evaluation of the different expressions, and the problem that you are running into is that those evaluations do not take into account that some of the input parameters could be NULL. So I think that this is a reasonable workaround for this for now, but we should probably try to fix the evaluation of expressions that may return null values and then remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah exactly, its very much a best effort but dumb approach. Cheers
All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the ℹ️ Googlers: Go here for more info. |
@googlebot I consent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@psytale The build error is caused by the fact that the implementation uses Feel free to let me know if you want me to look into it for you. |
* No longer returning code.InvalidArgument when calling db_eval.evalFunc but that seems consistent with the rest of the module, no tests failed so I assume its not behaviour relied upon, we could check for the error type instead
Think I got something going with it but feel free to push any changes if you want to do it in a different way, I'm not overly confident with go. I also fixed (and added a test case) for functions and null columns, I wasn't happy with removing the codes.InvalidArgument but felt it was okay, shout if you think of a better way to do it (or feel free to push commits) |
@psytale I decided to remove the extra error wrapping and checking, and instead add NULL checks to the most used evaluations. That should ensure that generated columns that evaluate to NULL will (in most cases) work as intended. The cases where they don't are not really related to the generated columns implementation, but to the fact that those evaluations would fail anyways before this change. So if you were to for example do a @hengfengli Would you mind taking a quick look at this as well? |
spanner/spannertest/db.go
Outdated
// generated columns with fresh data. | ||
pk := r[:t.pkCols] | ||
rowNum, found := t.rowForPK(pk) | ||
// this should never fail as the row was just inserted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this
-> This
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the missing period at the end ... inserted.
spanner/spannertest/db.go
Outdated
row: row, | ||
} | ||
|
||
// TODO: We would need to do a topological sort on dependencies (i.e. what other columns the expression references) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have a convention to limit the line length of comments to 80 characters.
spanner/spannertest/db.go
Outdated
@@ -643,6 +677,9 @@ func (t *table) addColumn(cd spansql.ColumnDef, newTable bool) *status.Status { | |||
Name: cd.Name, | |||
Type: cd.Type, | |||
NotNull: cd.NotNull, | |||
// TODO: We should figure out what columns the Generator expression relies on and check validate it at this time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please well format the comment.
t.Errorf("Read rows failed: %v", err) | ||
} | ||
|
||
// Great writer has nil because NumSongs is nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment should be Poor writer
, right?
@olavloite I saw some of your comments in the tests are not resolved yet. Do you think they should be fixed before merging? |
I've added the missing error message and missing delete test. My comment on tests for dropping and altering columns is void, as we have removed the check altogether and added it to the list of limitations in the README. So as far as I'm concerned, this is now good to be merged once CI is green. |
No description provided.