Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HHH-16531 HHH-18004 ColumnDefinitions should respect @Column(columnDefinition) #8137

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

quaff
Copy link
Contributor

@quaff quaff commented Apr 10, 2024

@quaff quaff marked this pull request as draft April 10, 2024 05:33
@quaff quaff force-pushed the HHH-16531 branch 3 times, most recently from 7d4ff56 to 4eb1925 Compare April 10, 2024 08:08
@quaff quaff marked this pull request as ready for review April 10, 2024 08:44
String required = new StringTokenizer( normalize( column.getTypeName() ), "( " ).nextToken();
String actual = normalize( columnInformation.getTypeName() );
return actual.equals( required );
}
Copy link
Member

@gavinking gavinking Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the purpose of this code, and naively it doesn't look like it should be necessary.

It should already be taken care of by the existing:

normalize( stripArgs( column.getSqlType( metadata ) ) ).equals( normalize( columnInformation.getTypeName() ) );

Since column.getSqlType( metadata ) returns the sqlTypeName if there is one, and since stripArgs(x) does the same thing as new StringTokenizer(x , "( " ).nextToken().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should already be taken care of by the existing:

normalize( stripArgs( column.getSqlType( metadata ) ) ).equals( normalize( columnInformation.getTypeName() ) );
  1. It should return false immediately instead of fallback if @Column(columnDefinition) present.
  2. stripArgs() doesn't strip not null.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if that's the case then better fixes would be:

  1. Add else if (column.getSqlTypeName()!=null) { return false; } which is much simpler.
  2. Change the stripArgs() to handle the additional cases you're concerned about.

But the thing is I just don't see how this can be made robust. Your implementation reduces character varying (10) not null to just character, which is a different type. So that's no good. Now, sure, you could look for not null explicitly and remove it, but there's many more complicated things apart from not null that this freeform columnDefinition could contain.

I mean, if you can think of something, I'm all ears, but just "take the first token" definitely isn't good enough.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added dedicated method extractTypeName(String typeExpression) to handle this.

@quaff quaff marked this pull request as draft April 10, 2024 09:50
@gavinking
Copy link
Member

OK, this LGTM now. Thanks!

@quaff
Copy link
Contributor Author

quaff commented Apr 10, 2024

OK, this LGTM now. Thanks!

Updated, fix that column not updated if columnDefinition added and differs to inferred type but the inferred type is equals to actual type in database, please review again.

boolean typesMatch = dialect.equivalentTypes( column.getSqlTypeCode(metadata), columnInformation.getTypeCode() )
|| normalize( stripArgs( column.getSqlType( metadata ) ) ).equals( normalize( columnInformation.getTypeName() ) );
|| sqlTypeNameEquals;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean after this latest change you are now optimizing for the uncommon case, putting all the string manipulation up front. I have no idea if it's a significant performance hit, most probably not, but I did prefer the previous version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean after this latest change you are now optimizing for the uncommon case, putting all the string manipulation up front. I have no idea if it's a significant performance hit, most probably not, but I did prefer the previous version.

Previous version doesn’t update existing data type if columnDefinition added, I will add tests to cover it tomorrow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous version doesn’t update existing data type if columnDefinition added

Ok, so let's take a half-step back here. Consider:

@ColumnDefinition(columnDefinition="character varying(255) not null default '-'")
String name;

Schema Update, as it's written today, simply can't do anything to help with things like this. The only things it knows how do deal with are column types and lengths. And to do any more that, it would need to be able to parse and understand the native DDL of eleven different databases.

I think you're thinking that people use columnDefinition to override just the column type and length. And perhaps some poor misguided souls do indeed use it that way. But that's something we tell them not to do.

So I don't think we should go beyond checking that the SQL column type and length is sensible. We can't validate that the whole columnDefinition is respected, because JDBC metadata doesn't give us access to that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long version;

First run, It's expected that version column added as bigint.

Now I want it be integer but keep java type as Long.

@ColumnDefinition(columnDefinition="integer")
Long version;

Second run, It detect Long equals to bigint, and schema will not be updated, that's the behavior of previous version, I think it's a bug, WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but this is not a way of using columnDefinition that we encourage or even really support.

For this we tell you to use @JdbcTypeCode(INTEGER), so that we can:

  • generate DDL that is portable across different databases,
  • correctly add not null constraints and sometimes other things, and
  • perform any appropriate type conversions on the Java side, and use the correct methods when writing to/from JDBC (not a problem in this particular very trivial case, but important in general).

And the issue is that since columnDefinition is not a SQL type name, but a fragment of native DDL from one ~10 different dialects of SQL, even just recognizing that all we have here in your case is a type name is a nontrivial problem.

I mean, really it's a failing of JPA itself, which I guess should let @Column specify a columnType from the range of well-defined JDBC types, similar to what our @JdbcTypeCode does. Back in the day this was not added to the spec because TopLink didn't support it, but perhaps EclipseLink does now (I have no idea).

Now, what could we do? Well, in principle, I suppose, we could add support to the Dialects for parsing and interpreting columnDefinitions, something sorta similar to what you proposed here earlier, but obviously it would need to be much more robust. And then we could build that into Column itself, so that your code:

@Column(columnDefinition="integer")
Long version;

would be interpreted as if you had written:

@JdbcTypeCode(INTEGER)
@Column(columnDefinition="integer")
Long version;

and:

@Column(columnDefinition="decimal(10,2) not null")
Long version;

would be interpreted as if you had written:

@JdbcTypeCode(DECIMAL)
@Column(columnDefinition="decimal(10,2) not null", 
        nullable=false, precision=10, scale=2)
Long version;

So how hard would it be to implement this? Well, not necessarily very hard I suppose. There are some type names which are a bit nontrivial to parse, e.g. timestamp(6) with time zone, char(128) for bit data, etc, and lots of dialect-specific variation in naming, e.g. timestamptz, varchar2, etc, but it's not difficult work, just a bit of tedium involved in going through all the databases and compiling all the information.

I think it would be a very nice thing to have, to be honest, but I'm not motivated to lose a week of my life to this.

What I'm saying is: if you want to do this, do it properly, instead of hacking in something that sorta works in only the most trivial cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you, but columnDefinition is JPA specification and @JdbcTypeCode is Hibernate specific that most people like me doesn't even know.

I mean after this latest change you are now optimizing for the uncommon case, putting all the string manipulation up front. I have no idea if it's a significant performance hit.

Please review latest commit, It should avoid string manipulation for common case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm saying is that I don't want to do anything here unless we do it properly. In particular I don't want to make this code more complex than it already is just to handle 1% of possible cases. That's just technical debt.

If you really believe that this is important, then it should be done according to something like the approach I've outlined.

@quaff
Copy link
Contributor Author

quaff commented Apr 16, 2024

JDBC type names and codes are chaos, It wasted my several days, this may be my final attempt, I chose another safer way that differentiate updating from validating schema which should ignore user-specified definition, It will minimize impact.

Meanwhile I added a dedicate field columnDefinition instead of existing sqlType to store user-specified definition, It's not required(not confirmed) but worthwhile to not mix user-specified and inferred one.

WDYT @gavinking

@quaff quaff changed the title HHH-16531 ColumnDefinitions should respect @Column(columnDefinition) HHH-16531 HHH-18004 ColumnDefinitions should respect @Column(columnDefinition) Apr 24, 2024
@quaff
Copy link
Contributor Author

quaff commented Apr 24, 2024

@quaff quaff marked this pull request as ready for review April 24, 2024 03:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants