Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NumberFormatException when parsing OID values exceeding 32-bit signed integer range #1068

Open
dallinhuff opened this issue Apr 9, 2024 · 5 comments

Comments

@dallinhuff
Copy link

Description:

Currently, Typer uses Integer.parseInt() for parsing PostgreSQL OIDs. However, PostgreSQL OIDs are 32-bit unsigned integers, and using Integer.parseInt() can lead to a NumberFormatException for values that exceed the range of a 32-bit signed integer.

Stack Trace

skunk.exception.DecodeException: 
🔥  
🔥  DecodeException
🔥  
🔥    Problem: Decoding error.
🔥     Detail: This query's decoder was unable to decode a row of data.
🔥  
🔥  The statement under consideration was defined
🔥    at «skunk internal»:0
🔥  
🔥    SELECT oid typid, typname, typarray, typrelid
🔥    FROM   pg_type
🔥    WHERE typnamespace IN (
🔥      SELECT oid
🔥      FROM   pg_namespace
🔥      WHERE nspname = ANY(current_schemas(true))
🔥    )
🔥  
🔥  The row in question returned the following values (truncated to 15 chars).
🔥  
🔥    typid     oid   ->  4294967108    ├── java.lang.NumberFormatException (see below)
🔥    typname   name  ->  pg_aggregate                                                 
🔥    typarray  oid   ->  0                                                            
🔥    typrelid  oid   ->  4294967108                                                   
🔥  
🔥  The decoder threw the following exception:
🔥  
🔥    java.lang.NumberFormatException: For input string: "4294967108"
🔥      java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:67)
🔥      java.base/java.lang.Integer.parseInt(Integer.java:662)
🔥      java.base/java.lang.Integer.parseInt(Integer.java:778)
🔥      scala.collection.StringOps$.toInt$extension(StringOps.scala:910)
🔥      skunk.util.Typer$ProtocolOps.$init$$$anonfun$4(Typer.scala:180)
...

Expected behavior:

Typer should parse PostgreSQL OIDs correctly, including values within the 32-bit unsigned integer range.

Actual behavior:

When encountering a PostgreSQL OID value that exceeds the range of a 32-bit signed integer, Integer.parseInt() throws a NumberFormatException, as it cannot handle values outside the signed integer range.

Proposed solution:

Instead of using Integer.parseInt(), use a method that can handle 32-bit unsigned integers, such as Long.parseLong() or explicitly convert the value to an unsigned integer representation.

@benhutchison
Copy link
Member

I'm also hitting this issue.

Postgres OID docs specify "The oid type is currently implemented as an unsigned four-byte integer. "

The relevant Skunk code is here. Oid's sadly isn't representable by an Int, unless we added an encoding that oids above max int were stored as negative ints?

val oid: Codec[Int] = Codec.simple(_.toString, _.toInt.asRight, Type.oid)

Actually, I hit this when trying out @rolang's Dumbo, a (hopefully faster) alternative to Flyway.

AFAICT, because it makes use of Typer.Strategy.SearchPath which requires querying database metadata, and so hits the oid codec problem.

@benhutchison
Copy link
Member

I did more investigation and thinking on how to solve this.

  1. CockroachDB apparently uses oids in the upper half of the unsigned range. Its also an awesome database that already works with Skunk, and I'd really like to support it.

  2. The surface area within Skunk of oid: Int is huge. I tried changing to oid: Long but the compile errors just kept coming in waves.

The saving grace is, we have enough bits in an Int to store an unsigned Int without losing info. We'll just have to accept the caveat, that for the rare case of oids > Int.MaxInt, they will be represented in Scala as a negative Int:

scala> (Int.MaxValue.toLong + 1).toInt
val res2: Int = -2147483648

With this encoding, the fix can be contained entire within the 1-line oid codec.

If maintainers are on board with this solution (which will be backwards compatible with all Postgres oids seen to date, that are less than Int.MaxInt), I'll send a PR?

@mpilquist
Copy link
Member

@benhutchison This seems reasonable to me. Happy to review a PR.

@rolang
Copy link
Contributor

rolang commented May 12, 2024

Actually, I hit this when trying out @rolang's Dumbo, a (hopefully faster) alternative to Flyway.

Offtopic, but since it was mentioned, from some of my testing locally Flyway was always way slower with CockroachDB. One of the results as example that I was running:

// flyway: 75 migrations executed in 58265ms
// dumbo: 75 migrations executed in 22780ms

Same migration scripts with the same dockerized single node instance and empty database with latest Flyway as of now 10.12.0. But don't take it as proper benchmarking 😅.

@benhutchison
Copy link
Member

Actually Flyway's performance when a database doesn't require any migration was what drove me nuts and led me to stumble on Dumbo. I have a CLI tool that extracts reports from a database, but includes a shared kernel that also checks migrations.

With Dumbo takes 10868ms to extract a report , with Flyway 19887ms

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants