Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add precision parameter to round function #13965

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

proddata
Copy link
Member

@proddata proddata commented Apr 14, 2023

Summary of the changes / Why this improves CrateDB

Enhanced the round() scalar function to allow a second input parameter for rounding precision.
When it is specified, the result's type is double precision. Notice that
round(number) and round(number, 0) return different result types.

    cr> select round(42.21,1) AS round;
    +-------+
    | round |
    +-------+
    |  42.2 |
    +-------+
    SELECT 1 row in set (... sec)

Implementation deviates from PostgreSQL. PostgreSQL always returns NUMERIC for round().
Is coherent with trunc() scalar function.

closes #13964

Checklist

  • Added an entry in CHANGES.txt for user facing changes
  • Updated documentation & sql_features table for user facing changes
  • Touched code is covered by tests
  • CLA is signed
  • This does not contain breaking changes, or if it does:
    • It is released within a major release
    • It is recorded in CHANGES.txt
    • It was marked as deprecated in an earlier release if possible
    • You've thought about the consequences and other components are adapted
      (E.g. AdminUI)

@proddata proddata force-pushed the proddata/round-precision branch 2 times, most recently from e6c7ffc to b609549 Compare April 14, 2023 11:46
@proddata proddata marked this pull request as ready for review April 14, 2023 12:15
@seut
Copy link
Member

seut commented Apr 14, 2023

Implementation deviates from PostgreSQL. PostgreSQL always returns NUMERIC for round().

What is the reason for this deviation?

@proddata
Copy link
Member Author

proddata commented Apr 14, 2023

What is the reason for this deviation?

In PostgreSQL: round and trunc generally returns a double precision without the precision parameter and a numeric with the precision parameter.
Also the precision parameter is only valid for numeric and integer input types

In CrateDB:

  • round without the precision parameter returns integer or bigint
  • trunc without the precision parameter returns integer or bigint
  • trunc with the precision parameter returns double and allows double as input

PostgreSQL15

> SELECT
pg_typeof(round(1::INT)),
pg_typeof(round(1.0::FLOAT4)),
pg_typeof(round(1.0::FLOAT8)),
pg_typeof(round(1::INT,0)),
pg_typeof(round(1.1::NUMERIC,0));
    pg_typeof     |    pg_typeof     |    pg_typeof     | pg_typeof | pg_typeof 
------------------+------------------+------------------+-----------+-----------
 double precision | double precision | double precision | numeric   | numeric
(1 row)

-----------

> SELECT
pg_typeof(trunc(1::INT)),
pg_typeof(trunc(1.0::FLOAT4)),
pg_typeof(trunc(1.0::FLOAT8)),
pg_typeof(trunc(1::INT,0)),
pg_typeof(trunc(1.1::NUMERIC,0));
    pg_typeof     |    pg_typeof     |    pg_typeof     | pg_typeof | pg_typeof 
------------------+------------------+------------------+-----------+-----------
 double precision | double precision | double precision | numeric   | numeric
(1 row)

-----------

> SELECT pg_typeof(round(1.0::FLOAT8,0));
No function matches the given name and argument types. You might need to add explicit type casts.


-----------

> SELECT pg_typeof(round(1.0::FLOAT8::NUMERIC,0));
 pg_typeof 
-----------
 numeric
(1 row)

CrateDB 5.4.0 (PR)

cr> SELECT pg_typeof(round(1::INT)),
    pg_typeof(round(1.0::FLOAT4)),
    pg_typeof(round(1.0::FLOAT8)),
    pg_typeof(round(1::INT,0)),
    pg_typeof(round(1.1::NUMERIC,0));
+-----------+-----------+----------+--------------------+--------------------+
| 'integer' | 'integer' | 'bigint' | 'double precision' | 'double precision' |
+-----------+-----------+----------+--------------------+--------------------+
| integer   | integer   | bigint   | double precision   | double precision   |
+-----------+-----------+----------+--------------------+--------------------+
SELECT 1 row in set (0.041 sec)


cr> SELECT
    pg_typeof(trunc(1::INT)),
    pg_typeof(trunc(1.0::FLOAT4)),
    pg_typeof(trunc(1.0::FLOAT8)),
    pg_typeof(trunc(1::INT,0)),
    pg_typeof(trunc(1.1::NUMERIC,0));
+-----------+-----------+----------+--------------------+--------------------+
| 'integer' | 'integer' | 'bigint' | 'double precision' | 'double precision' |
+-----------+-----------+----------+--------------------+--------------------+
| integer   | integer   | bigint   | double precision   | double precision   |
+-----------+-----------+----------+--------------------+--------------------+
SELECT 1 row in set (0.005 sec)

docs/appendices/release-notes/unreleased.rst Outdated Show resolved Hide resolved
docs/general/builtins/scalar-functions.rst Outdated Show resolved Hide resolved
}
double val = n.doubleValue();
int numDecimals = nd.intValue();
return BigDecimal.valueOf(val).setScale(numDecimals, RoundingMode.HALF_UP).doubleValue();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if it's not overflowing, (round(<val> * 10 ^ numDecimals)) / <val> * 10 ^ numDecimals) would be faster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am on the fence if this might not better return a NUMERIC.

I would think that the primary use cases for this function is some sort of data presentation / reporting use case, where one would want to limit the amount digits in the fractional part. Using a floating point value, doesn't really guarantee this.

wdyt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if it's not overflowing, (round( * 10 ^ numDecimals)) / * 10 ^ numDecimals) would be faster.

There is both the problem of overflowing and precision loss as e.g. pointed out here. While one could check overflow and precision to fit within the max safe integer bounds of a double, the checks become cumbersome and harder to maintain. As this would be primarily be a perform optimisation and I'd expect this function to be mostly used for presentation purposes after calculation on a final result set, I think it isn't worth it. Therefore I would suggest to stay with the BigDecimal approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I slightly agree on using NUMERIC instead, any objections? @mfussenegger

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if it's not overflowing, (round( * 10 ^ numDecimals)) / * 10 ^ numDecimals) would be faster.

There is both the problem of overflowing and precision loss as e.g. pointed out here. While one could check overflow and precision to fit within the max safe integer bounds of a double, the checks become cumbersome and harder to maintain. As this would be primarily be a perform optimisation and I'd expect this function to be mostly used for presentation purposes after calculation on a final result set, I think it isn't worth it. Therefore I would suggest to stay with the BigDecimal approach.

Is detecting a possible overflow really that complex?
I do not completely follow you argumentation about the actual use-cases of this function as we never know for sure how users are using it.
But I'm also fine on using your approach now, we can follow up here anytime without a user-facing change

@proddata proddata force-pushed the proddata/round-precision branch 3 times, most recently from 46fa828 to d3f4479 Compare September 10, 2023 08:28
}
double val = n.doubleValue();
int numDecimals = nd.intValue();
return BigDecimal.valueOf(val).setScale(numDecimals, RoundingMode.HALF_UP).doubleValue();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I slightly agree on using NUMERIC instead, any objections? @mfussenegger

}
double val = n.doubleValue();
int numDecimals = nd.intValue();
return BigDecimal.valueOf(val).setScale(numDecimals, RoundingMode.HALF_UP).doubleValue();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if it's not overflowing, (round( * 10 ^ numDecimals)) / * 10 ^ numDecimals) would be faster.

There is both the problem of overflowing and precision loss as e.g. pointed out here. While one could check overflow and precision to fit within the max safe integer bounds of a double, the checks become cumbersome and harder to maintain. As this would be primarily be a perform optimisation and I'd expect this function to be mostly used for presentation purposes after calculation on a final result set, I think it isn't worth it. Therefore I would suggest to stay with the BigDecimal approach.

Is detecting a possible overflow really that complex?
I do not completely follow you argumentation about the actual use-cases of this function as we never know for sure how users are using it.
But I'm also fine on using your approach now, we can follow up here anytime without a user-facing change

@surister
Copy link
Contributor

surister commented Feb 27, 2024

This would be needed for some tests in Tableau

@proddata
Copy link
Member Author

This would be needed for some tests in Tableau

I think the only real open point is the question about the return type. Otherwise this PR is sitting ready for a while now and could be merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement round(v numeric, s int)
5 participants