Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to convert numbers between double and decimal in the same way as node.js, python3, ruby, rust or so #172

Open
kubo opened this issue Apr 9, 2022 · 5 comments

Comments

@kubo
Copy link

kubo commented Apr 9, 2022

Request in short

Could you add an option to use dtoa.c by David M. Gay or similar code to convert values between DPI_NATIVE_TYPE_DOUBLE and DPI_ORACLE_TYPE_NUMBER.

dtoa.c is used by ruby and python3.

Though I don't know whether node.js uses dtoa.c or not, ECMAScript definition suggests to refer dtoa.c as follows.

Implementers of ECMAScript may find useful the paper and code written by David M. Gay for binary-to-decimal conversion of floating-point numbers:

Gay, David M. Correctly Rounded Binary-Decimal and Decimal-Binary Conversions. Numerical Analysis, Manuscript 90-10. AT&T Bell Laboratories (Murray Hill, New Jersey). 30 November 1990. Available as
http://ampl.com/REFS/abstracts.html#rounding. Associated code available as
http://netlib.sandia.gov/fp/dtoa.c and as
http://netlib.sandia.gov/fp/g_fmt.c and may also be found at the various netlib mirror sites.

Rust uses same algorithm with dtoa.c as a fallback of Grisu algorithm.

I think it can be included in ODPI-C because MySQL includes code based on dtoa.c.

Background

As far as I checked, node.js, python3, ruby and rust seem to convert numbers between double and decimal in the same way.

  • Different double values are converted to different string values even when the difference is only one bit.
  • When a double value is converted to a string and the string is converted to a double, the original and converted double values are exactly same.

Here is a python3 code to check above. Any value between 2.3 and 2.30000000000001 is converted to string differently. No round trip errors are displayed. I made similar codes for node.js, ruby, rust and go. They printed exactly same results.

from struct import pack, unpack

dstart = 2.3
dend = 2.30000000000001
istart = unpack('Q', pack('d', dstart))[0]
iend = unpack('Q', pack('d', dend))[0]

# increment bits in floating point number one by one from dstart to dend
for i in range(istart, iend + 1):
    dval = unpack('d', pack('Q', i))[0]
    sval = str(dval)
    round_trip_dval = float(sval)
    if dval != round_trip_dval:
        print("round trip error {} != {}".format(dval, round_trip_dval), file=sys.stderr)
    print("0x{:x},{}".format(i, sval))

I think that this issue is resolved when decimal to binary conversion is exactly same with ODPI-C and languages using ODPI-C.

A floating point number 2.3 in node.js, python3, ruby and rust consists of bits 100000000000010011001100110011001100110011001100110011001100110.
It is inserted to an Oracle number column as 2.3. But when the column is fetched using DPI_NATIVE_TYPE_DOUBLE, the fetched value consists of bits 100000000000010011001100110011001100110011001100110011001100111 (the last one bit is set.) It is displayed as 2.3000000000000003 because different double values are converted to different string values in the languages. If ODPI-C converts Oracle number, represented as decimal number, to floating point number as the languages do, it is displayed as 2.3.

FYI

Ruby includes dtoa.c from util.c as follows to rename public function names.

In https://github.com/ruby/ruby/blob/v3_1_1/util.c#L610-L616:

#undef strtod
#define strtod ruby_strtod
#undef dtoa
#define dtoa ruby_dtoa
#undef hdtoa
#define hdtoa ruby_hdtoa
#include "missing/dtoa.c"

ruby_strtod is used to convert string to double.
ruby_dtoa is used to convert double to string.

@tgulacsi
Copy link

tgulacsi commented Apr 9, 2022

It's not zhat easy: insert 191/200 in Oracle DB into a num NUMBER column.
That's 0.955.
Then SELECT num-0.955 FROM table.
That will be 10^-39.

I don't think that hiding this in a driver would do any good...

@kubo
Copy link
Author

kubo commented Apr 9, 2022

@tgulacsi
Could you post details? What programming language or tool did you use? How did you insert and fetch numbers?
I cannot reproduce it by sqlplus and rust-oracle.

I think that the inserted value is not 0.955.
If it is exactly 0.955, SELECT dump(num) FROM table is Typ=2 Len=3: 192,96,51.
I guess that dump(num) is Typ=2 Len=21: 192,96,51,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,?? (0.95500000000000000000000000000000000000??) in the your table because num-0.955 is 10^-39.

@tgulacsi
Copy link

You're right, it depends on the column's data type - NUMBER works as expected,
but if you specify a precision, it degrades to using floating-point:

CREATE TABLE w_gt_test_n (n NUMBER, n38 NUMBER(38,38)) TABLESPACE DATA;
INSERT INTO W_gt_test_n VALUES (0.955, 0.955);
INSERT INTO W_gt_test_n VALUES (191/200, 191/200);
SELECT n, DUMP(n), n38, DUMP(n38), DUMP(n-n38) FROM w_gt_test_n;
DROP TABLE w_gt_test_n;

PL/SQL Developer (Delphi):

N	DUMP(N)	N38	DUMP(N38)	DUMP(N-N38)
0,955	Typ=2 Len=3: 192,96,51	0,95500000000000013	Typ=2 Len=3: 192,96,51	Typ=2 Len=1: 128
0,955	Typ=2 Len=3: 192,96,51	0,95500000000000013	Typ=2 Len=3: 192,96,51	Typ=2 Len=1: 128

SQL*Plus:

         N
----------
DUMP(N)
--------------------------------------------------------------------------------
       N38
----------
DUMP(N38)
--------------------------------------------------------------------------------
DUMP(N-N38)
--------------------------------------------------------------------------------
      ,955
Typ=2 Len=3: 192,96,51
      ,955
Typ=2 Len=3: 192,96,51
Typ=2 Len=1: 128

      ,955
Typ=2 Len=3: 192,96,51
      ,955
Typ=2 Len=3: 192,96,51
Typ=2 Len=1: 128

So yes, I've also fallen: it's only the displaying library's error.

@kubo
Copy link
Author

kubo commented Apr 11, 2022

I think that the feature is useful for node-oracledb. But it isn't for cx_Oracle (if I don't misunderstand this) and rust-oracle because they use DPI_NATIVE_TYPE_BYTES for numbers and access numbers via strings.
If it isn't useful for drivers other than what I checked, feel free to close it. If only one driver needs it, it is almost dead code.

Well, I thought that cx_Python also used DPI_NATIVE_TYPE_DOUBLE for numbers as node-oracledb did and it was worth to add the feature. But cx_Python have used DPI_NATIVE_TYPE_BYTES for numbers since this commit. I'm not sure now whether it is worth or not.

@cjbj
Copy link
Member

cjbj commented Apr 11, 2022

We'll definitely evaluate it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants