hashAggregates: group by sum decimal issue #1554

Open
ssathi opened this issue Dec 8, 2020 · 1 comment


ssathi commented Dec 8, 2020

Aggregating negative decimal values produces incorrect results.

table:

| charge_item_name (string) | charge_type (string) | charge_amount (decimal(16,2)) |
| --- | --- | --- |
| WHT | tax | -300.00 |
| WHT | tax | -300.00 |
| WHT | tax | -300.00 |
| WHT | tax | -240.00 |
| WHT | tax | -300.00 |
| WHT | tax | -300.00 |
| WHT | tax | -300.00 |
| WHT | tax | -300.00 |
| WHT | tax | -300.00 |
| WHT | tax | -300.00 |
| WHT | tax | -300.00 |
| WHT | tax | -200.00 |
| WHT | tax | -300.00 |
| WHT | tax | -300.00 |

Spark SQL query:

SELECT itemchargingtype, itemdescription, count(*) AS `count`, sum(itemamount) AS `sum`
FROM `transaction_items`
WHERE itemdescription = 'WHT'
GROUP BY itemdescription, itemchargingtype

result:

| charge_type | charge_item_name | count | sum |
| --- | --- | --- | --- |
| tax | WHT | 14 | 1,202.88 |

With the following settings applied, the same query returns the correct values:

snappydata.sql.hashAggregateSize=-1
snappydata.sql.useOptimizedHashAggregateForSingleKey=false

result:

| charge_type | charge_item_name | count | sum |
| --- | --- | --- | --- |
| tax | WHT | 14 | -4040.00 |
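
As a sanity check on the expected value, the 14 charge_amount values listed in the table above can be summed outside the database. A minimal Python sketch using the standard `decimal` module (the variable names are illustrative and not part of the original report):

```python
from decimal import Decimal

# charge_amount values copied from the table in this report, in row order.
charge_amounts = [
    "-300.00", "-300.00", "-300.00", "-240.00", "-300.00", "-300.00", "-300.00",
    "-300.00", "-300.00", "-300.00", "-300.00", "-200.00", "-300.00", "-300.00",
]

# Decimal keeps the two-digit scale exactly, like decimal(16,2).
values = [Decimal(a) for a in charge_amounts]

row_count = len(values)
total = sum(values, Decimal("0.00"))

print(row_count, total)
```

This prints `14 -4040.00`, which agrees with the second result and shows that the positive sum 1,202.88 cannot be correct for these rows.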

sumwale commented Dec 8, 2020

Thanks for reporting.

I tested this, and it looks like a bug in the new BufferHashMap-based implementation, which reduces memory overhead for large aggregates and DISTINCT. For now, you can switch to the older implementation with "set snappydata.sql.optimizedHashAggregate=false". It is as fast as the newer one (and in many cases faster), although it may fail for very large aggregation/DISTINCT results. If this works for your use cases, it is much better than disabling hash aggregation completely.
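
A minimal sketch of applying this workaround from Python before re-running the query from the report. It assumes `session` is a SparkSession-like object exposing `sql(...)`; the helper name `rerun_with_workaround` is hypothetical, and whether the property takes effect when set per session this way in a given SnappyData deployment is an assumption to verify.

```python
# Query text copied from the original report.
QUERY = """
SELECT itemchargingtype, itemdescription, count(*) AS `count`, sum(itemamount) AS `sum`
FROM `transaction_items`
WHERE itemdescription = 'WHT'
GROUP BY itemdescription, itemchargingtype
"""


def rerun_with_workaround(session, query=QUERY):
    """Switch to the older hash aggregate implementation, then run the query.

    `session` is assumed to expose a SparkSession-style `sql(text)` method.
    """
    # Property name taken from the comment above.
    session.sql("SET snappydata.sql.optimizedHashAggregate=false")
    return session.sql(query)
```

With this setting in place, the expected output for the sample rows is a count of 14 and a sum of -4040.00.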
