Extra line feed in JSON file creates an extra row in Hive (and count is incorrect) #207

Open
bjaggi opened this issue May 17, 2018 · 1 comment


bjaggi commented May 17, 2018

Hello,
I am using your serde for nested JSON mapping and it works great.

We have a scenario where records are delimited by two line feeds (it seems Hive only supports a single \n as the delimiter, one more reason to go with a custom serde).

Sample input file:

{ "id": "1",

"id":"2"
}

When I run select * from hive_table or count(*), Hive counts the extra line feed as a row. The expected count is 2, but Hive shows 3.

I tried changing some code in this file:

Link_To_JSONObject.java_Line318

New logic: split the text on the \n delimiter, then drop the lines that are empty after trimming.
This works fine in the test case, but not when I use it in Hive. Any suggestions?
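
For reference, the split-and-filter logic described above could look roughly like the following standalone sketch. It is not the actual change made to JSONObject.java: the class name `BlankLineFilter`, the method `nonEmptyLines`, and the sample input (one JSON record per line, with a blank line between records) are all hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the serde.
public class BlankLineFilter {

    // Splits the text on '\n' and keeps only the lines that are
    // non-empty after trimming surrounding whitespace.
    public static List<String> nonEmptyLines(String text) {
        List<String> records = new ArrayList<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                records.add(trimmed);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Assumed input: two records separated by two line feeds.
        String input = "{\"id\":\"1\"}\n\n{\"id\":\"2\"}\n";
        List<String> records = nonEmptyLines(input);
        System.out.println(records.size()); // prints 2
    }
}
```

This only shows the filtering applied to a block of text held in memory, which matches the test-case situation. It does not show how rows reach the serde when the table is queried from Hive.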

Owner

rcongiu commented Aug 11, 2020

Hmm, do you have the complete JSON you're using? Like an actual file? The one you posted should not work at all, since the serde only supports one JSON record per line, with no \n inside a record.
