
Series.to_json(orient='records') does not return records-based JSON #2211

Open
klenium opened this issue Dec 22, 2021 · 3 comments

Comments


klenium commented Dec 22, 2021

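import databricks.koalas as ks  # assumed: `ks` is the usual Koalas alias

df = ks.DataFrame([['a', 'b'], ['c', 'd']], columns=['col 1', 'col 2'])

def add_json(row):
  # Serialize the whole row and store the JSON string in a new column.
  row['serialized_row_content'] = row.to_json()
  return row

df = df.apply(add_json, axis=1)

print(df)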

  col 1 col 2     serialized_row_content
0     a     b  {"col 1":"a","col 2":"b"}
1     c     d  {"col 1":"c","col 2":"d"}

That works as expected. The documentation says:

orient str, default ‘records’
It should be always ‘records’ for now.

So if I write row.to_json(orient='records') instead of row.to_json(), the output should be the same, since 'records' is documented as the default. But it's not:

  col 1 col 2 serialized_row_content
0     a     b              ["a","b"]
1     c     d              ["c","d"]

That looks like the values format from pandas rather than the records format.

@klenium klenium changed the title Series.to_json(orient='records') does not return JSON Series.to_json(orient='records') does not return records-based JSON Dec 22, 2021

klenium commented Dec 22, 2021

Very interesting, I don't see the reason for this behavior in its source code. :)


klenium commented Dec 22, 2021

row['type'] = str(type(row)) gives <class 'pandas.core.series.Series'>.
Well, that's unexpected. Why is a pandas Series used there?
And why wouldn't it return records-based JSON anyway?
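
Since the row turns out to be a plain pandas Series, both outputs above can probably be reproduced without Koalas at all. The sketch below assumes only pandas; the variable names and the one-row DataFrame workaround at the end are illustrative, not something taken from the Koalas source.

```python
import json

import pandas as pd

# One row of the frame from the report, as a plain pandas Series.
row = pd.Series({"col 1": "a", "col 2": "b"})

# pandas defaults a Series to orient="index": labels map to values.
default_json = row.to_json()

# For a Series, orient="records" drops the labels and keeps only the values.
records_json = row.to_json(orient="records")

# Hypothetical workaround: turn the row into a one-row DataFrame first,
# so that orient="records" has column names to emit.
frame_json = row.to_frame().T.to_json(orient="records")

print(json.loads(default_json))
print(json.loads(records_json))
print(json.loads(frame_json))
```

If this matches, the difference comes from the pandas default for a Series (orient='index') not being the same as the 'records' default documented for Koalas.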


klenium commented Apr 10, 2022

The same applies to Pandas on Spark. If I follow the documentation and call to_json('records'), the output is None, so I get errors.
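
One possible explanation for the None, sketched here with plain pandas: the first positional parameter of to_json is an output path, so a bare 'records' is taken as a file name rather than an orient. That Pandas on Spark behaves the same way (its first parameter being a path) is an assumption, not something verified in this thread.

```python
import json
import os
import tempfile

import pandas as pd

row = pd.Series({"col 1": "a", "col 2": "b"})

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "records")
    # Passed positionally, the string is treated as an output path,
    # so the JSON is written to that file and nothing is returned.
    result = row.to_json(target)
    wrote_file = os.path.exists(target)

# Passing the orient by keyword returns the JSON string instead.
as_string = row.to_json(orient="records")

print(result, wrote_file, as_string)
```

If that is what happens, calling to_json(orient='records') with the keyword should return a string instead of None.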
