Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Klaxon with kotlinx-serialization #603

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

devcrocod
Copy link
Contributor

This is the first step in migrating from klaxon to kotlinx-serialization #312

  • Removed klaxon dependency
  • Added kotlinx-serialization
  • Klaxon objects replaced with JsonElement from serialization
  • All other behavior is preserved. Unlike klaxon, kotlinx-serialization works better with primitives. For example, klaxon does not work with float, only with double, so I had to specially handle the float case. Also, nulls are wrapped in serialization.

This PR should be accepted after #574, after which I will resolve the conflicts.

@Jolanrensen Jolanrensen added the enhancement New feature or request label Feb 26, 2024
@Jolanrensen Jolanrensen added this to the 0.14.0 milestone Feb 26, 2024
build.gradle.kts Outdated Show resolved Hide resolved
@Jolanrensen
Copy link
Collaborator

What do you mean with "specially handle the float case". Looking at your code I see you skip it and convert it to a Double sometimes. I don't really understand why; DataFrame understands Floats just fine, so you can keep them as Floats

@zaleslaw zaleslaw self-requested a review February 26, 2024 12:38
@zaleslaw
Copy link
Collaborator

@devcrocod you wrote that this is a first step, what could be the next step?

@devcrocod
Copy link
Contributor Author

@devcrocod you wrote that this is a first step, what could be the next step?

Check related issue

@devcrocod
Copy link
Contributor Author

What do you mean with "specially handle the float case". Looking at your code I see you skip it and convert it to a Double sometimes. I don't really understand why; DataFrame understands Floats just fine, so you can keep them as Floats

By default, klaxon translates all floating-point numbers into Double. kotlinx-serialization can work with float. So, the question here is not what DataFrame can do, but what klaxon could do. It was possible to support Float in JSON parsing, but then it would have been necessary to additionally change the logic in JSON parsing. I decided not to do this because these subsequent parsers should ultimately change, and because it maximally preserves the previous behavior of the library.

@Jolanrensen
Copy link
Collaborator

Jolanrensen commented Feb 27, 2024

What do you mean with "specially handle the float case". Looking at your code I see you skip it and convert it to a Double sometimes. I don't really understand why; DataFrame understands Floats just fine, so you can keep them as Floats

By default, klaxon translates all floating-point numbers into Double. kotlinx-serialization can work with float. So, the question here is not what DataFrame can do, but what klaxon could do. It was possible to support Float in JSON parsing, but then it would have been necessary to additionally change the logic in JSON parsing. I decided not to do this because these subsequent parsers should ultimately change, and because it maximally preserves the previous behavior of the library.

I think the previous behavior of the library was not intended, but rather a limit imposed on the library by Klaxon. It seems odd to keep this broken behavior if both Kotlinx-serialization ánd DataFrame support Floats to convert them to Doubles. So yes, this might require updating some tests and logic, but it's ultimately for the best I believe.

@devcrocod
Copy link
Contributor Author

I agree that such behavior is not entirely correct, as it also carries computational errors when converting float to double. But I will repeat that this behavior will be changed in the future when serializers for dataframe objects are written.

Moreover, in the current version, there is a question:
During serialization, it's obvious how it should look:

1.1f -> 1.1 (float)
1.1 -> 1.100000 (double)

But during deserialization, we look at each element, and the logic is as follows: if the number fits within 32, then it's a float; if it's more than 32, then it's a double. And it turns out that our list can easily contain both Float and Double, and instead of the output type being Double, we get Number.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants