Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Urgent: Data Corruption Issue Identified During DTO Deserialization (Instance Variable Renamed/Type Changed) #1068

Open
geekthread opened this issue Apr 18, 2024 · 7 comments

Comments

@geekthread
Copy link

Describe the bug
We've identified data corruption during DTO deserialization when there are changes in existing fields/instance variables in a dto

To Reproduce
Have a simple dto, lets say employee with following details

 public class Employee{
   private String name;
    private long age;
    private String company;
}

Serialize this dto using below code

Output outK3 = new Output(new FileOutputStream("rename.bin"))
kryo.writeObject(out, employee);
 outK3.close();

Change the dto

public class Employee{
  private String name;
   private short xge;
   private String company;
}

De-Serialize now using below code

Input input = new Input(new FileInputStream("rename.bin"));
       try {
           Employee k = kryo.readObject(input, s.getClass());
           System.out.println("Kryo 5 Test" + k);
           input.close();

Expected Behaviour
Since the dto has changes , kryo should either throw an exception or should give the correct data but we are seeing data corruption , example below

Kryo 5 TestEmployee{firstName='Abcde', age=36, company='Kronos'}  --- Original 
Kryo 5 TestEmployee{firstName='Abcde', age=19272, company='ronos'}  -- -- After De-Serializing

Environment:

  • OS: Windows
  • JDK Version: 8
  • Kryo Version: 5.6.0

Additional context
Add any other context about the problem here.

@theigl
Copy link
Collaborator

theigl commented Apr 18, 2024

What's your Kryo configuration? The default FieldSerializer does not support changes to the class structure. It doesn't write a schema for the class so it has no possibility to detect changes.

If you want to be able to evolve your serialized objects you have to use CompatibleFieldSerializer or one of the other serializers.

@NathanSweet
Copy link
Member

TaggedFieldSerializer is the right choice in most situations, IMO. Maybe the readme could make this more clear.

@geekthread
Copy link
Author

Thanks @NathanSweet @theigl , we are using default serializer(Field Serializer). Will check your recommendations and share updates.

@geekthread
Copy link
Author

We checked the serializers, TaggedFieldSerializer seems to be compatible but has performance overhead as compared to FieldSerializer. Is it possible for Kryo team to take this behaviour as a defect so that exception is thrown in all cases if kryo finds differences in schema with field serializer .

@theigl
Copy link
Collaborator

theigl commented May 6, 2024

FieldSerializer is the fastest generic serializer because it does not write a schema. That means there is nothing we can compare or validate.

What FieldSerializer writes looks something like this:

Employee.class36KronosAbcde

When reading this back, FieldSerializer first looks up the current class, iterates over all current fields of the class and calls the fields read method:

  1. ShortField.read()
  2. StringField.read()
  3. StringField.read()

In your case, all of these read calls succeed, so no exception is thrown. There is really nothing the FieldSerializer can do without adding some kind of schema or checksum.

@geekthread
Copy link
Author

Thank you for the prompt reply, @theigl. Could you please share or direct me to the performance metrics for the TaggedFieldSerializer? I'm interested in understanding the additional memory it might consume as compared to FieldSerializer

@theigl
Copy link
Collaborator

theigl commented May 7, 2024

Currently there is only this comparison in the readme, but it focuses on throughput and not memory.

For more data, you can always run FieldSerializerBenchmark via KryoBenchmarks yourself and enable CPU/Memory profiling in JMH.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants