BinaryFormatter removal #2324

Open
adamsitnik opened this issue Apr 3, 2024 · 1 comment

adamsitnik (Contributor) commented Apr 3, 2024

Hello!

I am trying to wrap my head around the dependency on BinaryFormatter and I need some help.

  1. The only direct dependency on BinaryFormatter in non-test source code is the usage in BinaryObjectSerializer.
  2. BinaryObjectSerializer implements IObjectSerializer, which is referenced in multiple places in the source code, but it seems that StdAdoDelegate is the only type that actually uses it (the other places just store it to pass it somewhere else):
    private IObjectSerializer objectSerializer = null!;

    retValue = objectSerializer.Serialize(obj);

    obj = objectSerializer.DeSerialize<T>(data);
  3. It seems that the serialization exposed by StdAdoDelegate is used to:
    a) Serialize all implementations of public interface ICalendar:
    byte[]? baos = SerializeObject(calendar);

    byte[]? baos = SerializeObject(calendar);

    b) Serialize NameValueCollection (more or less a Dictionary<string, string> provided by BCL):
    NameValueCollection properties = ConvertToProperty(data.WrappedMap);
    retValue = SerializeObject(properties);

    c) Serialize sealed JobDataMap (more or less a custom wrapper for Dictionary<string, object>)
    return SerializeObject(data);

    d) Serialize all implementations of public interface IOperableTrigger:
    byte[]? buf = SerializeObject(trigger);

    byte[]? os = SerializeObject(trigger);

    e) Try to find what is not serializable in JobDataMap by trying to serialize all values and stopping at the first one that fails to serialize:
    try
    {
        return SerializeObject(data);
    }
    catch (SerializationException e)
    {
        ThrowHelper.ThrowSerializationException($"Unable to serialize JobDataMap for insertion into database because the value of property '{GetKeyOfNonSerializableValue(data)}' is not serializable: {e.Message}");

    protected object? GetKeyOfNonSerializableValue(IDictionary data)
    {
        foreach (var o in data)
        {
            var entry = (DictionaryEntry) o!;
            try
            {
                SerializeObject(entry.Value);

My questions:

  1. Are the observations listed above correct? Have I missed any serialization scenario?
  2. How common is it for users to provide custom ICalendar implementations and persist them? Do all of the users have to implement ICalendarSerializer?
  3. What kind of values can be used by the end users in JobDataMap? Just primitive types like int and string or perhaps this can be literally anything that is [Serializable], including user-defined types?
  4. The description of IOperableTrigger says that it's internal despite being public. Does it mean that users should not derive from this interface and we can focus only on the IOperableTrigger implementations provided by the library?
    /// Internal interface for managing triggers. This interface should not be used by the Quartz client.
  5. How common is it for users to derive a custom type from abstract class AbstractTrigger (that implements IOperableTrigger) and persist it?
    [Serializable]
    public abstract class AbstractTrigger : IOperableTrigger, IEquatable<AbstractTrigger>

Thanks!

lahma (Member) commented Apr 3, 2024

Thank you for taking an interest in Quartz and its binary formatter issues! I'll give some background first which might help with other answers...

From version 1.0 onwards the binary formatter has been the solution to store state in the database and also to communicate with a remote Quartz scheduler. Users have been able to export IScheduler as a remoting service and then call its methods (Start, Stop, Pause etc). Remoting support has (naturally) never been ported to .NET Core/netstandard2.0.

While serialization of custom calendars and triggers is possible (when database schema cannot handle their data), I'd think it's quite rare. I can't remember issues about such serialization - I might well be wrong too as billion dollar companies rarely come back telling about their successes.

Binary serialization for the database was the natural choice when porting from the Java version, where it was also the way to go (as bad as it is in hindsight). Users can store anything that serializes, but there's a recommended configuration switch, useProperties = true, which changes the mode to allow only string values. All of this follows the Java logic. It also uses a byte[] kind of column data type in the database, which of course is not an ideal storage format for JSON in modern databases.

From version 3.0 (released in 2017) onwards Quartz has required you to make an informed decision about the serializer being used; the scheduler throws an error when persistence is configured without a serialization implementation:

else if (js.GetType() != typeof(RAMJobStore))
{
    // when we know for sure that job store does not need serialization we can be a bit more relaxed
    // otherwise it's an error to not define the serialization strategy
    initException = new SchedulerException($"You must define object serializer using configuration key '{serializerTypeKey}' when using other than RAMJobStore. " +
        "Out of the box supported values are 'json' and 'binary'. JSON doesn't suffer from versioning as much as binary serialization but you cannot use it if you already have binary serialized data.");
    throw initException;
}

The json value maps to Newtonsoft.Json, but ideally we would differentiate between systemtextjson and newtonsoftjson instead of using a generic name (v4 could break this).
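As a rough illustration of the two switches discussed above, the properties could be assembled like this. This is only a sketch: the key names `quartz.serializer.type` and `quartz.jobStore.useProperties` are my assumption of what `serializerTypeKey` and the `useProperties` switch resolve to, `QuartzConfigSketch` is a hypothetical helper, and the job store and data source settings a real persistent setup needs are left out.

```csharp
using System.Collections.Specialized;

public static class QuartzConfigSketch
{
    // Builds only the two serialization-related settings mentioned in this thread.
    // Key names are assumptions; a real configuration needs job store settings too.
    public static NameValueCollection Build()
    {
        return new NameValueCollection
        {
            // Explicit serializer choice ("json" or "binary" per the error message above).
            ["quartz.serializer.type"] = "json",
            // Restrict job data map values to strings.
            ["quartz.jobStore.useProperties"] = "true",
        };
    }
}
```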

3.x is somewhat in maintenance mode and v4 development (main) should/could drop support for binary serialization entirely. There's an HTTP API replacing remoting in v4 for .NET 6.0+, and it might also be an option that Quartz will only support .NET 6.0+ from v4 onwards. v3 could be considered the battle-tested solution for full framework usage, still allowing both remoting and binary serialization on that platform.

Are the observations listed above correct? Have I missed any serialization scenario?

I believe you have found most of the use cases. Mainly it's the job data map that allows persisting state for a job/trigger between invocations, and this state resides in the database for cluster workers/failure resistance (RAMJobStore will forget everything on process restart).

How common is it for users to provide custom ICalendar implementations and persist them? Do all of the users have to implement ICalendarSerializer?

I'm not aware of custom implementations (or NuGet packages offering those). Calendar serializer and others will probably be useful when Newtonsoft.Json goes into (or at least we try to put it into) a sunsetting phase. A System.Text.Json implementation is still missing, but I think v8 of STJ is mature enough to work here. I just haven't had the time.

What kind of values can be used by the end users in JobDataMap? Just primitive types like int and string or perhaps this can be literally anything that is [Serializable], including user-defined types?

It could be anything. Quartz documentation suggests that it's better to store only primitive types, but it's not enforced by default.
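A minimal sketch of how the "only primitive types" recommendation could be checked by hand, assuming the job data is available as a plain string-keyed dictionary. `JobDataChecks` and `FindFirstNonSimpleKey` are hypothetical names for illustration, not part of Quartz, and the definition of "simple" used here (null, string, decimal, or a CLR primitive) is my own choice.

```csharp
using System;
using System.Collections.Generic;

public static class JobDataChecks
{
    // Returns the first key whose value is not "simple", or null when every value is.
    public static string? FindFirstNonSimpleKey(IDictionary<string, object?> data)
    {
        foreach (var pair in data)
        {
            if (!IsSimple(pair.Value))
            {
                return pair.Key;
            }
        }
        return null;
    }

    // "Simple" here means null, string, decimal, or a CLR primitive (int, bool, double, ...).
    private static bool IsSimple(object? value)
        => value is null || value is string || value is decimal || value.GetType().IsPrimitive;
}
```

This mirrors the idea behind GetKeyOfNonSerializableValue quoted earlier, but inspects types instead of attempting a serialization.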

For migration scenarios I would probably create a custom IObjectSerializer which would be able to read binary data and always write JSON, so a composite serializer that accepts the old format when reading and always writes the new format. Could be painful for large installations.
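The composite idea could look roughly like the sketch below. Several things here are assumptions rather than Quartz code: `IObjectSerializerSketch` is a local stand-in shaped after the Serialize/DeSerialize<T> calls quoted earlier in this thread, the legacy reader is injected as a delegate so the sketch does not reference BinaryFormatter itself, System.Text.Json is used only to keep the example self-contained, and the format detection is a heuristic (BinaryFormatter streams begin with a record-type byte of 0, which valid JSON text never starts with).

```csharp
using System;
using System.Text.Json;

// Hypothetical stand-in for the serializer abstraction discussed in this thread.
public interface IObjectSerializerSketch
{
    byte[] Serialize<T>(T obj) where T : class;
    T? DeSerialize<T>(byte[] data) where T : class;
}

public sealed class MigratingSerializer : IObjectSerializerSketch
{
    // Injected reader for old binary payloads (e.g. a wrapper around the legacy serializer).
    private readonly Func<byte[], Type, object?> legacyReader;

    public MigratingSerializer(Func<byte[], Type, object?> legacyReader)
    {
        this.legacyReader = legacyReader ?? throw new ArgumentNullException(nameof(legacyReader));
    }

    // New data is always written as UTF-8 JSON.
    public byte[] Serialize<T>(T obj) where T : class
        => JsonSerializer.SerializeToUtf8Bytes(obj);

    // Old binary payloads go to the legacy reader; everything else is treated as JSON.
    public T? DeSerialize<T>(byte[] data) where T : class
    {
        if (data is null || data.Length == 0)
        {
            return null;
        }
        if (LooksLikeLegacyBinary(data))
        {
            return (T?) legacyReader(data, typeof(T));
        }
        return JsonSerializer.Deserialize<T>(new ReadOnlySpan<byte>(data));
    }

    // Heuristic only: a leading 0x00 byte is taken to mean a BinaryFormatter stream.
    public static bool LooksLikeLegacyBinary(byte[] data) => data[0] == 0x00;
}
```

With this shape, rows are converted lazily: a value is read through whichever path matches its stored bytes and ends up as JSON the next time it is written.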

The description of IOperableTrigger says that it's internal despite being public. Does it mean that users should not derive from this interface and we can focus only on the IOperableTrigger implementations provided by the library?

IOperableTrigger offers services that Quartz needs to do its job, services that regular user code should not call or rely upon being there. Custom triggers should be quite a rare case; maybe people would be overriding some existing ones. I don't have any data on this.

How common is it for users to derive a custom type from abstract class AbstractTrigger (that implements IOperableTrigger) and persist it?

No data for this either, but I'd say a rare case and could be considered as breakable for v4 and then later think how to support if needed by the billion dollar companies behind the curtains 😉

So as a short summary, I think the in-progress v4 should probably not support binary serialization at all, but should allow people to write a migrating serializer that reads binary data and writes JSON. It's all open for discussion though.
