Serialization Exception #134

Open
artemiusgreat opened this issue Jun 26, 2020 · 2 comments

artemiusgreat commented Jun 26, 2020

Hi.

Not sure what happened, but something happened :)

  1. I updated all projects from .NET Core 3.0 to 3.1
  2. Made some changes in MySQL DB

Now I get various errors from GenericXmlDataContractSerializer. Surprisingly, the exception only occurs the second time I build and run the project; after the first build everything works fine. I mention the DB because I serialize the trained model with a MemoryStream and save it as a byte[] in a MySQL column of type LongBlob. I also use models from ML.NET and export / import them to and from the DB the same way, and they work fine, so the DB is probably not the issue. All projects in the solution are built as x64. The serializer fails on any model, either RandomForest or AdaBoost, with the same exception.

The issue

  1. build the project and start debugging
  2. create and train a model, then save it to the DB as a byte array using the GetPredictor method below
  3. select the byte array from the DB, deserialize it to a model, provide test data and get an estimate - OK
  4. stop debugging and repeat steps 1-3; now the prediction method fails with the exception below - NOT OK

The question

Does anybody know why the serializer might fail with this exception? Also, can I serialize a trained model to a MemoryStream using a different serializer, without GenericXmlDataContractSerializer?

Most common exception

System.Runtime.Serialization.SerializationException: Element 'http://schemas.datacontract.org/2004/07/Core.Learners.SharpLearning.EngineSpace:Model' contains data from a type that maps to the name 'SharpLearning.RandomForest.Models:ClassificationForestModel'. The deserializer has no knowledge of any type that maps to this name. Consider changing the implementation of the ResolveName method on your DataContractResolver to return a non-null value for name 'ClassificationForestModel' and namespace 'SharpLearning.RandomForest.Models' 

After updating all NuGet packages, I got a different exception, but only once

Invalid XML at line 1 or something like that

Serializing the trained model to a byte array and saving it to the DB

public virtual ResponseModel<byte> GetPredictor(IDictionary<int, string> columns, IDataView inputs)
{
  var responseModel = new ResponseModel<byte>();

  using (var memoryStream = new MemoryStream())
  {
    // Convert the ML.NET data view to SharpLearning observations and targets
    var processor = GetInput(columns, inputs, nameof(PredictorLabelsEnum.Emotion));
    var learner = new ClassificationRandomForestLearner();
    var serializer = new GenericXmlDataContractSerializer();

    // Train the model and bundle it with the label map
    var container = new MapModel<int, string>
    {
      Map = processor.Map,
      Model = learner.Learn(processor.Input.Observations, processor.Input.Targets)
    };

    // Serialize the container as XML into the memory stream, then copy out the bytes
    serializer.Serialize(container, () => new StreamWriter(memoryStream));
    responseModel.Items = memoryStream.ToArray().ToList();
  }

  return responseModel;
}

Deserializing the model from the DB stream and getting a prediction

public virtual ResponseModel<string> GetEstimate(IEnumerable<byte> predictor, IDictionary<int, string> columns, IDataView inputs)
{
  var responseModel = new ResponseModel<string>();

  // Wrap the bytes loaded from the DB in a stream for the serializer
  using (var memoryStream = new MemoryStream(predictor.ToArray()))
  {
    var processor = GetInput(columns, inputs);
    var serializer = new GenericXmlDataContractSerializer();

    // This is the call that throws the SerializationException on the second run
    var model = serializer.Deserialize<MapModel<int, string>>(() => new StreamReader(memoryStream));
    var predictions = model.Predict(processor.Input.Observations);

    // Return the value of the entry with the highest key
    responseModel.Items.Add(predictions.OrderByDescending(o => o.Key).First().Value);
  }

  return responseModel;
}

The GetInput method in the code above is just a conversion from the IDataView format in ML.NET to the ObservationSet format in SharpLearning. MapModel is a wrapper that makes it possible to save text labels along with the numeric ones.

artemiusgreat (Author) commented

Actually, I just tried using a custom serializer and it seems to work.
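
For readers wondering what such a custom serializer could look like, here is a minimal sketch that bypasses GenericXmlDataContractSerializer and uses the standard .NET DataContractSerializer directly. It is not the serializer actually used in this thread. LabelledModel and ByteSerializer are hypothetical names, with LabelledModel standing in for MapModel<int, string> and the real SharpLearning model types.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical stand-in for the MapModel<int, string> wrapper from the issue.
[DataContract]
public class LabelledModel
{
    [DataMember] public Dictionary<int, string> Map { get; set; }
    [DataMember] public double[] Weights { get; set; }
}

public static class ByteSerializer
{
    // Writes the object as XML straight into a MemoryStream and returns the bytes.
    public static byte[] ToBytes<T>(T value, params Type[] knownTypes)
    {
        var serializer = new DataContractSerializer(typeof(T), knownTypes);
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            return stream.ToArray();
        }
    }

    // Rebuilds the object from bytes previously produced by ToBytes.
    public static T FromBytes<T>(byte[] bytes, params Type[] knownTypes)
    {
        var serializer = new DataContractSerializer(typeof(T), knownTypes);
        using (var stream = new MemoryStream(bytes))
        {
            return (T)serializer.ReadObject(stream);
        }
    }
}
```

The byte[] returned by ToBytes could be stored in the LongBlob column and passed back to FromBytes after loading. Any concrete model types hidden behind an interface or object member would have to be passed as knownTypes on both sides.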

mdabros (Owner) commented Jun 27, 2020

Hi @artemiusgreat,

Yes, it should be possible to use another serializer to serialize/deserialize the models.

From the exception, my best guess is that the project deserializing the model is missing a reference to the assemblies containing ClassificationForestModel and ClassificationAdaBoostModel. Note that the project might have the correct references, but if the types are not used anywhere in the code, the references might be optimized away.
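
One way to test this guess is to check at runtime, just before deserializing, whether the assembly is actually loaded. The sketch below uses only standard .NET APIs. AssemblyCheck is a hypothetical helper, and the assembly name "SharpLearning.RandomForest" in the usage note is an assumption derived from the namespace in the exception message.

```csharp
using System;
using System.Linq;

public static class AssemblyCheck
{
    // True if an assembly with the given simple name is loaded in the current AppDomain.
    public static bool IsLoaded(string simpleName) =>
        AppDomain.CurrentDomain.GetAssemblies()
            .Any(a => string.Equals(a.GetName().Name, simpleName, StringComparison.OrdinalIgnoreCase));
}
```

Calling AssemblyCheck.IsLoaded("SharpLearning.RandomForest") in the prediction method, on both a working and a failing run, would show whether the failure correlates with the assembly not being loaded.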

To ensure that the references are kept you can try adding them as known types to the GenericXmlDataContractSerializer like this:

var knownTypes = new[]
{
    typeof(ClassificationForestModel),
    typeof(ClassificationAdaBoostModel),
};

var serializer = new GenericXmlDataContractSerializer(knownTypes);
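
To illustrate why known types matter, here is a small self-contained sketch using the plain .NET DataContractSerializer rather than SharpLearning's wrapper. Container, ForestModel and KnownTypesDemo are hypothetical names. Note that this sketch already fails when writing, whereas the issue above fails when reading; the underlying problem is similar in both cases, since the serializer cannot map a runtime type to a contract name or back.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical stand-in for a concrete model type.
[DataContract]
public class ForestModel
{
    [DataMember] public int Trees { get; set; }
}

[DataContract]
public class Container
{
    // Declared as object, so the concrete type is not part of the static contract.
    [DataMember] public object Model { get; set; }
}

public static class KnownTypesDemo
{
    // Returns true if a Container holding a ForestModel survives a round trip.
    public static bool RoundTrips(params Type[] knownTypes)
    {
        var serializer = new DataContractSerializer(typeof(Container), knownTypes);
        try
        {
            using (var stream = new MemoryStream())
            {
                serializer.WriteObject(stream, new Container { Model = new ForestModel { Trees = 100 } });
                stream.Position = 0;
                var copy = (Container)serializer.ReadObject(stream);
                return copy.Model is ForestModel forest && forest.Trees == 100;
            }
        }
        catch (SerializationException)
        {
            // Thrown when the serializer meets a type it was not told about.
            return false;
        }
    }
}
```

With these stand-in types, KnownTypesDemo.RoundTrips() is expected to return false and KnownTypesDemo.RoundTrips(typeof(ForestModel)) is expected to return true.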

Best regards
Mads
