PersistenceLifecycleEvent not sent to typed actor #141

Open
Yakimych opened this issue Jan 15, 2021 · 8 comments

Comments

@Yakimych

Hi! Is there a way to hook into PersistentLifecycleEvent in typed actors? I've looked at #72 and the test, but this seems to work only for untyped actors.
Here is a fork with the same test for a typed actor (Eventsourced<string>), and it currently fails: Yakimych@c9dc995

My guess is that typed actors don't receive messages of anything but the specified type, so I am wondering if this is possible at all.

@Horusiath
Owner

In your test, your actor uses messages of type string. In the actor body, you take that string value, cast it to obj and then try to match it against PersistentLifecycleEvent - I don't see how this could ever work.

Unfortunately, F# doesn't offer (untagged) type unions, so the only way to do that is to define the actor body via Eventsourced<obj> and then, after creating the actor, use the retype function to retype the actor reference from IActorRef<obj> to IActorRef<string>, narrowing the scope of allowed messages to the ones you care about.

@Yakimych
Author

Yakimych commented Jan 15, 2021

retype does the trick, thanks! I've noticed, however, that the actor behaves differently now that it's untyped internally.
I am trying to debug a scenario where the actor crashes during Restore (most likely due to a deserialization issue with a Discriminated Union). While it was typed, I could see in the logs that the actor got stopped (crashed) during restore, and that's why I tried to hook into the PersistentLifecycleEvent and find more information in ReplayFailed.
Now that I have done this via your suggestion, however, the actor no longer crashes during restore. I can only see ReplaySucceed and RecoveryCompleted, but the messages don't get replayed (so the actor ends up in the initial state).

While the deserialization problem can be solved by e.g. an EventAdapter, I am wondering if the behaviour in Akkling is correct as it stands right now. That is, shouldn't the untyped actor crash in the same way as a typed actor when deserialization of an event fails during replay? Currently it seems that it "pretends" that everything is fine, but fails silently and the events never get replayed. I am still trying to get to the bottom of why it crashes with the default serializer, and was hoping to get some information about the cause/exception inside ReplayFailed.

@Yakimych
Author

Yakimych commented Jan 17, 2021

@Horusiath Here is a (somewhat verbose) test to demonstrate the discrepancy: Yakimych@ad12da9

When I run the test for the untyped actor in Debug mode, I can see the following events logged:

  • PreStart
  • PostStop
  • PreRestart
  • PreStart
  • PostRestart
  • Unhandled message from [akka://test-system/deadLetters]: {
  "Case": "ActorEvent",
  "Fields": [
    {
      "Case": "StateUpdated",
      "Fields": [
        {
          "$": "I99"
        }
      ]
    }
  ]
}

The test fails on the last assert, since the state is 0 and 99 has not been restored, but there is no crash - the actor is running with incorrect state.

In the test for the typed actor, however, the actor crashes and the test hangs when running Ask with GetState:
<Received dead letter from [akka://test-system/temp/c]: ActorCommand GetState>

Is there any reason the typed and untyped actors behave differently?

NOTE: The Sqlite persistence plugin is used for those tests.

@Horusiath
Owner

Horusiath commented Jan 17, 2021

From what I see, you're using Json.NET for persistence - this may be the reason. When trying to deserialize a JSON payload into obj, it doesn't know what specific type of object you want, so in this case it will default to JDocument or JObject.

This is a recurring problem with JSON.NET, and one of the reasons why we started using Hyperion - but the problem with Hyperion is that its binary format is not stabilized and it doesn't provide any guarantee that future versions will keep it backwards compatible.

@Yakimych
Author

You are right! After debugging a bit deeper, I can see that an exception is thrown in the __.Next method after matching on JObject, at jobj.ToObject<'Message>:

Newtonsoft.Json.JsonReaderException: Error reading integer. Unexpected token: StartObject. Path '[0]'.
   at Newtonsoft.Json.JsonReader.ReadAsInt32()
   at Newtonsoft.Json.JsonReader.ReadForType(JsonContract contract, Boolean hasConverter)
   ...

I can also see that OnRecoveryFailure is called with this exception, but I don't see anything in the logs - which made my initial debugging attempts quite a challenge. Is this exception swallowed somehow? Is there a way to make sure that it shows up in the logs even for typed actors?

I was also confused as to why the untyped actor doesn't end up along the same path, but I am guessing that JObject gets matched with | :? 'Message as msg -> first, since for untyped actors 'Message is obj here?
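To illustrate that guess, here is a minimal, dependency-free sketch. It assumes the generic type test comes before the JObject branch, as supposed above; `FakeJObject` and `classify` are hypothetical names made up for this example and are not part of Akkling or JSON.NET.

```fsharp
// Hypothetical stand-in for JSON.NET's JObject, used only to keep the sketch dependency-free.
type FakeJObject(json: string) =
    member _.Json = json

// Assumed match shape: the generic type test is tried before the JObject branch.
let classify<'Message> (payload: obj) : string =
    match payload with
    | :? 'Message -> "matched as 'Message"
    | :? FakeJObject -> "matched as JObject, conversion attempted"
    | _ -> "unhandled"

// With 'Message = obj every non-null payload passes the first test,
// so the JObject branch (and any conversion error in it) is never reached.
printfn "%s" (classify<obj> (box (FakeJObject "{}")))     // matched as 'Message

// With 'Message = string the payload falls through to the JObject branch.
printfn "%s" (classify<string> (box (FakeJObject "{}")))  // matched as JObject, conversion attempted
```

If the real match has this shape, it would explain why the untyped actor receives the raw JSON object as an ordinary message instead of failing during conversion.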

As to the serialization problem, I've seen those issues from 2015, but it seemed as if they were solved with this PR? Have the problems reappeared later on?

@Horusiath
Owner

Horusiath commented Jan 18, 2021

I think the reason here is that your persistence provider uses a JSON.NET configuration that doesn't support type name handling. There are two problems:

  1. Without type name handling, the JSON.NET serializer won't know which type of object to produce when serializer.DeserializeObject<obj> is called.
  2. With type name handling turned on, a fully qualified type name is used. So if you ever change the type name or namespace in the future, it won't be able to deserialize it back.
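Both points can be seen in a small F# script. This is only a sketch: it calls JSON.NET directly rather than going through the persistence plugin's actual configuration, and the `StateUpdated` record is a hypothetical stand-in for a persisted event.

```fsharp
#r "nuget: Newtonsoft.Json"
open Newtonsoft.Json

// Hypothetical event type, for illustration only.
type StateUpdated = { Value: int }

let evt = { Value = 99 }

// 1. No type name handling: the payload carries no type information,
//    so asking for obj yields a generic JSON.NET container.
let plain = JsonConvert.SerializeObject(box evt)
let untyped = JsonConvert.DeserializeObject<obj>(plain)
printfn "%s" (untyped.GetType().Name)  // JObject

// 2. Type name handling on: a "$type" property with the fully qualified
//    type name is embedded, which ties stored data to that exact name.
let settings = JsonSerializerSettings(TypeNameHandling = TypeNameHandling.All)
let tagged = JsonConvert.SerializeObject(box evt, settings)
printfn "%s" tagged
```

The second output includes the namespace and assembly of the type, which is why renaming or moving the type later makes previously stored events unreadable.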

Usually, when it comes to persistent serialization, I suggest that people write their own custom serializers (using e.g. protobuf, FsPickler or whatever you want), to have full control of the binary format and how it changes over time.

@Yakimych
Author

Ok, that makes sense. I guess the question for Akkling specifically is whether we can do something to improve "debuggability" and help the next person who runs into this issue, since currently it pretty much fails silently with default settings when using Persistence.Sqlite. Is there any way to write something to the logs in OnRecoveryFailure, even for a typed actor? When I debug the Akkling source code I can see that it is called, but nothing is logged.

@Horusiath
Owner

Yes, I think we could do something about it.
