Event Distribution for cache store event channel does not fail over to another cachestore when a JVM is stopped #106

Open
brianoliver opened this issue Apr 17, 2014 · 13 comments

Comments

@brianoliver
Contributor

I see an issue in the event processor framework when testing failover cases; I am not sure whether it is due to an incorrect configuration. I ran two separate cache server JVMs and performed an update, which worked correctly. I then killed one of the cache server JVMs and repeated the update use case, and it failed with the following exception. I expected the other JVM to process the event, but it throws the error below.

re Cache}, controllerDependencies=AbstractEventChannelController.Dependencies{channelName=cachestore Cache, externalName=Site1:cluster:0xA6DB:UpdateBillingProfileEntity:cachestore Cache, eventChannelBuilder=com.oracle.coherence.patterns.eventdistribution.channels.CacheStoreEventChannelBuilder@65582116, transformerBuilder=null, startingMode=ENABLED, batchDistributionDelayMS=1000, batchSize=100, restartDelay=10000, totalConsecutiveFailuresBeforeSuspended=-1, eventPollingDelay=1000}, cacheName=UpdateBillingProfileEntity, resolver=com.tangosol.config.expression.ScopedParameterResolver@7ce45f54}]
at com.oracle.coherence.common.liveobjects.LiveObjectEventInterceptor.onEvent(LiveObjectEventInterceptor.java:215)
at com.tangosol.net.events.internal.NamedEventInterceptor.onEvent(NamedEventInterceptor.java:258)
at com.tangosol.net.events.internal.AbstractEvent.nextInterceptor(AbstractEvent.java:116)
at com.tangosol.net.events.internal.AbstractEvent.dispatch(AbstractEvent.java:154)
at com.tangosol.net.events.internal.AbstractEventDispatcher$4.proceed(AbstractEventDispatcher.java:270)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService$Continuations$Task.run(PartitionedService.CDB:6)
at com.tangosol.coherence.component.util.daemon.queueProcessor.Service$EventDispatcher.onNotify(Service.CDB:26)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.oracle.coherence.common.liveobjects.LiveObjectEventInterceptor.onEvent(LiveObjectEventInterceptor.java:211)
... 8 more
Caused by: java.lang.NullPointerException
at com.oracle.coherence.patterns.eventdistribution.distributors.coherence.CoherenceEventChannelSubscription.onEntryUpdated(CoherenceEventChannelSubscription.java:271)
... 13 more

@brianoliver
Contributor Author

Reported by naheedmk

@brianoliver
Contributor Author

@brianoliver said:
A few things that may help resolve the issue:

  1. The stack trace seems to be incomplete, especially the top of it.

  2. Can you provide a clear step-by-step example of how to reproduce the issue and/or perhaps add a new functional test to the AbstractPushReplicationTest class?

For example:
i). Start two storage-enabled cluster members using a specified configuration.
ii). Insert/update/remove an entry.
iii). Destroy/kill one of the storage-enabled cluster members.
iv). Insert/update/remove an entry (the previous entry).

@brianoliver
Contributor Author

@brianoliver said:
It appears this issue is caused by missing @OnArrived and @OnDeparting annotations on the CoherenceEventChannelSubscription and JMSEventChannelControllerConfiguration LiveObjects.

Interestingly, I could only reproduce this reliably when running on a later (unreleased) version of Coherence; regardless, it is an issue.

This will be resolved in the next Coherence Incubator patch release, 12.2.1.
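
For readers unfamiliar with annotation-driven lifecycle callbacks, the following is a minimal, self-contained sketch of how a missing annotation might lead to a NullPointerException like the one in the trace above. It does not use the Incubator classes: `Arrived`, `Subscription`, `dispatch` and both subscription classes are hypothetical stand-ins, and the idea that the callback rebuilds state needed by `onEntryUpdated` is an assumption made for illustration, not a description of the actual fix.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

public class LifecycleDemo {

    /** Hypothetical stand-in for an "object arrived on this member" lifecycle annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Arrived { }

    /** Hypothetical minimal contract for an object that reacts to entry updates. */
    public interface Subscription {
        void onEntryUpdated(String key);
    }

    /** Invokes every no-arg method on target that carries the given annotation; returns how many ran. */
    public static int dispatch(Object target, Class<? extends Annotation> event)
            throws ReflectiveOperationException {
        int invoked = 0;
        for (Method method : target.getClass().getDeclaredMethods()) {
            if (method.isAnnotationPresent(event)) {
                method.setAccessible(true);
                method.invoke(target);
                invoked++;
            }
        }
        return invoked;
    }

    /** The lifecycle method is annotated, so its state is rebuilt on arrival. */
    public static class AnnotatedSubscription implements Subscription {
        private StringBuilder pending;                  // assumed to be rebuilt after failover

        @Arrived
        public void onArrived() { pending = new StringBuilder(); }

        @Override
        public void onEntryUpdated(String key) { pending.append(key); }
    }

    /** Same logic, but the annotation is missing, so the dispatcher never calls onArrived. */
    public static class UnannotatedSubscription implements Subscription {
        private StringBuilder pending;

        public void onArrived() { pending = new StringBuilder(); }

        @Override
        public void onEntryUpdated(String key) { pending.append(key); }   // pending is still null here
    }

    /** Simulates arrival on a surviving member followed by an update; false means the update hit an NPE. */
    public static boolean survivesFailover(Subscription subscription) throws ReflectiveOperationException {
        dispatch(subscription, Arrived.class);
        try {
            subscription.onEntryUpdated("key-1");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        System.out.println("annotated survives:   " + survivesFailover(new AnnotatedSubscription()));
        System.out.println("unannotated survives: " + survivesFailover(new UnannotatedSubscription()));
    }
}
```

In this sketch the only difference between the two classes is the annotation, which is enough to decide whether the update after the simulated failover succeeds.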

@brianoliver
Contributor Author

@naheedmk said:
Brian,
I tested with the 12.2.1 SNAPSHOT, and my application now fails with the following exception. It fails even for use cases that previously succeeded.

The exception is Caused by: (Wrapped) java.io.NotSerializableException: com.tangosol.coherence.config.scheme.ClassScheme

Use case details:

Put a POJO into a cache whose cache scheme is configured with the push replication publishing cache store. Push replication has two channels: one is the cachestore channel and the other is the remote cluster channel.

Inside Update billing profile is com.intuit.schema.enterprisecommerce.billing.profile.v1.BillingProfile@1299da96[billingProfileId=TRN-58XV6BR, preferred=false, name=TRN-58XV6BR, customerAccountNumber=845093002, accountId=, status=Active, dayOfMonth=14, description=, contactNumber=2422233, contactId=, paymentMethod=com.intuit.schema.enterprisecommerce.billing.profile.v1.BillingProfile$PaymentMethod@17ed2802[creditCard=com.intuit.schema.enterprisecommerce.billing.profile.v1.BillingProfile$PaymentMethod$CreditCard@52dda793[cardHolderName=Discover card user, accountType=DISCOVER, accountNumber=XXXXXXXXXXXX0000, expirationMonth=12, expirationYear=2014], eft=], addressNumber=5714955, addressId=]
[25/04/14 10:42:06:006 CDT] INFO rest.BillingProfileResource: Billing Profile is not null:TRN-58XV6BR
[25/04/14 10:42:06:006 CDT] INFO controller.BillingProfileController: Acct Ref Id=845093002
(Wrapped: Failed request execution for DistributedCache service on Member(Id=1, Timestamp=2014-04-25 10:41:21.179, Address=192.168.1.7:8088, MachineId=4855, Location=site:Site1,machine:knaheed-PC2,process:57860, Role=IntuitBillingProfileCacheDriver) (Wrapped: Failed to store key="0715bd97-c868-430f-96aa-a7818f8a04c0") (Wrapped) com.tangosol.coherence.config.scheme.ClassScheme) (Wrapped) java.io.NotSerializableException: com.tangosol.coherence.config.scheme.ClassScheme
at com.tangosol.util.Base.ensureRuntimeException(Base.java:286)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.tagException(Grid.CDB:50)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onPartialCommit(PartitionedCache.CDB:5)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onPutRequest(PartitionedCache.CDB:50)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$PutRequest.run(PartitionedCache.CDB:1)
at com.tangosol.coherence.component.util.DaemonPool.add(DaemonPool.CDB:20)
at com.tangosol.coherence.component.util.DaemonPool.add(DaemonPool.CDB:1)
at com.tangosol.coherence.component.net.message.requestMessage.DistributedCacheKeyRequest.onReceived(DistributedCacheKeyRequest.CDB:2)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:38)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:23)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.PartitionedService.onNotify(PartitionedService.CDB:3)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onNotify(PartitionedCache.CDB:3)
at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:51)
at java.lang.Thread.run(Thread.java:744)
Caused by: (Wrapped) java.io.NotSerializableException: com.tangosol.coherence.config.scheme.ClassScheme
at com.tangosol.util.ExternalizableHelper.toBinary(ExternalizableHelper.java:219)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$ConverterValueToBinary.convert(PartitionedCache.CDB:3)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$ViewMap.invoke(PartitionedCache.CDB:21)
at com.tangosol.coherence.component.util.SafeNamedCache.invoke(SafeNamedCache.CDB:1)
at com.oracle.coherence.patterns.eventdistribution.distributors.coherence.CoherenceEventDistributor.establishEventChannelController(CoherenceEventDistributor.java:173)
at com.oracle.coherence.patterns.eventdistribution.configuration.EventDistributorTemplate.realize(EventDistributorTemplate.java:263)
at com.oracle.coherence.patterns.pushreplication.PublishingCacheStore$1.ensureResource(PublishingCacheStore.java:198)
at com.oracle.coherence.patterns.pushreplication.PublishingCacheStore$1.ensureResource(PublishingCacheStore.java:140)
at com.oracle.coherence.common.resourcing.AbstractDeferredSingletonResourceProvider.getResource(AbstractDeferredSingletonResourceProvider.java:85)
at com.oracle.coherence.patterns.pushreplication.PublishingCacheStore.distribute(PublishingCacheStore.java:304)
at com.oracle.coherence.patterns.pushreplication.PublishingCacheStore.store(PublishingCacheStore.java:495)
at com.tangosol.net.cache.ReadWriteBackingMap$BinaryEntryStoreWrapper.storeInternal(ReadWriteBackingMap.java:6112)
at com.tangosol.net.cache.ReadWriteBackingMap$StoreWrapper.store(ReadWriteBackingMap.java:4890)
at com.tangosol.net.cache.ReadWriteBackingMap.putInternal(ReadWriteBackingMap.java:1283)
at com.tangosol.net.cache.ReadWriteBackingMap.put(ReadWriteBackingMap.java:745)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.putPrimaryResource(PartitionedCache.CDB:47)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.postPut(PartitionedCache.CDB:32)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache$Storage.put(PartitionedCache.CDB:23)
at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.partitionedService.PartitionedCache.onPutRequest(PartitionedCache.CDB:37)
... 10 more
Caused by: java.io.NotSerializableException: com.tangosol.coherence.config.scheme.ClassScheme
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
at com.tangosol.util.ExternalizableHelper.writeSerializable(ExternalizableHelper.java:2298)
at com.tangosol.util.ExternalizableHelper.writeObject(ExternalizableHelper.java:2437)
at com.oracle.coherence.patterns.eventdistribution.channels.CacheStoreEventChannelBuilder.writeExternal(CacheStoreEventChannelBuilder.java:135)
at com.tangosol.util.ExternalizableHelper.writeExternalizableLite(ExternalizableHelper.java:2111)
at com.tangosol.util.ExternalizableHelper.writeObject(ExternalizableHelper.java:2481)
at com.oracle.coherence.patterns.eventdistribution.distributors.AbstractEventChannelController$DefaultDependencies.writeExternal(AbstractEventChannelController.java:1274)
at com.tangosol.util.ExternalizableHelper.writeExternalizableLite(ExternalizableHelper.java:2111)
at com.tangosol.util.ExternalizableHelper.writeObject(ExternalizableHelper.java:2481)
at com.oracle.coherence.patterns.eventdistribution.distributors.coherence.CoherenceEventChannelSubscription.writeExternal(CoherenceEventChannelSubscription.java:341)
at com.tangosol.util.ExternalizableHelper.writeExternalizableLite(ExternalizableHelper.java:2111)
at com.tangosol.util.ExternalizableHelper.writeObject(ExternalizableHelper.java:2481)
at com.oracle.coherence.patterns.messaging.entryprocessors.TopicSubscribeProcessor.writeExternal(TopicSubscribeProcessor.java:172)
at com.tangosol.util.ExternalizableHelper.writeExternalizableLite(ExternalizableHelper.java:2111)
at com.tangosol.util.ExternalizableHelper.writeObjectInternal(ExternalizableHelper.java:2712)
at com.tangosol.util.ExternalizableHelper.serializeInternal(ExternalizableHelper.java:2646)
at com.tangosol.util.ExternalizableHelper.toBinary(ExternalizableHelper.java:215)
... 28 more

thanks
Naheed

@brianoliver
Contributor Author

@brianoliver said:
Hi Naheed,

Are you using Coherence 12.1.2.0.1+? In 12.1.2.0.0 the class wasn't serializable. In 12.1.2.0.1+ it is.
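
As a point of reference, the general failure mode in the trace above (an object's `writeExternal` handing a non-serializable field to Java serialization) can be reproduced with the JDK alone. The sketch below is illustrative only: `PlainScheme` and `ChannelBuilder` are hypothetical stand-ins, not the Coherence or Incubator classes, and it uses `java.io.Externalizable` rather than Coherence's own serialization helpers.

```java
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

public class NotSerializableDemo {

    /** Hypothetical stand-in for a configuration object that does NOT implement Serializable. */
    public static class PlainScheme {
        final String className;

        public PlainScheme(String className) { this.className = className; }
    }

    /** Hypothetical stand-in for a builder that writes whatever scheme it holds. */
    public static class ChannelBuilder implements Externalizable {
        private Object scheme;

        public ChannelBuilder() { }                            // no-arg constructor required for Externalizable

        public ChannelBuilder(Object scheme) { this.scheme = scheme; }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeObject(scheme);                           // throws if scheme is not Serializable
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            scheme = in.readObject();
        }
    }

    /** Returns null on success, or the exception message naming the class that could not be written. */
    public static String serialize(Object value) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        // A String payload is Serializable, so the builder can be written.
        System.out.println(serialize(new ChannelBuilder("serializable payload")));
        // A PlainScheme payload is not, so writing the builder fails part-way through.
        System.out.println(serialize(new ChannelBuilder(new PlainScheme("com.example.MyCacheStore"))));
    }
}
```

The outer object is perfectly serializable in this sketch; the failure comes from one nested field, which is why the exception names a class several levels below the object originally being written.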

– Brian

@brianoliver
Contributor Author

@naheedmk said:
Brian,
I am using
Bundle-Version: 12.1.2.1

Thanks
Naheed

@brianoliver
Contributor Author

@naheedmk said:
Brian,
I tried with version 12.1.2.0.2 and got the same exception.

Thanks
Naheed

@brianoliver
Contributor Author

@brianoliver said:
Hi Naheed,

Can you attach your Cache Configuration file and possibly your CacheStore?

– Brian

@brianoliver
Contributor Author

@naheedmk said:
I have created a test case that reproduces the issue and sent it to you by email.

Thanks
Naheed

@brianoliver
Contributor Author

@brianoliver said:
(from Kunnummal Naheed Madathummal naheedmk@yahoo.com)

@brianoliver
Contributor Author

@brianoliver said:
Hi Naheed,

If you change the in your cache store configuration to be , it should all work. The example you sent me works when I've done this (no longer fails).

I'll update the documentation for this issue and raise an issue against the Coherence product to make ClassScheme serializable.

Thanks for your help locating this issue.

– Brian

@brianoliver
Contributor Author

@brianoliver said:
After looking at making ClassScheme serializable within Coherence, we've decided this is the wrong path to take. It would force classes that are not currently serializable, and may never be, to become serializable.

Instead what we'll do is "re-write" / "transform" uses of with in the Event Distribution configuration as schemes. This ensure backwards compatibility.

@brianoliver
Contributor Author

This issue was imported from JIRA COHINC-106
