Ignite FileAlreadyExistsException during WAL Archival in GKE Kubernetes Cluster #2170

Open
sumanentc opened this issue Nov 9, 2021 · 0 comments

Comments

@sumanentc

We are using GridGain version 8.8.10 and JDK version 1.8.

We have an Ignite cluster with 3 nodes in GCP Kubernetes (GKE), with native persistence enabled. Some of our Ignite pods are going into CrashLoopBackOff with the exception below.

[07:45:45,477][WARNING][main][FileWriteAheadLogManager] Content of WAL working directory needs rearrangement, some WAL segments will be moved to archive: /gridgain/walarchive/node00-71fcf5d3-faf7-4d2b-abae-bd0621bb12a1. Segments from 0000000000000001.wal to 0000000000000008.wal will be moved, total number of files: 8. This operation may take some time.
[07:45:45,480][SEVERE][main][IgniteKernal] Exception during start processors, node will be stopped and close connections
class org.apache.ignite.IgniteCheckedException: Failed to start processor: GridProcessorAdapter []
    at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1938)
    at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1159)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:1787)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1711)
    at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1141)
    at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1059)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:945)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:844)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:714)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:683)
    at org.apache.ignite.Ignition.start(Ignition.java:344)
    at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:290)
Caused by: class org.apache.ignite.internal.processors.cache.persistence.StorageException: Failed to move WAL segment [src=/gridgain/wal/node00-71fcf5d3-faf7-4d2b-abae-bd0621bb12a1/0000000000000001.wal, dst=/gridgain/walarchive/node00-71fcf5d3-faf7-4d2b-abae-bd0621bb12a1/0000000000000001.wal]
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.moveSegmentsToArchive(FileWriteAheadLogManager.java:3326)
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.prepareAndCheckWalFiles(FileWriteAheadLogManager.java:1542)
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.start0(FileWriteAheadLogManager.java:494)
    at org.apache.ignite.internal.processors.cache.GridCacheSharedManagerAdapter.start(GridCacheSharedManagerAdapter.java:60)
    at org.apache.ignite.internal.processors.cache.GridCacheProcessor.start(GridCacheProcessor.java:605)
    at org.apache.ignite.internal.IgniteKernal.startProcessor(IgniteKernal.java:1935)
    ... 11 more
Caused by: java.nio.file.FileAlreadyExistsException: /gridgain/walarchive/node00-71fcf5d3-faf7-4d2b-abae-bd0621bb12a1/0000000000000001.wal
    at java.base/sun.nio.fs.UnixCopyFile.move(UnixCopyFile.java:450)
    at java.base/sun.nio.fs.UnixFileSystemProvider.move(UnixFileSystemProvider.java:267)
    at java.base/java.nio.file.Files.move(Files.java:1422)
    at org.apache.ignite.internal.processors.cache.persistence.wal.FileWriteAheadLogManager.moveSegmentsToArchive(FileWriteAheadLogManager.java:3307)
    ... 16 more
[07:45:45,482][SEVERE][main][IgniteKernal] Got exception while starting (will rollback startup routine).
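
For context, the innermost cause in the trace comes from `java.nio.file.Files.move`, which throws `FileAlreadyExistsException` when the target already exists and `REPLACE_EXISTING` is not passed. The sketch below only reproduces that JDK behavior with temporary directories. The class name `WalMoveDemo`, the helper `plainMoveFails`, and the temp directory names are hypothetical, and this is not a claim about how Ignite should perform the move.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical demo class; not part of Ignite or GridGain.
class WalMoveDemo {
    /** Returns true if a move without options fails because dst already exists. */
    static boolean plainMoveFails(Path src, Path dst) throws IOException {
        try {
            Files.move(src, dst);
            return false;
        } catch (FileAlreadyExistsException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Temp directories stand in for the WAL work and archive directories.
        Path work = Files.createTempDirectory("wal");
        Path archive = Files.createTempDirectory("walarchive");

        // Same segment name in both directories, as in the trace above.
        Path src = Files.write(work.resolve("0000000000000001.wal"), new byte[] {1});
        Path dst = Files.write(archive.resolve("0000000000000001.wal"), new byte[] {2});

        System.out.println("plain move fails: " + plainMoveFails(src, dst));

        // With REPLACE_EXISTING the same move succeeds and overwrites dst.
        Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
        System.out.println("source still exists: " + Files.exists(src));
    }
}
```

So the failure in the trace is consistent with a segment of the same name already being present in the archive directory when the node starts.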

It seems that during WAL archival, Ignite tries to move a segment into the WAL archive where a file with the same name already exists, and it cannot overwrite the existing archive file. We are using separate volume mounts for the WAL and data directories, as described in the documentation:
https://www.gridgain.com/docs/latest/installation-guide/kubernetes/gke-deployment

We are using the configuration below.

Is there any specific configuration for WAL archival that we are missing?
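
To narrow this down, it may help to see which segment names exist in both directories on the affected pod. The following is a rough diagnostic sketch, assuming the two directories from the log are passed as arguments. The class `WalDuplicateCheck` and its methods are hypothetical helpers, not part of Ignite or GridGain.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical diagnostic helper; not part of Ignite or GridGain.
class WalDuplicateCheck {
    /** Sorted names of the *.wal files directly inside the given directory. */
    static Set<String> walNames(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files.map(p -> p.getFileName().toString())
                        .filter(n -> n.endsWith(".wal"))
                        .collect(Collectors.toCollection(TreeSet::new));
        }
    }

    /** Segment names present in both the work and the archive directory. */
    static Set<String> duplicates(Path workDir, Path archiveDir) throws IOException {
        Set<String> both = walNames(workDir);
        both.retainAll(walNames(archiveDir));
        return both;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java WalDuplicateCheck <walDir> <walArchiveDir>
        System.out.println(duplicates(Paths.get(args[0]), Paths.get(args[1])));
    }
}
```

On the failing node this would be run against `/gridgain/wal/node00-71fcf5d3-faf7-4d2b-abae-bd0621bb12a1` and `/gridgain/walarchive/node00-71fcf5d3-faf7-4d2b-abae-bd0621bb12a1`, the two paths shown in the exception.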
