TiKV panics repeatedly with "[region 16697056] 19604003 applying snapshot failed" after the TiKV instance is taken down for 20 minutes and then recovered #16958

Closed
Lily2025 opened this issue May 7, 2024 · 1 comment

Lily2025 commented May 7, 2024

Bug Report

What version of TiKV are you using?

./tikv-server -V
TiKV
Release Version: 8.1.0
Edition: Community
Git Commit Hash: 56613f7
Git Commit Branch: HEAD
UTC Build Time: 2024-04-30 06:17:25
Rust Version: rustc 1.77.0-nightly (89e2160c4 2023-12-27)
Enable Features: pprof-fp jemalloc mem-profiling portable sse test-engine-kv-rocksdb test-engine-raft-raft-engine trace-async-tasks openssl-vendored
Profile: dist_release
2024-05-01T03:12:03.865+0800

What operating system and CPU are you using?

8 cores / 32 GB memory

Steps to reproduce

1. Run a TPC-C workload.
2. Take one of the TiKV instances down, then recover it after 20 minutes.

What did you expect?

No panic.

What happened?

After the TiKV instance recovered from the outage, it panicked repeatedly:
2024-05-01 10:57:45
{"pod":"tc-tikv-1","log":"[lib.rs:477] [\"[region 16697056] 19604003 applying snapshot failed\"] [backtrace=\" 0: tikv_util::set_panic_hook::{{closure}}\\n at workspace/source/tikv/components/tikv_util/src/lib.rs:476:18\\n 1: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2029:9\\n std::panicking::rust_panic_with_hook\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:783:13\\n 2: std::panicking::begin_panic_handler::{{closure}}\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:657:13\\n 3: std::sys_common::backtrace::__rust_end_short_backtrace\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:171:18\\n 4: rust_begin_unwind\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:645:5\\n 5: core::panicking::panic_fmt\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs:72:14\\n 6: raftstore::store::peer_storage::PeerStorage<EK,ER>::check_applying_snap\\n at workspace/source/tikv/components/raftstore/src/store/peer_storage.rs:799:21\\n 7: raftstore::store::peer::Peer<EK,ER>::check_snap_status\\n at workspace/source/tikv/components/raftstore/src/store/peer.rs:2530:15\\n raftstore::store::peer::Peer<EK,ER>::handle_raft_ready_append\\n at workspace/source/tikv/components/raftstore/src/store/peer.rs:2650:13\\n 8: raftstore::store::fsm::peer::PeerFsmDelegate<EK,ER,T>::collect_ready\\n at workspace/source/tikv/components/raftstore/src/store/fsm/peer.rs:2132:19\\n <raftstore::store::fsm::store::RaftPoller<EK,ER,T> as 
batch_system::batch::PollHandler<raftstore::store::fsm::peer::PeerFsm<EK,ER>,raftstore::store::fsm::store::StoreFsm<EK>>>::handle_normal\\n at workspace/source/tikv/components/raftstore/src/store/fsm/store.rs:1090:13\\n 9: batch_system::batch::Poller<N,C,Handler>::poll\\n at workspace/source/tikv/components/batch-system/src/batch.rs:380:27\\n 10: raftstore::store::worker::refresh_config::PoolController<N,C,H>::increase_by::{{closure}}\\n at workspace/source/tikv/components/raftstore/src/store/worker/refresh_config.rs:84:21\\n <std::thread::Builder as tikv_util::sys::thread::StdThreadBuildWrapper>::spawn_wrapper::{{closure}}\\n at workspace/source/tikv/components/tikv_util/src/sys/thread.rs:438:13\\n std::sys_common::backtrace::__rust_begin_short_backtrace\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:155:18\\n 11: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:529:17\\n <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:272:9\\n std::panicking::try::do_call\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:552:40\\n std::panicking::try\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:516:19\\n std::panic::catch_unwind\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:142:14\\n std::thread::Builder::spawn_unchecked_::{{closure}}\\n at 
root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:528:30\\n core::ops::function::FnOnce::call_once{{vtable.shim}}\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5\\n 12: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2015:9\\n <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2015:9\\n std::sys::unix::thread::Thread::new::thread_start\\n at root/.rustup/toolchains/nightly-2023-12-28-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/unix/thread.rs:108:17\\n 13: start_thread\\n 14: clone\\n\"] [location=/workspace/source/tikv/components/raftstore/src/store/peer_storage.rs:799] [thread_name=raftstore-1-0] [thread_id=112]","container":"tikv","namespace":"ha-test-tikv-failure-rpt-tps-7574381-1-208","level":"FATAL"}
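
For triage across many such log lines, the region in the panic message can be pulled out with a small script. This is only a sketch: the function name `parse_snapshot_panic` is hypothetical, the pattern is inferred from the single message above, and reading the second number as a peer ID is an assumption rather than something this report states.

```python
import re

# Pattern inferred from the one panic message in this report:
#   "[region <region_id>] <second_id> applying snapshot failed"
# Treating <second_id> as a peer ID is an assumption.
PANIC_RE = re.compile(r"\[region (\d+)\] (\d+) applying snapshot failed")


def parse_snapshot_panic(message):
    """Return (region_id, second_id) as ints, or None if the message does not match."""
    match = PANIC_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "[region 16697056] 19604003 applying snapshot failed"
print(parse_snapshot_panic(sample))  # (16697056, 19604003)
```

Because the pattern uses `search`, it also matches when the message is embedded in the full JSON log line with escaped quotes around it.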

Lily2025 added the type/bug (Type: Issue - Confirmed a bug) label May 7, 2024
Lily2025 changed the title from 'tikv panic repeatedly with "[region 16697056] 19604003 applying snapshot failed" after down this tikv last for 20mins and recover' to 'tikv panic repeatedly with "[region 16697056] 19604003 applying snapshot failed" after down this tikv for 20mins and recover' May 7, 2024
Contributor

LykxSassinator commented May 13, 2024

Duplicate of #15292. I've opened PR #16992 for this issue, but it has not been fully tested. Anyone who hits this issue again can try that PR to check whether it resolves the problem.
