CREATE query core dump #1548

Open
dmthuc opened this issue Apr 4, 2024 · 8 comments
Labels: bug (Something isn't working), can not reproduce
dmthuc (Collaborator) commented Apr 4, 2024

2024.04.03 12:04:23.270525 [ 74 ] {} <Trace> BaseDaemon: Received signal 11
2024.04.03 12:04:23.270761 [ 1222 ] {} <Fatal> BaseDaemon: ########################################
2024.04.03 12:04:23.270800 [ 1222 ] {} <Fatal> BaseDaemon: (version 21.8.7.1 scm1.0.0.0, build id: 302B9E5F6FD9144E) (from thread 315) (no query) Received signal Segmentation fault (11)
2024.04.03 12:04:23.270820 [ 1222 ] {} <Fatal> BaseDaemon: Address: NULL pointer. Access: read. Address not mapped to object.
2024.04.03 12:04:23.270831 [ 1222 ] {} <Fatal> BaseDaemon: Stack trace: 0x2768b1d0 0x22e31294 0x22e3040b 0x22e2fcee 0x22e3067e 0x22e30762 0x1ffb6d39 0x1ffa5ba0 0x107f1231 0x107f2d48 0x107ed940 0x107f1dba 0x7f6b02235fa3 0x7f6b00d3f4cf
2024.04.03 12:04:23.279664 [ 1222 ] {} <Fatal> BaseDaemon: 3. memcpy @ 0x2768b1d0 in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.287962 [ 1222 ] {} <Fatal> BaseDaemon: 4. DB::(anonymous namespace)::writeQueryAroundTheError(DB::WriteBuffer&, char const*, char const*, bool, DB::Token const*, unsigned long) @ 0x22e31294 in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.296241 [ 1222 ] {} <Fatal> BaseDaemon: 5. DB::(anonymous namespace)::getSyntaxErrorMessage(char const*, char const*, DB::Token, DB::Expected const&, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) @ 0x22e3040b in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.304460 [ 1222 ] {} <Fatal> BaseDaemon: 6. DB::tryParseQuery(DB::IParser&, char const*&, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, unsigned long, unsigned long) @ 0x22e2fcee in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.312696 [ 1222 ] {} <Fatal> BaseDaemon: 7. DB::parseQueryAndMovePosition(DB::IParser&, char const*&, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, unsigned long, unsigned long) @ 0x22e3067e in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.320890 [ 1222 ] {} <Fatal> BaseDaemon: 8. DB::parseQuery(DB::IParser&, char const*, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, unsigned long) @ 0x22e30762 in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.329130 [ 1222 ] {} <Fatal> BaseDaemon: 9. DB::CnchWorkerResource::executeCreateQuery(std::__1::shared_ptr<DB::Context>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, DB::ColumnsDescription const&) @ 0x1ffb6d39 in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.337435 [ 1222 ] {} <Fatal> BaseDaemon: 10. DB::CnchWorkerServiceImpl::sendResources(google::protobuf::RpcController*, DB::Protos::SendResourcesReq const*, DB::Protos::SendResourcesResp*, google::protobuf::Closure*)::$_13::operator()() const @ 0x1ffa5ba0 in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.345758 [ 1222 ] {} <Fatal> BaseDaemon: 11. ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) @ 0x107f1231 in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.354096 [ 1222 ] {} <Fatal> BaseDaemon: 12. ThreadFromGlobalPool::ThreadFromGlobalPool<void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()>(void&&, void ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()&&...)::'lambda'()::operator()() @ 0x107f2d48 in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.362422 [ 1222 ] {} <Fatal> BaseDaemon: 13. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x107ed940 in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.370717 [ 1222 ] {} <Fatal> BaseDaemon: 14. void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda0'()> >(void*) @ 0x107f1dba in /opt/byconity/bin/clickhouse
2024.04.03 12:04:23.370750 [ 1222 ] {} <Fatal> BaseDaemon: 15. start_thread @ 0x7fa3 in /lib/x86_64-linux-gnu/libpthread-2.28.so
2024.04.03 12:04:23.370779 [ 1222 ] {} <Fatal> BaseDaemon: 16. __clone @ 0xf94cf in /lib/x86_64-linux-gnu/libc-2.28.so
2024.04.03 12:04:23.516800 [ 1222 ] {} <Fatal> BaseDaemon: Calculated checksum of the binary: F7C4732B3FC960092DCE81DBCD4AF821. There is no information about the reference checksum.
2024.04.03 12:04:44.690316 [ 74 ] {} <Trace> BaseDaemon: Received signal -1
2024.04.03 12:04:44.690341 [ 74 ] {} <Fatal> BaseDaemon: (version 21.8.7.1 scm 1.0.0.0, build id: 302B9E5F6FD9144E) (from thread 315) (no query) generate core minidump path: /var/log/byconity//a66814d6-3460-42b1-7a230ab8-789f1154.dmp

Relevant log:

2024.04.03 12:04:23.270226 [ 315 ] {} <Debug> WorkerResource: start create cloud table CREATE TABLE test_gipxk2.people_448825934011957253 UUID '46b8dd72-92b4-43af-81c8-63fd655d346c'
(
    `id` Int32,
    `name` String,
    `company_id` Int32,
    `state_id` Int32,
    `city_id` Int32
)
ENGINE = CloudMergeTree(test_gipxk2, people)
PARTITION BY id
PRIMARY KEY id
ORDER BY id
SETTINGS index_granularity = 8192, storage_policy = 'cnch_default_hdfs', cnch_temporary_table = 1
SETTINGS load_balancing = 'random', distributed_aggregation_memory_efficient = 0, log_queries = 1, distributed_product_mode = 'global', join_use_nulls = 0, max_execution_time = 180, log_comment = '40007_outer_join_to_inner_join.sql', send_logs_level = 'warning', data_type_default_nullable = 0, slow_query_ms = 0, max_rows_to_schedule_merge = 500000000, total_rows_to_schedule_merge = 0, strict_rows_to_schedule_merge = 50000000, enable_merge_scheduler = 0, cnch_max_cached_storage = 50000, enable_optimizer = 1, enable_optimizer_fallback = 0, exchange_timeout_ms = 300000, bsp_mode = 1

#0  0x000000002768b1d0 in memcpy ()
[Current thread is 78949 (LWP 315)]
(gdb) bt
#0  0x000000002768b1d0 in memcpy ()
#1  0x0000000022e31294 in DB::(anonymous namespace)::writeQueryAroundTheError(DB::WriteBuffer&, char const*, char const*, bool, DB::Token const*, unsigned long) ()
#2  0x0000000022e3040b in DB::(anonymous namespace)::getSyntaxErrorMessage(char const*, char const*, DB::Token, DB::Expected const&, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) ()
#3  0x0000000022e2fcee in DB::tryParseQuery(DB::IParser&, char const*&, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, unsigned long, unsigned long) ()
#4  0x0000000022e3067e in DB::parseQueryAndMovePosition(DB::IParser&, char const*&, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, unsigned long, unsigned long) ()
#5  0x0000000022e30762 in DB::parseQuery(DB::IParser&, char const*, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, unsigned long) ()
#6  0x000000001ffb6d39 in DB::CnchWorkerResource::executeCreateQuery(std::__1::shared_ptr<DB::Context>, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool, DB::ColumnsDescription const&) ()
#7  0x000000001ffa5ba0 in DB::CnchWorkerServiceImpl::sendResources(google::protobuf::RpcController*, DB::Protos::SendResourcesReq const*, DB::Protos::SendResourcesResp*, google::protobuf::Closure*)::$_13::operator()() const ()
#8  0x00000000107f1231 in ThreadPoolImpl<ThreadFromGlobalPool>::worker(std::__1::__list_iterator<ThreadFromGlobalPool, void*>) ()
#9  0x00000000107f2d48 in ThreadFromGlobalPool::ThreadFromGlobalPool<ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::{lambda()#2}>(ThreadPoolImpl<ThreadFromGlobalPool>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::{lambda()#2}&&)::{lambda()#1}::operator()() ()
#10 0x00000000107ed940 in ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) ()
#11 0x00000000107f1dba in void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::{lambda()#2}> >(void*) ()
#12 0x00007f6b02235fa3 in allocate_stack (stack=<synthetic pointer>, pdp=<synthetic pointer>, attr=0x7f6a8a9fe9ee) at allocatestack.c:511
#13 __pthread_create_2_1 (newthread=<optimized out>, attr=<optimized out>, start_routine=<optimized out>,
    arg=<optimized out>) at pthread_create.c:660
Backtrace stopped: Cannot access memory at address 0x8
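
For context, here is a minimal, hypothetical sketch (not the actual ByConity/ClickHouse source) of how a helper like writeQueryAroundTheError could hit the "Address: NULL pointer. Access: read" memcpy crash shown above: if the failing token's position pointer is null (e.g. the lexer produced no usable token), an unguarded memcpy reads from an address at or near 0. The function name, parameters, and guard below are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical reconstruction: copy a window of the query text around the
// error position into an output string, as an error-message helper might.
// Without the null check, a missing token position makes `from` a pointer
// at/near address 0, and the memcpy segfaults exactly as in the backtrace.
std::string query_around_error(const char* begin, const char* end,
                               const char* token_begin, size_t window)
{
    // Guard that would avoid the crash: fall back to the start of the
    // query when the token position is absent.
    if (token_begin == nullptr)
        token_begin = begin;

    const char* from = (token_begin > begin + window) ? token_begin - window : begin;
    size_t len = static_cast<size_t>(end - from);

    std::string out(len, '\0');
    memcpy(&out[0], from, len);  // unguarded, `from` could be a null-derived pointer
    return out;
}
```

With the guard, a null token position degrades to returning the query from its start instead of dereferencing NULL; the real fix would depend on why tryParseQuery reached the error-reporting path with an invalid token in the first place.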
dmthuc added the “bug” label Apr 4, 2024
nudles (Collaborator) commented Apr 5, 2024

Hi @dmthuc, do you mean that the following SQL causes the error?

CREATE TABLE test_gipxk2.people_448825934011957253 UUID '46b8dd72-92b4-43af-81c8-63fd655d346c'
(
    `id` Int32,
    `name` String,
    `company_id` Int32,
    `state_id` Int32,
    `city_id` Int32
)
ENGINE = CloudMergeTree(test_gipxk2, people)
PARTITION BY id
PRIMARY KEY id
ORDER BY id
SETTINGS index_granularity = 8192, storage_policy = 'cnch_default_hdfs', cnch_temporary_table = 1
SETTINGS load_balancing = 'random', distributed_aggregation_memory_efficient = 0, log_queries = 1, distributed_product_mode = 'global', join_use_nulls = 0, max_execution_time = 180, log_comment = '40007_outer_join_to_inner_join.sql', send_logs_level = 'warning', data_type_default_nullable = 0, slow_query_ms = 0, max_rows_to_schedule_merge = 500000000, total_rows_to_schedule_merge = 0, strict_rows_to_schedule_merge = 50000000, enable_merge_scheduler = 0, cnch_max_cached_storage = 50000, enable_optimizer = 1, enable_optimizer_fallback = 0, exchange_timeout_ms = 300000, bsp_mode = 1

dmthuc (Collaborator, Author) commented Apr 5, 2024

Hi @dmthuc, do you mean that the following SQL causes the error?

[same CREATE TABLE statement as quoted above]

Yeah.

nudles (Collaborator) commented Apr 8, 2024

@dmthuc, do you have the original SQL? Is the SQL above executed on the workers? I am trying to reproduce the error.

dmthuc (Collaborator, Author) commented Apr 9, 2024

@dmthuc, do you have the original SQL? Is the SQL above executed on the workers? I am trying to reproduce the error.

Yes, the original SQL is in the test case
tests/queries/4_cnch_stateless/40007_outer_join_to_inner_join.sql:

select x.p_id, x.p_comp_id, x.p_c_id, x.p_s_id, x.comp_id, y.s_id, y.c_id, y.c_s_id
from (
    select p.id as p_id, p.company_id as p_comp_id, p.city_id as p_c_id, p.state_id as p_s_id, comp.id as comp_id
    from company comp
    left join people p on p.company_id = comp.id
) x
join (
    select s.id as s_id, c.id as c_id, c.state_id as c_s_id
    from state s
    full outer join city c on s.id = c.state_id
) y on x.p_s_id = y.s_id and x.p_c_id = y.c_id
order by x.p_id, x.p_comp_id, x.p_c_id, x.p_s_id, x.comp_id, y.s_id, y.c_id, y.c_s_id;

nudles (Collaborator) commented Apr 9, 2024

@dmthuc can you reproduce it?
I just tried it. No errors...

dmthuc (Collaborator, Author) commented Apr 9, 2024

@dmthuc can you reproduce it? I just tried it. No errors...

No, I have only encountered it once.

jenrryyou (Contributor) commented:
@dmthuc, I can't reproduce it. I see the setting bsp_mode = 1; do we run CI with BSP mode enabled?

dmthuc (Collaborator, Author) commented Apr 22, 2024

@dmthuc, I can't reproduce it. I see the setting bsp_mode = 1; do we run CI with BSP mode enabled?

Yeah.

Projects: none yet
Development: no branches or pull requests
5 participants