Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid abseil containers with SeqManager compare functions #1369

Merged
merged 9 commits into from
Apr 9, 2024

Conversation

ibc
Copy link
Member

@ibc ibc commented Apr 9, 2024

Fixes #1366

Details

  • As demonstrated in temporal PR Add a test to check issue #1366 and use Debug mode #1368, standard C++ STD containers do not throw (even in debug mode) when using Compare functions in maps/sets that do not honor transitivity, i.e. comp(a,b) && comp(b,c) -> comp(a,c).
  • So let's not use abseil containers in those cases.

Bonus tracks

  • Dupicate CI actions in debug mode.
  • Fix an amazing bug in AudioLevelObserver.cpp which failed to compile because it uses a absl::btree_multimap without including the absl/container/btree_map.h header (it didn't fail before due to some absl header included by yet anothe included file, etc).

Fixes #1366

### Details

- As demonstrated in temporal PR #1368, standard C++ STD containers do not throw (even in debug mode) when using `Compare` functions in maps/sets that do not honor transitivity, i.e. `comp(a,b) && comp(b,c) -> comp(a,c)`.
- So let's not use abseil containers in those cases.

### Bonus tracks

- Dupicate CI actions in debug mode.
- Make mediasoup Rust building honor `MEDIASOUP_BUILDTYPE` env variable if given.
- Fix an amazing bug in `AudioLevelObserver.cpp` which failed to compile because it uses a `absl::btree_multimap` without including the `absl/container/btree_map.h` header (it didn't fail before due to some absl header included by yet anothe included file, etc).
@ibc ibc requested review from jmillan and nazar-pc April 9, 2024 09:33
@ibc ibc changed the title Avoid abseil containers with SeqManager compared functions Avoid abseil containers with SeqManager compare functions Apr 9, 2024
@ibc
Copy link
Member Author

ibc commented Apr 9, 2024

UPDATE: Fixed here: 3b23846

OMG... https://github.com/versatica/mediasoup/actions/runs/8613704210/job/23605569292?pr=1369

I'm reproducing it in local:

  • In worker/:
MEDIASOUP_BUILDTYPE=Debug make
  • Add { logLevel: 'warn' } to mediasoup.createWorker() in test-WebRtcServer.ts at the top of the file.

  • Run failing test:

MEDIASOUP_BUILDTYPE=Debug DEBUG="*mediasoup*" npx jest --testPathPattern node/src/test/test-WebRtcServer.ts

console.info
mediasoup:Channel Consumer Channel ended by the worker process +13ms

  at Logger.debug (node_modules/debug/src/common.js:113:10)

console.error
mediasoup:ERROR:Worker (stderr) [../../../subprojects/abseil-cpp-20230802.0/absl/container/internal/raw_hash_set.h : 1170] RAW: operator++ called on invalid iterator. The element might have been erased or the table might have rehashed. Consider running with --config=asan to diagnose rehashing issues. +0ms

  495 | 			for (const line of buffer.toString('utf8').split('\n')) {
  496 | 				if (line) {
> 497 | 					workerLogger.error(`(stderr) ${line}`);
      | 					             ^
  498 | 				}
  499 | 			}
  500 | 		});

  at Logger.debug (node_modules/debug/src/common.js:113:10)
  at Socket.<anonymous> (node/src/Worker.ts:497:19)

console.error
mediasoup:ERROR:Worker worker process died unexpectedly [pid:55327, code:null, signal:SIGABRT] +0ms

@ibc
Copy link
Member Author

ibc commented Apr 9, 2024

UPDATE: Fixed here: 3b23846

Ok, here the coredump:

Exception Type:        EXC_CRASH (SIGABRT)
Exception Codes:       0x0000000000000000, 0x0000000000000000

Termination Reason:    Namespace SIGNAL, Code 6 Abort trap: 6
Terminating Process:   mediasoup-worker [57642]

Application Specific Information:
abort() called


Thread 0 Crashed::  Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	    0x7ff815ee01e2 __pthread_kill + 10
1   libsystem_pthread.dylib       	    0x7ff815f17ee6 pthread_kill + 263
2   libsystem_c.dylib             	    0x7ff815e3eb45 abort + 123
3   mediasoup-worker              	       0x10ab5b8a6 absl::lts_20230802::raw_log_internal::(anonymous namespace)::RawLogVA(absl::lts_20230802::LogSeverity, char const*, int, char const*, __va_list_tag*) + 438 (raw_logging.cc:192)
4   mediasoup-worker              	       0x10ab5b6c4 absl::lts_20230802::raw_log_internal::RawLog(absl::lts_20230802::LogSeverity, char const*, int, char const*, ...) + 164 (raw_logging.cc:254)
5   mediasoup-worker              	       0x10ab5b93e absl::lts_20230802::raw_log_internal::(anonymous namespace)::DefaultInternalLog(absl::lts_20230802::LogSeverity, char const*, int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 94 (raw_logging.cc:202)
6   mediasoup-worker              	       0x10a8e58ce void absl::lts_20230802::base_internal::AtomicHook<void (*)(absl::lts_20230802::LogSeverity, char const*, int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&)>::operator()<absl::lts_20230802::LogSeverity, char const* const&, int, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>(absl::lts_20230802::LogSeverity&&, char const* const&, int&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>&&) const + 62 (atomic_hook.h:117)
7   mediasoup-worker              	       0x10a8e5801 absl::lts_20230802::container_internal::AssertIsFull(absl::lts_20230802::container_internal::ctrl_t const*, unsigned char, unsigned char const*, char const*) + 673 (raw_hash_set.h:1165)
8   mediasoup-worker              	       0x10aa8eba5 absl::lts_20230802::container_internal::raw_hash_set<absl::lts_20230802::container_internal::FlatHashSetPolicy<RTC::WebRtcTransport*>, absl::lts_20230802::container_internal::HashEq<RTC::WebRtcTransport*, void>::Hash, absl::lts_20230802::container_internal::HashEq<RTC::WebRtcTransport*, void>::Eq, std::__1::allocator<RTC::WebRtcTransport*>>::iterator::operator++() + 69 (raw_hash_set.h:1604)
9   mediasoup-worker              	       0x10aa8e99e RTC::WebRtcServer::~WebRtcServer() + 494 (WebRtcServer.cpp:247)
10  mediasoup-worker              	       0x10aa8ec85 RTC::WebRtcServer::~WebRtcServer() + 21 (WebRtcServer.cpp:232)
11  mediasoup-worker              	       0x10aa8ed29 RTC::WebRtcServer::~WebRtcServer() + 25 (WebRtcServer.cpp:232)
12  mediasoup-worker              	       0x10a8fd816 Worker::HandleRequest(Channel::ChannelRequest*) + 2054 (Worker.cpp:360)
13  mediasoup-worker              	       0x10a9260a9 Channel::ChannelSocket::OnConsumerSocketMessage(Channel::ConsumerSocket*, char*, unsigned long) + 153 (ChannelSocket.cpp:287)
14  mediasoup-worker              	       0x10a9265e1 Channel::ConsumerSocket::UserOnUnixStreamRead() + 161 (ChannelSocket.cpp:377)
15  mediasoup-worker              	       0x10a91c8ad UnixStreamSocketHandle::OnUvRead(long, uv_buf_t const*) + 77 (UnixStreamSocketHandle.cpp:363)
16  mediasoup-worker              	       0x10a91bafb onRead(uv_stream_s*, long, uv_buf_t const*) + 59 (UnixStreamSocketHandle.cpp:33)
17  mediasoup-worker              	       0x10aeea6ed uv__read + 1501 (stream.c:1143)
18  mediasoup-worker              	       0x10aee6aa3 uv__stream_io + 339 (stream.c:1203)
19  mediasoup-worker              	       0x10aef1761 uv__io_poll + 3729 (kqueue.c:385)
20  mediasoup-worker              	       0x10aed57f6 uv_run + 326 (core.c:448)
21  mediasoup-worker              	       0x10a8e7db5 DepLibUV::RunLoop() + 117 (DepLibUV.cpp:98)
22  mediasoup-worker              	       0x10a8fb01e Worker::Worker(Channel::ChannelSocket*) + 638 (Worker.cpp:56)
23  mediasoup-worker              	       0x10a8fb3cd Worker::Worker(Channel::ChannelSocket*) + 29 (Worker.cpp:21)
24  mediasoup-worker              	       0x10a8de6b5 mediasoup_worker_run + 1685 (lib.cpp:142)
25  mediasoup-worker              	       0x10ab5a0ff main + 239 (main.cpp:25)
26  dyld                          	    0x7ff815bbe41f start + 1903

It is in WebRtcServer destructor, I've added logs:

WebRtcServer::~WebRtcServer()
{
	MS_TRACE();
	MS_DUMP_STD("------ 1");

	this->shared->channelMessageRegistrator->UnregisterHandler(this->id);

	MS_DUMP_STD("------ 2");
	for (auto& item : this->udpSocketOrTcpServers)
	{
		delete item.udpSocket;
		item.udpSocket = nullptr;

		delete item.tcpServer;
		item.tcpServer = nullptr;
	}
	this->udpSocketOrTcpServers.clear();

	MS_DUMP_STD("------ 3");
	for (auto* webRtcTransport : this->webRtcTransports)
	{
		MS_DUMP_STD("------ 3.1");
		webRtcTransport->ListenServerClosed();
		MS_DUMP_STD("------ 3.2");
	}
	MS_DUMP_STD("------ 4");
	this->webRtcTransports.clear();

	MS_DUMP_STD("------ 5");
}
  console.info
      mediasoup:Worker (stdout) RTC::WebRtcServer::~WebRtcServer() | ------ 1 +0ms

      at Logger.debug (node_modules/debug/src/common.js:113:10)

  console.info
      mediasoup:Worker (stdout) RTC::WebRtcServer::~WebRtcServer() | ------ 2 +1ms

      at Logger.debug (node_modules/debug/src/common.js:113:10)

  console.info
      mediasoup:Worker (stdout) RTC::WebRtcServer::~WebRtcServer() | ------ 3 +0ms

      at Logger.debug (node_modules/debug/src/common.js:113:10)

  console.info
      mediasoup:Worker (stdout) RTC::WebRtcServer::~WebRtcServer() | ------ 3.1 +1ms

      at Logger.debug (node_modules/debug/src/common.js:113:10)

  console.info
      mediasoup:Worker (stdout) RTC::WebRtcServer::~WebRtcServer() | ------ 3.2 +0ms

      at Logger.debug (node_modules/debug/src/common.js:113:10)

  console.error
      mediasoup:ERROR:Worker (stderr) [../../../subprojects/abseil-cpp-20230802.0/absl/container/internal/raw_hash_set.h : 1170] RAW: operator++ called on invalid iterator. The element might have been erased or the table might have rehashed. Consider running with --config=asan to diagnose rehashing issues. +0ms

@ibc
Copy link
Member Author

ibc commented Apr 9, 2024

UPDATE: Fixed here: 3b23846

Ok, I've added more logs and I see what it's happening:

      mediasoup:Worker (stdout) RTC::WebRtcServer::~WebRtcServer() | ------ ~WebRtcServer 1 +45ms

      at Logger.debug (node_modules/debug/src/common.js:113:10)

  console.info
      mediasoup:Worker (stdout) RTC::WebRtcServer::~WebRtcServer() | ------ ~WebRtcServer 2 +0ms

      at Logger.debug (node_modules/debug/src/common.js:113:10)

  console.info
      mediasoup:Worker (stdout) RTC::WebRtcServer::~WebRtcServer() | ------ ~WebRtcServer 3 +0ms

      at Logger.debug (node_modules/debug/src/common.js:113:10)

  console.info
      mediasoup:Worker (stdout) RTC::WebRtcServer::~WebRtcServer() | ------ ~WebRtcServer 3.1 +1ms

      at Logger.debug (node_modules/debug/src/common.js:113:10)

  console.info
      mediasoup:Worker (stdout) RTC::WebRtcServer::OnWebRtcTransportClosed() | ------ OnWebRtcTransportClosed +0ms

      at Logger.debug (node_modules/debug/src/common.js:113:10)

  console.info
      mediasoup:Worker (stdout) RTC::WebRtcServer::~WebRtcServer() | ------ ~WebRtcServer 3.2 +1ms

      at Logger.debug (node_modules/debug/src/common.js:113:10)

  console.error
      mediasoup:ERROR:Worker (stderr) [../../../subprojects/abseil-cpp-20230802.0/absl/container/internal/raw_hash_set.h : 1170] RAW: operator++ called on invalid iterator. The element might have been erased or the table might have rehashed. Consider running with --config=asan to diagnose rehashing issues. +0ms

      495 | 			for (const line of buffer.toString('utf8').split('\n')) {
      496 | 				if (line) {
    > 497 | 					workerLogger.error(`(stderr) ${line}`);
          | 					             ^
      498 | 				}
      499 | 			}
      500 | 		});

      at Logger.debug (node_modules/debug/src/common.js:113:10)
      at Socket.<anonymous> (node/src/Worker.ts:497:19)

  console.error
      mediasoup:ERROR:Worker worker process died unexpectedly [pid:60740, code:null, signal:SIGABRT] +0ms
  • The WebRtcServer destructor iterates all WebRtcTransports in this->webRtcTransports set and calls webRtcTransport->ListenServerClosed() on them.
  • That method ends calling Router::OnTransportListenServerClosed() which calls delete transport.
  • The WebRtcTransport destructor ends calling this->webRtcTransportListener->OnWebRtcTransportClosed(this) which deletes the transport from the webRtcTransports set in WebRtcServer.
  • Then, the WebRtcServer destructor keeps iterating this->webRtcTransports (operator++ invoked internally) but at that point the set has already been modified!

@ibc
Copy link
Member Author

ibc commented Apr 9, 2024

I've fixed the bug here: 3b23846

@ibc ibc merged commit c0e3a62 into v3 Apr 9, 2024
51 checks passed
@ibc ibc deleted the do-not-use-abseil-with-seqmanager-compare branch April 9, 2024 13:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

SeqManager<T>::SeqLowerThan() crashes when used as compare function in abseil set or map (only in debug mode)
2 participants