[Bug]: Operation timeout with creation of 6+ shards #2743
Comments
Assuming you are in a Linux environment, could you attempt to reproduce this in a Windows environment and let me know your results? @nikita-petko
@DeclanFrampton these were all performed on Windows machines.
That's that idea out the window then. What are the system specs? (Probably not the issue, but it's always good to have more info.) When you did a test with a different token for a single server, did you use the same project? Also, have you made any changes to the bot around the same time the issue began?
24 cores, 32 GiB of physical memory, Windows Server 2019 Datacenter, 10 GbE networking.
Okay, since I can't debug this myself (I don't have 15k guilds to reproduce this), could you set up a new solution and use the same token? Try it both sharded and non-sharded and see if you still get the same issues. The bot doesn't need any features, just a fresh standalone build to test with. If you still get the issue, I'll bring it up in the Discord to see if we can get the priority raised on this issue.
@DeclanFrampton this is now the error that is encountered despite sharding being enabled.
I have discovered the final issue: the reason it failed to connect is that there were not enough shards. I have set it up to automatically fetch the shard count now. Thank you for your help.
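For anyone hitting the same connect failure, a minimal sketch of what "automatically fetch the shard count" can look like with Discord.Net is below. This is an illustration under assumptions, not the poster's actual code: the `BOT_TOKEN` environment variable and the chosen intents are placeholders. Leaving `TotalShards` as `null` (the default) makes `DiscordShardedClient` query the gateway for the recommended shard count on login instead of using a hard-coded value.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Discord;
using Discord.WebSocket;

class Program
{
    static async Task Main()
    {
        var config = new DiscordSocketConfig
        {
            // TotalShards = null lets DiscordShardedClient ask Discord
            // for the recommended shard count rather than pinning one.
            TotalShards = null,
            GatewayIntents = GatewayIntents.Guilds, // placeholder; use what your bot needs
        };

        var client = new DiscordShardedClient(config);
        client.Log += msg => { Console.WriteLine(msg); return Task.CompletedTask; };

        // BOT_TOKEN is a placeholder environment variable for this sketch.
        await client.LoginAsync(TokenType.Bot, Environment.GetEnvironmentVariable("BOT_TOKEN"));
        await client.StartAsync();
        await Task.Delay(Timeout.Infinite);
    }
}
```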
Reopening, as a new error has been encountered; once it occurs, it will continuously throw this error. It may be occurring in the GuildDownloader, but it also happens in my other test. After the 6th or 7th shard it will always throw this error.
I can confirm that this issue is persistent when the shard count is greater than or equal to 6. Once I get some free time later today I will take a deeper dive into this, unless someone else gets there first.
Conducted some additional testing; it seems the shard that gets timed out usually reconnects directly after. I always get the exception with 7 shards and over. Can you confirm this is the case for you?
The shard will just continue to time out, but will try to reconnect. However, it causes all the other shards to fail.
Ok, I've asked a maintainer to see if he could also take a look. Meanwhile, could you provide a full stack trace of the exception from a debug build? Thanks.
@DeclanFrampton this is the full exception on a debug build.
* Temp fix for discord-net/Discord.Net#2743
Check The Docs
Verify Issue Source
Check your intents
Description
Note: Please view edit history to see the original purpose of this issue.
If you create a sharded client with 6 or more shards, at around the 6th shard, all shards are prevented from running. The bot will continue to receive dispatches but will be unable to actually process the events (debug logging notes the dispatches but handlers are not being invoked).
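A minimal standalone reproduction under the same assumptions (Discord.Net 3.11.0, placeholder `BOT_TOKEN` environment variable) would be to pin the shard count at or above the reported threshold and watch the debug log:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Discord;
using Discord.WebSocket;

class Repro
{
    static async Task Main()
    {
        var config = new DiscordSocketConfig
        {
            TotalShards = 7,              // the issue reportedly appears at 6+ shards
            LogLevel = LogSeverity.Debug, // debug output shows dispatches arriving
        };

        var client = new DiscordShardedClient(config);
        client.Log += msg => { Console.WriteLine(msg); return Task.CompletedTask; };
        client.ShardReady += shard =>
        {
            Console.WriteLine($"Shard {shard.ShardId} ready");
            return Task.CompletedTask;
        };

        await client.LoginAsync(TokenType.Bot, Environment.GetEnvironmentVariable("BOT_TOKEN"));
        await client.StartAsync();
        await Task.Delay(Timeout.Infinite);
    }
}
```

If the report holds, shards past the 6th would time out during startup while the debug log continues to record incoming dispatches without the corresponding event handlers firing.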
This may relate to one of my old issues: #2126
Version
3.11.0
Working Version
No response
Logs
Sample
Instantiation
Packages
N/A