Periodic peer disconnect and idling data transfer intervals #1261

Open
kirillsc opened this issue Feb 12, 2024 · 3 comments
Comments

@kirillsc

Hi All,

I am observing periodic interruptions in data transfer among peers that are using rtorrent. I am not sure if this is a bug, but I would appreciate it if you could help me understand the issue.

I am using rtorrent for file distribution within a private network, specifically the private subnets of a VPC on AWS. I run my own open-source bittorrent-tracker (https://github.com/webtorrent/bittorrent-tracker). My use case consists of a single seeding server that needs to distribute a ~50GB folder among ~300 machines within the same private network. The seed server (1) creates the torrent file from the local folder and (2) shares this torrent file with all peers; then (3) each peer adds the torrent file to the /rtorrent/watch/start/ folder and connects to the same tracker server (within the same network). All ~300 peers start downloading at about the same time.

The problem that I am repeatedly observing is prolonged periods of complete idling among all peers in the network. For the first ~5 minutes everything works as expected: the seed server uploads at its maximum network bandwidth, and all the peers receive pieces and also distribute chunks among themselves. After those initial ~5 minutes, all peers disconnect from each other and idle for 5-10 minutes. Eventually the transfer resumes and lasts for another few minutes, only to disconnect again. Two important observations: (1) if I manually restart the source seed server during such an idle period, all transfers resume for another few minutes; (2) the problem does not appear to be related only to the seed server, because during the initial period of data transfer (before the first idle period) many peers manage to get the complete torrent file, yet none of them share the data with the remaining peers during the idle period.

I am attaching log files from the seed server and one of the peer servers from one of my smaller scale experiments where I used only 20 peers.

In the log file I am seeing the following messages, but I am not sure about their relevance or how to debug them further. 



Handshake dropped: seeder rejected.
Received error: message:7 network error.
Upload unchoked slots adjust; currently:10 adjust:1

I am using rTorrent v0.9.8 on RHEL 8.

I would appreciate any guidance on what the issue could be here.

Thank you.
server-log.log
client-log.log

@kannibalox
Contributor

Did everything work as expected in the smaller test? If not, would you happen to have a log from a peer that didn't successfully get past the stall? Can you share your config?

Just to break down the log messages you mentioned a bit:

  • Handshake dropped: seeder rejected: This can happen in two ways, one of which is only possible when using magnet links. The other is when rTorrent receives a connection from a seeder when it's also a seeder, so this message seems pretty harmless.
  • Received error: message:7 network error.: Unfortunately this can refer to a couple of different kinds of network errors, and the master branch has more specific strings. There are plenty of reasons this could happen during normal operation, so it's good to have a timeline, but otherwise it doesn't tell much on its own.
  • Upload unchoked slots adjust; currently:10 adjust:1: These are messages from the internal resource manager. By themselves, they're just informational messages telling you how many unchoked peer connections are active. Depending on your settings, it's unlikely but possible that rTorrent is clearing connections unnecessarily.
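
Since a timeline is the main thing these messages are good for, here is a minimal Python sketch of how one might bucket them per minute. It assumes each log line starts with an integer Unix timestamp, which should be checked against the actual log files; the function name `message_timeline` and the sample lines are made up for illustration and are not part of rTorrent.

```python
import re
from collections import Counter, defaultdict

# The three messages quoted in this thread, matched as plain substrings.
PATTERNS = (
    "Handshake dropped: seeder rejected",
    "network error",
    "Upload unchoked slots adjust",
)

# ASSUMPTION: every log line begins with an integer Unix timestamp
# followed by whitespace. Adjust the regex if the real logs differ.
_LEADING_TS = re.compile(r"^(\d+)\s")


def message_timeline(lines, bucket_seconds=60):
    """Count occurrences of each pattern per time bucket.

    Returns {bucket_start_timestamp: Counter({pattern: count})}.
    Lines without a leading integer timestamp are skipped.
    """
    timeline = defaultdict(Counter)
    for line in lines:
        match = _LEADING_TS.match(line)
        if match is None:
            continue
        ts = int(match.group(1))
        bucket = ts - ts % bucket_seconds  # round down to the bucket start
        for pattern in PATTERNS:
            if pattern in line:
                timeline[bucket][pattern] += 1
    return dict(timeline)


if __name__ == "__main__":
    # Made-up sample lines, only to show the call shape.
    sample = [
        "120 Handshake dropped: seeder rejected.",
        "125 Received error: message:7 network error.",
        "185 Upload unchoked slots adjust; currently:10 adjust:1",
    ]
    for bucket, counts in sorted(message_timeline(sample).items()):
        print(bucket, dict(counts))
```

Comparing the buckets from the seed server and a peer side by side may show whether the network errors cluster right before each stall or are spread evenly across the run.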

One funky thing I see in the logs that I don't think is normal is that within the space of a second, rtorrent is starting an outgoing connection, receiving an incoming connection from the same host, then declaring that both connections received a network error. It's possible there's some weird race condition that happens in low-latency networks. I assume all the clients are currently receiving the torrent at essentially the same time. Would it be possible to try staggering the start across the servers?
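
To check how often that crossed-connection pattern shows up, something like the following Python sketch could help. It assumes the connection events have already been extracted from the log into `(timestamp, host, direction)` tuples; that tuple layout and the name `find_crossed_connections` are hypothetical, not something rTorrent produces directly.

```python
from collections import defaultdict


def find_crossed_connections(events, window=1.0):
    """Find hosts with an outgoing and an incoming connection close in time.

    `events` is an iterable of (timestamp, host, direction) tuples, where
    direction must be "out" or "in". This layout is an assumption for the
    sketch; the log lines have to be parsed into it first.

    Returns {host: [(t_out, t_in), ...]} for every pair of outgoing and
    incoming connections to the same host at most `window` seconds apart.
    """
    by_host = defaultdict(lambda: {"out": [], "in": []})
    for ts, host, direction in events:
        by_host[host][direction].append(ts)

    crossed = {}
    for host, times in by_host.items():
        pairs = [
            (t_out, t_in)
            for t_out in times["out"]
            for t_in in times["in"]
            if abs(t_out - t_in) <= window
        ]
        if pairs:
            crossed[host] = pairs
    return crossed
```

If most hosts that hit a network error also appear in the result, that would support the race-condition guess; if almost none do, the cause is probably elsewhere.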

@kirillsc
Author

kirillsc commented Feb 14, 2024

Hi @kannibalox

Thanks for the quick response and explanation of the messages!

I was able to reproduce this issue using a single seeding server and a single client. I am attaching both logs and the configuration that was used. In this experiment the client experienced a stall less than a minute after it started downloading the file.

To answer your last question, I am already spreading startup times across a 20-second interval; however, the objective is to distribute files as fast as possible. I can artificially slow the process further (say by 1-2 minutes), but the issue is still present in the smallest-scale tests.
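
For reference, the spreading described above can be sketched in Python as an even distribution of start delays over a window. This is only an illustration under the assumption that each peer waits its delay before the torrent file is placed in its watch folder; `stagger_delays` is a hypothetical helper, not the script actually used.

```python
def stagger_delays(n_peers, window_seconds):
    """Return one start delay in seconds per peer, spread evenly over the window.

    Peer 0 starts immediately and each following peer starts one step later,
    so the last peer starts one step before the window ends.
    """
    if n_peers <= 0:
        return []
    step = window_seconds / n_peers
    return [i * step for i in range(n_peers)]
```

With 20 peers and a 20-second window this gives one peer per second; widening the window to 120 seconds would give one peer every 6 seconds.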

Also, I don't want to divert this conversation from the original topic, but I have also observed several times a case where a client shuts down halfway through downloading a file. I have observed this when the rtorrent client was launched as a detached daemon process. I am attaching this log file as client_error2.log in case it makes sense to you.

Let me know if I can provide any other debug information.

Thank you.
seed_server1.log
client1.log
config.log
client_error2.log

@kannibalox
Contributor

Hm, 20 seconds would be enough to prevent the behavior I was thinking of, and there's not anything else obvious in the 1-on-1 logs. My interest is sufficiently piqued that I may see if I can replicate it. Are there any other noteworthy details about your setup?

As for client_error2.log, that looks like a normal shutdown procedure. Those can be triggered by SIGINT or SIGHUP, or by RPC calls, see https://rtorrent-docs.readthedocs.io/en/latest/cmd-ref.html#term-system-shutdown-normal for more information. If rtorrent encountered an error it couldn't handle or something, it would have just crashed hard instead.
