Slow uploads with high latency #603

Open
erikdubbelboer opened this issue Nov 30, 2014 · 2 comments · May be fixed by #613

Comments

@erikdubbelboer
Contributor

I have noticed that uploads to Disco get slower as the latency gets higher.

Since it's normal HTTP over TCP, I was wondering whether Disco does something special when reading from the TCP socket that could slow things down at higher latency. Or does Disco somehow make the TCP receive buffer very small, resulting in a lot of packet round trips?

The tests below were done using curl to make sure Python wasn't the bottleneck. I originally found this problem while using the ddfs tool.

This is a 100MB /dev/urandom file being transferred from a server in Salt Lake City to a server in Singapore. First I upload the file to nginx to show which speed is reachable. Then I upload the file to Disco:

$ curl -v -X POST -d @random.bin 'http://singapore-dev-1:80' > out.log
* Connected to singapore-dev-1 (119.81.66.224) port 80 (#0)
> POST / HTTP/1.1
> User-Agent: curl/7.35.0
> Host: singapore-dev-1
> Accept: */*
> Content-Length: 51983412
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
} [data not shown]
 84 49.5M    0     0   84 41.8M      0  6354k  0:00:07  0:00:06  0:00:01 8341k
< HTTP/1.1 200 OK
* Server nginx is not blacklisted
< Server: nginx
< Date: Sun, 30 Nov 2014 07:18:03 GMT
< Content-Type: text/html
< Transfer-Encoding: chunked
< Connection: close
< X-Powered-By: PHP/5.3.3
<
{ [data not shown]
100 49.5M    0     5  100 49.5M      0  6389k  0:00:07  0:00:07 --:--:-- 10.3M
* Closing connection 0

As you can see, it reaches 10.3MB/s by the end of the transfer (the average upload speed is 6389k).

But when I upload the same random file to Disco, I only get 0.2MB/s:

$ curl -v -X PUT -d @random.bin 'http://singapore-dev-1:8990/ddfs/test1$589-25437-55e0c' > out.log
* Connected to singapore-dev-1 (119.81.66.224) port 8990 (#0)
> PUT /ddfs/test1$589-25437-55e0c HTTP/1.1
> User-Agent: curl/7.35.0
> Host: singapore-dev-1:8990
> Accept: */*
> Content-Length: 51983412
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
* Server MochiWeb/1.0 (Any of you quaids got a smint?) is not blacklisted
< Server: MochiWeb/1.0 (Any of you quaids got a smint?)
< Date: Fri, 28 Nov 2014 05:31:25 GMT
} [data not shown]
100 49.5M    0     0  100 49.5M      0  40932  0:21:09  0:21:09 --:--:-- 0.3M
< HTTP/1.1 201 Created
* Server MochiWeb/1.0 (Any of you quaids got a smint?) is not blacklisted
< Server: MochiWeb/1.0 (Any of you quaids got a smint?)
< Date: Fri, 28 Nov 2014 05:52:35 GMT
< content-type: application/json
< Content-Length: 65
<
{ [data not shown]
100 49.5M    0    65  100 49.5M      0  40918  0:21:10  0:21:10 --:--:-- 0.2M
$
$ cat out.log
"disco://singapore-dev-1/ddfs/vol0/blob/43/test1$589-25437-55e0c"

The latency between the servers is as follows:

$ ping singapore-dev-1
PING singapore-dev-1 (119.81.66.224) 56(84) bytes of data.
64 bytes from singapore-dev-1 (119.81.66.224): icmp_seq=1 ttl=50 time=209 ms
...
64 bytes from singapore-dev-1 (119.81.66.224): icmp_seq=251 ttl=50 time=209 ms
--- singapore-dev-1 ping statistics ---
251 packets transmitted, 251 received, 0% packet loss, time 250119ms
rtt min/avg/max/mdev = 208.663/209.109/209.523/0.366 ms

When I do exactly the same from the local host (or from a different server in the same datacenter), I get a very high speed again (meaning Disco is not always slow):

$ curl -v -X PUT -d @random.bin 'http://singapore-dev-1:8990/ddfs/test2$589-26115-21b9a' > out.log
* About to connect() to singapore-dev-1 port 8990 (#0)
*   Trying 119.81.66.224... connected
* Connected to singapore-dev-1 (119.81.66.224) port 8990 (#0)
> PUT /ddfs/test2$589-26115-21b9a HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.16.1 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: singapore-dev-1:8990
> Accept: */*
> Content-Length: 51983412
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
< HTTP/1.1 100 Continue
< Server: MochiWeb/1.0 (Any of you quaids got a smint?)
< Date: Fri, 28 Nov 2014 06:25:41 GMT
} [data not shown]
< HTTP/1.1 201 Created
< Server: MochiWeb/1.0 (Any of you quaids got a smint?)
< Date: Fri, 28 Nov 2014 06:25:42 GMT
< content-type: application/json
< Content-Length: 65
<
{ [data not shown]
100 49.5M    0    65  100 49.5M     95  72.9M --:--:-- --:--:-- --:--:-- 76.5M
$
$ cat out.log
"disco://singapore-dev-1/ddfs/vol0/blob/a3/test2$589-26115-21b9a"

Both logs show that the upload succeeded normally:

2014-11-27 23:31:25.912 [info] <11312.166.0> PUT BLOB: "/ddfs/test1$589-25437-55e0c" ("51983412" bytes) on 'disco_8989_slave@singapore-dev-1'
2014-11-27 23:52:35.854 [info] <11312.166.0> PUT BLOB done with "/ddfs/test1$589-25437-55e0c" (51983412) on 'disco_8989_slave@singapore-dev-1'
...
2014-11-28 00:25:41.448 [info] <11312.52.0> PUT BLOB: "/ddfs/test2$589-26115-21b9a" ("51983412" bytes) on 'disco_8989_slave@singapore-dev-1'
2014-11-28 00:25:41.655 [info] <11312.52.0> PUT BLOB done with "/ddfs/test2$589-26115-21b9a" (51983412) on 'disco_8989_slave@singapore-dev-1'

When I try the same with an upload from a server in Amsterdam to a server in London, I get a slower upload speed again. This time it is not as slow, because the latency between the servers is lower:

$ curl -v -X PUT -d @random.bin 'http://london-2:8990/ddfs/test1$589-4f976-1c96' > out.log
* About to connect() to london-2 port 8990 (#0)
> PUT /ddfs/test1$589-4f976-1c96 HTTP/1.1
> User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> Host: london-2:8990
> Accept: */*
> Content-Length: 25957455
> Content-Type: application/x-www-form-urlencoded
> Expect: 100-continue
>
< HTTP/1.1 100 Continue
} [data not shown]
 98 24.7M    0     0   98 24.3M      0   896k  0:00:28  0:00:27  0:00:01  0.9K
< HTTP/1.1 201 Created
< Server: MochiWeb/1.0 (Any of you quaids got a smint?)
< Date: Sun, 30 Nov 2014 05:41:25 GMT
< content-type: application/json
< Content-Length: 57
<
{ [data not shown]
100 24.7M  100    57  100 24.7M      1   888k  0:00:57  0:00:28  0:00:29  0.9M
* Connection #0 to host london-2 left intact
* Closing connection #0
$
$ ping london-2
PING london-2 (37.130.227.148) 56(84) bytes of data.
64 bytes from 2582e394.rdns.100tb.com (37.130.227.148): icmp_req=1 ttl=55 time=6.44 ms
...
64 bytes from 2582e394.rdns.100tb.com (37.130.227.148): icmp_req=35 ttl=55 time=6.31 ms
--- london-2 ping statistics ---
35 packets transmitted, 35 received, 0% packet loss, time 34003ms
rtt min/avg/max/mdev = 6.216/6.356/6.547/0.139 ms
@erikdubbelboer
Contributor Author

I just tested nginx as a proxy in front of the DDFS_PUT_PORT (without any nginx buffering). This way I get the full 10M/s speed again. For now this solves my problem, but I would prefer this issue to be fixed in Disco.

@jgrnt

jgrnt commented Mar 5, 2015

I did some investigation on this issue, as it seems really limiting.
I set up local network shaping with http://github.com/lostcolony/damocles and could confirm the behavior. The very small and fixed TCP window size is problematic when the RTT is that high.
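
To put rough numbers on that: a window-limited TCP connection can move at most one window of data per round trip, so throughput is bounded by window / RTT. The sketch below plugs in the curl and ping figures reported earlier in this issue and works backwards to the amount of data that must have been in flight per round trip. It assumes curl's "k" suffix means 1024 bytes, and the function names are made up for illustration. Both links come out in the single-digit KiB range, which would be consistent with a small fixed window, while matching the nginx average at 209 ms would need a window on the order of a megabyte.

```python
# Figures copied from the curl and ping output earlier in this issue.
# Assumption: curl's "k" suffix means 1024 bytes.
MEASUREMENTS = {
    "Salt Lake City -> Singapore (Disco)": {"bytes_per_s": 40918, "rtt_s": 0.209109},
    "Amsterdam -> London (Disco)": {"bytes_per_s": 888 * 1024, "rtt_s": 0.006356},
}


def implied_window_bytes(bytes_per_s, rtt_s):
    """Bytes that must be in flight per round trip to sustain this throughput."""
    return bytes_per_s * rtt_s


def window_limited_throughput(window_bytes, rtt_s):
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes / rtt_s


for name, m in MEASUREMENTS.items():
    window = implied_window_bytes(m["bytes_per_s"], m["rtt_s"])
    print(f"{name}: ~{window / 1024:.1f} KiB in flight per round trip")

# Window needed to match the nginx average (6389k) over the 209 ms link.
nginx_bytes_per_s = 6389 * 1024
needed = implied_window_bytes(nginx_bytes_per_s, 0.209109)
print(f"Window needed for nginx speed at 209 ms: ~{needed / 2**20:.2f} MiB")
```

This is only back-of-the-envelope arithmetic. It ignores slow start and packet loss, so it shows that the numbers fit a window limit, not that a window limit is the only possible cause.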

One has to change the socket options in mochiweb to get a comparable speed to a low latency link.
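
As a generic illustration of what "changing the socket options" means at the OS level, here is a minimal Python sketch that requests a larger receive buffer on a listening socket and reads back what the kernel actually granted. This is not mochiweb code and not necessarily what the linked pull request does; `make_listener` and the 1 MiB figure are placeholders chosen for the example. Erlang's `inet` module exposes a `recbuf` option for the same kernel setting.

```python
import socket


def make_listener(recv_buffer_bytes, host="127.0.0.1", port=0):
    """Create a listening TCP socket with a requested receive buffer size.

    Hypothetical helper for illustration only.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request the buffer size before listen(), so that accepted
    # connections inherit it and can advertise a matching TCP window.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, recv_buffer_bytes)
    sock.bind((host, port))  # port 0 lets the OS pick a free port
    sock.listen()
    return sock


listener = make_listener(1 << 20)  # ask for 1 MiB (placeholder value)
effective = listener.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
# The kernel may adjust or cap the request, so report what was granted.
print(f"requested {1 << 20} bytes, kernel reports {effective} bytes")
listener.close()
```

On Linux the granted value is capped by `net.core.rmem_max`, and setting `SO_RCVBUF` explicitly turns off receive-buffer autotuning for that socket, so a fixed value that is too small can hurt as much as one that is large enough helps.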

@jgrnt jgrnt linked a pull request Mar 16, 2015 that will close this issue