Basic Problem:

We've been experiencing very strange behaviour in our current infrastructure setup:

  • download speed from Amazon S3 drops to <2 KB/s (after ~10 downloads at perfectly normal speed) when the file is downloaded from the same IP/machine it was uploaded from
  • on our other machines we can download the file a couple of thousand times and don't see this behaviour

Additional details:

  • the machines are set up identically using Puppet
  • they are all virtual machines running Ubuntu 12.04.4 on KVM with libvirtd, on Ubuntu 12.04.4 and 13.04 hosts
  • each VM has its own public IP, from which the traffic originates
  • after anywhere from a few minutes to a few hours, the file can be downloaded at >5 MB/s again, but only a few times (it seems to be 10 times)
  • files are uploaded from Rails applications using the fog gem

Tests with wget:

Using wget, you see this output on the affected machines for a file we uploaded:

--2014-07-31 16:33:38--  http://s3-eu-west-1.amazonaws.com/not_the_real_file_url
Resolving s3-eu-west-1.amazonaws.com (s3-eu-west-1.amazonaws.com)... 178.236.6.160
Connecting to s3-eu-west-1.amazonaws.com (s3-eu-west-1.amazonaws.com)|178.236.6.160|:80...      connected.
HTTP request sent, awaiting response... 200 OK
Length: 2801149 (2.7M) [text/plain]
Saving to: `/dev/null'

0% [                                ] 10,111      1.05K/s  eta 68m 26s

And it stays like this for 68 minutes! (The download does finish after that time, though.)

And this is the output for a random file hosted on Amazon S3 by somebody else:

--2014-07-31 16:39:21--  https://s3.amazonaws.com/Minecraft.Download/versions/14w31a/minecraft_server.14w31a.jar
Resolving s3.amazonaws.com (s3.amazonaws.com)... 72.21.211.199
Connecting to s3.amazonaws.com (s3.amazonaws.com)|72.21.211.199|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10342238 (9.9M) [application/octet-stream]
Saving to: `/dev/null'

32% [====================================>    ] 3,370,945    747K/s  eta 12s
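
For anyone who wants to reproduce the "fast for about 10 downloads, then slow" pattern without watching wget by hand, here is a minimal Python sketch of a repeated-download test. It is not the script we used; the function names (`timed_download`, `summarize_runs`, `run_test`) and the output format are assumptions, loosely modelled on the RUN/AVG/LAST_RUN log line further down.

```python
import time
import urllib.request


def summarize_runs(durations):
    """Return (average, last) for a list of per-download durations in seconds."""
    if not durations:
        raise ValueError("need at least one run")
    return sum(durations) / len(durations), durations[-1]


def throughput_kb_per_s(num_bytes, seconds):
    """Convert a byte count and a duration into KB/s."""
    return num_bytes / 1024 / seconds


def timed_download(url, chunk_size=64 * 1024):
    """Download url, discard the body, and return (bytes_read, seconds)."""
    start = time.monotonic()
    total = 0
    with urllib.request.urlopen(url) as resp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.monotonic() - start


def run_test(url, runs=20):
    """Download the same URL repeatedly and print timing after each run."""
    durations = []
    for i in range(1, runs + 1):
        size, secs = timed_download(url)
        durations.append(secs)
        avg, last = summarize_runs(durations)
        rate = throughput_kb_per_s(size, secs)
        print(f"RUN[{i}] AVG: {avg:.3f} s, LAST_RUN: {last:.3f} s ({rate:.1f} KB/s)")
```

Calling `run_test()` with the URL of a file uploaded from the same machine should make the drop visible as a sudden jump in LAST_RUN, assuming the behaviour is the same as described above.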

Our current workaround:

Our current workaround is to use our HAProxy as a transparent HTTP proxy.

This means we have a frontend "cloud.example.com" defined, and a backend that first replaces the request's Host header with "s3-eu-west-1.amazonaws.com" and then uses s3-eu-west-1.amazonaws.com:80 as its server. To Amazon, the request then appears to come from our proxy, and we can download the files we stored on S3 thousands of times again. :)
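
To make the rewrite step concrete, here is a small Python model of what the backend does to each request. This is only an illustration of the Host replacement described above, not HAProxy configuration; `to_upstream` is a hypothetical helper, and the host and port are the ones named in this post.

```python
from urllib.parse import urlsplit

# Upstream named in the post; everything else here is illustrative.
S3_HOST = "s3-eu-west-1.amazonaws.com"
S3_PORT = 80


def to_upstream(public_url, headers):
    """Map a request for the proxy frontend to the request sent to S3.

    Returns ((host, port), path, headers): the path and query are kept,
    and the Host header is replaced with the S3 endpoint.
    """
    parts = urlsplit(public_url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Drop any existing Host header (case-insensitively), then set the new one.
    upstream_headers = {k: v for k, v in headers.items() if k.lower() != "host"}
    upstream_headers["Host"] = S3_HOST
    return (S3_HOST, S3_PORT), path, upstream_headers
```

The client only ever talks to cloud.example.com; the connection that S3 sees originates from the proxy's IP rather than from the VM that uploaded the file.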

[2014-07-31 16:56:57 +0200] RUN[28] AVG: '0.9612743812142854' s, LAST_RUN: '0.711118431' s
--2014-07-31 16:56:57--  https://cloud.example.com/not_the_real_file_url
Resolving cloud.example.com (cloud.example.com)... 1.2.3.4
Connecting to cloud.example.com (cloud.example.com)|1.2.3.4|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2801149 (2.7M) [text/plain]
Saving to: `/dev/null'

100%[====================>] 2,801,149   2.47M/s   in 1.1s
pangdudu
  • Sounds kind of like some evilness on the part of your ISP :/ – Undo Jul 31 '14 at 15:07
  • We're hosted at Hetzner (a fairly large German provider). I'll ask them if they can explain the behaviour. We haven't had these kinds of problems with our machines there until now. – pangdudu Jul 31 '14 at 15:50
  • Ok, I've checked with our provider now. They ruled out that it's a problem on their side. I guess it must have something to do with our server (Ubuntu) setup. Disabling ufw didn't solve anything. I'll post if I finally find the reason for this weird issue. – pangdudu Sep 15 '14 at 16:00

1 Answer

Ok, solved it.

I'm still researching why this solved the issue, but here is what fixed it now:

As I described above, the behaviour occurs on an Ubuntu 12.04.5 KVM guest running on an Ubuntu 12.04.4 KVM host. Today I checked whether we use different kernels (linux-image-*) on the guests (which can still happen, since we're not provisioning them with Puppet yet).

On KVM-guests where we have the strange <5 KB/s S3 download behaviour, we're using:

  • Linux 3.8.0-44-generic

On KVM-guests with >5 MB/s S3 download speed, we're using:

  • Linux 3.2.0-68-virtual (actually, any *-virtual kernel will solve this)
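
If you want to check your own guests quickly, here is a minimal Python sketch that looks at the flavour suffix of the running kernel. The helper names are made up for this example, and it assumes the Ubuntu-style release strings shown above (for example "3.2.0-68-virtual"), where the flavour is the part after the last dash.

```python
import platform


def kernel_flavour(release):
    """Return the trailing flavour of a kernel release string, e.g. 'virtual'."""
    return release.rsplit("-", 1)[-1]


def is_virtual_kernel(release=None):
    """True if the given (or currently running) kernel is a *-virtual flavour."""
    if release is None:
        release = platform.release()  # same string that `uname -r` prints
    return kernel_flavour(release) == "virtual"
```

Running `is_virtual_kernel()` on each guest tells you which ones are still on a non-virtual flavour such as *-generic.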

Hope this helps if you run into the same issue. I'll post more once I truly understand why this happens.

Of course, you should use a *-virtual kernel on a VM guest, I know. What still confuses me, though, is why only S3 downloads are slow.

pangdudu