
I have a simple Alpakka S3 file download on top of Play Framework 2.8; the code looks like this:

in S3 service:

def download(link: String): Source[Option[(Source[ByteString, NotUsed], ObjectMetadata)], NotUsed] = {
    S3.download(TraktrainBucket.DOWNLOAD_BUCKET, link)
}

and in a controller:

          val source = s3Service.download(link).map(s => s.map(_._1))
          val trackName = "track name"
          val filename = trackName.replaceAll("[^A-Za-z0-9 \\-.]", "") + (if (track.drumKit) ".zip" else ".mp3")
          val disposition = "attachment; filename=\"" + filename + "\""
          Result(
            header = ResponseHeader(200, Map("Content-Disposition" -> disposition)),
            body = HttpEntity.Streamed(source.flatMapConcat(_.getOrElse(Source.empty)), None, Some("application/octet-stream"))
          )
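
As a side note, the filename sanitization and Content-Disposition construction above are easy to factor into a small pure helper, which makes them unit-testable. A sketch (the helper names are hypothetical; the regex is the one from the controller above):

```scala
// Hypothetical helper mirroring the controller's sanitization above.
object Disposition {
  // Keep only letters, digits, spaces, hyphens and dots, exactly as the
  // controller's replaceAll does.
  def sanitize(trackName: String): String =
    trackName.replaceAll("[^A-Za-z0-9 \\-.]", "")

  // Build the Content-Disposition value for an attachment download.
  def attachment(trackName: String, isDrumKit: Boolean): String = {
    val filename = sanitize(trackName) + (if (isDrumKit) ".zip" else ".mp3")
    "attachment; filename=\"" + filename + "\""
  }
}
```

Note that this simple quoting scheme only works because the regex has already stripped double quotes from the name.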

I also have an upload endpoint (it takes an mp3 file, processes it with ffmpeg, and uploads it to S3) like this:

def richUploadMp3(extension: String, checkFunction: (String, Option[String]) => Boolean, cannedAcl: CannedAcl, bucket: String) = producerAction(parse.multipartFormData(handleFilePartAsFile)).async { implicit request =>
    val s3Filename = request.user.get.id + "/" + java.util.UUID.randomUUID.toString + "." + extension
    val s3FilenameTagged = request.user.get.id + "/" + java.util.UUID.randomUUID.toString + "." + extension
    val fileOption = request.body.file("file").map {
      case FilePart(key, filename, contentType, file, _, _) =>
        logger.info(s"key = ${key}, filename = ${filename}, contentType = ${contentType}, file = $file")
        if(checkFunction(filename, contentType)) {
          val taggedFile = audioService.putTag(file)
          for {
            mp3 <- FileIO.fromPath(file.toPath).runWith(s3Service.uploadSink(s3Filename, cannedAcl, TraktrainBucket.DOWNLOAD_BUCKET))
            mp3Tagged <- FileIO.fromPath(taggedFile.toPath).runWith(s3Service.uploadSink(s3FilenameTagged, cannedAcl, TraktrainBucket.STREAMING_BUCKET))
          } yield (mp3, mp3Tagged, file, taggedFile)
        } else {
          throw new Exception("Upload failed")
        }
    }
    fileOption match {
      case Some(opt) => opt.map(o => {
        o._3.delete()
        o._4.delete()
        Ok(Json.toJson(Seq(s3Filename, s3FilenameTagged)))
      })
      case _ => Future.successful(BadRequest("ERROR"))
    }
  }
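
One thing to watch in the block above: the temporary files are deleted only on the success path. If either upload future fails, `file` and `taggedFile` are never deleted and accumulate on disk. A minimal sketch of a cleanup wrapper (hypothetical helper; `upload` stands in for the zipped `FileIO ... runWith(...)` futures above):

```scala
import java.io.File
import scala.concurrent.{ExecutionContext, Future}

// Run an upload future and delete the temp files whether it succeeds or
// fails; andThen preserves the original result while adding the side effect.
def uploadAndCleanup[A](upload: => Future[A], tempFiles: Seq[File])
                       (implicit ec: ExecutionContext): Future[A] =
  upload.andThen { case _ => tempFiles.foreach(_.delete()) }
```

The future returned by `andThen` completes with the same value (or failure) as the original, but only after the cleanup callback has run.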

And it works fine for some time, but after about two days it starts throwing this error:

exceeded configured max-open-requests value of 1024

and it doesn't go away; it seems Akka just opens requests and never closes them.

My akka-http conf looks like this:

akka {
  loglevel = DEBUG

  http {
    client {
      connecting-timeout = 5 s
      idle-timeout = 5 s
      parsing {
        max-content-length = 3000m
      }
    }
    server {
      parsing {
        max-content-length = 3000m
      }
    }

    host-connection-pool {
      max-open-requests = 1024
      idle-timeout = 10 s
      client {
        connecting-timeout = 10 s
        idle-timeout = 10 s
      }
    }
  }
}

Sometimes I also see things like this in my logs:

Response stream for [GET /free/642241] failed with 'TCP idle-timeout encountered on connection to [s3.us-west-2.amazonaws.com:443], no bytes passed in the last 10 seconds'. Aborting connection.

What is the problem here with connections not being closed? How can I monitor it? I couldn't find any way to track open requests over time. And how can I fix it?

My alpakka in build.sbt looks like this:

val AkkaVersion = "2.5.31"
val AkkaHttpVersion = "10.1.12"
libraryDependencies ++= Seq(
  "com.lightbend.akka" %% "akka-stream-alpakka-s3" % "2.0.1",
  "com.typesafe.akka" %% "akka-stream" % AkkaVersion,
  "com.typesafe.akka" %% "akka-http" % AkkaHttpVersion,
  "com.typesafe.akka" %% "akka-http-xml" % AkkaHttpVersion
)
  • What resolves it at the moment? Restarting the server? After a restart do you see the `TCP idle-timeout encountered` error? How many such errors do you see until it gets stuck? What else is your machine doing? – Tomer Shetah Aug 06 '20 at 15:40
  • Another thing, on the error you attached, there is a reference to docs, that might explain some of the issues you are experiencing: http://doc.akka.io/docs/akka-http/current/scala/http/client-side/pool-overflow.html – Tomer Shetah Aug 06 '20 at 16:38
  • Yes, only a server restart resolves it. There were about 700 'TCP idle-timeout encountered on connection' errors and about 250 of another kind: Response stream for [GET /free/556601] failed with 'Entity stream truncation. The HTTP parser was receiving an entity when the underlying connection was closed unexpectedly.'. Aborting connection. akka.http.scaladsl.model.EntityStreamException: Entity stream truncation. The HTTP parser was receiving an entity when the underlying connection was closed unexpectedly. That's with a 128-connection pool – cutoffurmind Aug 07 '20 at 09:13
  • It also handles regular Play Framework requests, but it's not a high-load project; we have about 10k users every day, and the download endpoints are called maybe 1-3 times a minute, so it's not under pressure – cutoffurmind Aug 07 '20 at 09:17

1 Answer


Check the settings Play applies to its backend Akka HTTP server. If your requestTimeout is set to infinite, which is the default, change that to a reasonable time limit for your app; an infinite timeout can leave a connection running for a long time when something goes wrong on it. As noted below, these Play configurations override the Akka ones.

https://www.playframework.com/documentation/2.8.x/SettingsAkkaHttp

Note: Akka HTTP has a number of timeout configurations that you can use to protect your application from attacks or programming mistakes. The Akka HTTP server in Play will automatically recognize all of these Akka configurations. For example, if you have idle-timeout and request-timeout configurations like below:

akka.http.server.idle-timeout = 20s
akka.http.server.request-timeout = 30s

they will be automatically recognized. Keep in mind that the Play configurations listed above will override the Akka ones.
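
Concretely, a minimal application.conf along those lines might look like this (illustrative values, not recommendations; the Play key name is from the Play 2.8 server settings docs, so verify it against your version):

```hocon
# Play's own key takes precedence over akka.http.server.idle-timeout
play.server.http.idleTimeout = 60 seconds

# Per the Akka docs, keep request-timeout smaller than the idle timeout
akka.http.server.request-timeout = 30s
```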

Try the netstat -tnp command on Linux to see all currently open TCP connections. If you have a test environment, you can start clean and test one functionality (and one file) at a time to see which one causes long-running connections. Normally, once S3 has finished processing the file, the connection should be idled out by Akka HTTP after 10 seconds or so, per your settings (hence the TCP idle-timeout... message you mentioned in your post). If idling out works as expected for a feature after multiple tries, switch to testing another feature. If this works out well for all of your features that use Akka HTTP, then consider whether max-open-requests is sufficient for your production environment. That depends on how many requests you receive, and how fast you can process them, in a given time period.
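
To watch open requests from inside the app rather than via netstat, one low-tech option is a gauge you bump around every S3-bound call and log or export periodically. This is a sketch, not an akka-http API; for streamed download bodies you would want to decrement when the materialized stream completes (e.g. via `watchTermination`), which this glosses over:

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical in-flight gauge: wrap each S3-bound Future with `track`
// and read `current` periodically to chart open requests over time.
object InFlight {
  private val counter = new AtomicInteger(0)

  def track[A](f: => Future[A])(implicit ec: ExecutionContext): Future[A] = {
    counter.incrementAndGet()
    val fut = f
    fut.onComplete(_ => counter.decrementAndGet()) // release on success or failure
    fut
  }

  def current: Int = counter.get()
}
```

Logging `InFlight.current` once a minute would have shown whether open requests climb steadily (a leak) or spike under load (a capacity problem).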

Also, Akka HTTP is a 'streaming all the way through' library, so it expects the data (HttpEntity) to be consumed before sending more. Sending just a 200 OK is not enough: an unconsumed entity causes back-pressure, which in turn causes new connections to be made until the max-open-requests limit is hit. This could be your situation. Check this out:

https://doc.akka.io/docs/akka-http/current/implications-of-streaming-http-entity.html
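
The mechanism is easy to illustrate without akka-http at all: think of the pool as a fixed number of slots, where a slot is freed only when the response entity is consumed. A toy model (not akka-http's actual implementation) of that accounting:

```scala
// Toy model of a host connection pool: a slot is taken per request and
// released only when the caller consumes the response entity. Dropping
// responses without consuming them leaks slots until the limit is hit --
// the same failure mode described in the linked docs.
final class ToyPool(maxOpenRequests: Int) {
  private var open = 0

  // Returns the "response": invoking the thunk consumes the entity
  // (the discardEntityBytes() analogue) and frees the slot.
  def request(): () => Unit = {
    if (open >= maxOpenRequests)
      throw new IllegalStateException(
        s"exceeded configured max-open-requests value of $maxOpenRequests")
    open += 1
    () => open -= 1
  }

  def openRequests: Int = open
}
```

Consuming every response keeps `openRequests` at zero no matter how many calls are made; never consuming reproduces the error after `maxOpenRequests` calls.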

Common causes of pool overload:

From https://doc.akka.io/docs/akka-http/current/client-side/pool-overflow.html?language=scala

As explained above, the general explanation for pool overload is that the incoming request rate is higher than the request processing rate. This can have all kinds of causes (and hints for fixing them in parentheses):

  • The server is too slow (improve server performance)
  • The network is too slow (improve network performance)
  • The client issues requests too fast (slow down creation of requests if possible)
  • There’s high latency between client and server (use more concurrent connections to hide latency with parallelism)
  • There are peaks in the request rate (prevent peaks by tuning the client application or increase max-open-requests to buffer short-term peaks)
  • Response entities were not read or discarded (see Implications of the streaming nature of Http entities)
  • Some requests are slower than others blocking the connections of a pool for other requests (see below)

The last point may need a bit more explanation. If some requests are much slower than others, e.g. if the request is a long-running Server-Sent Events request, then it will block one of the connections of the pool for a long time. If there are multiple such requests going on at the same time, it will lead to starvation and other requests cannot make any progress any more. Make sure to run a long-running request on a dedicated connection (using the Connection-Level Client-Side API) to prevent such a situation.

Why does this happen only with Akka Http and not with [insert other client]

Many Java HTTP clients don’t set limits by default for some of the resources used. E.g. some clients will never queue a request but will just open another connection to the server if all the pooled connections are currently busy. However, this might just move the problem from the client to the server. Also using an excessive number of connections will lead to worse performance on the network as more connections will compete for bandwidth.

Also review Alpakka's documentation page for the S3 connector:

https://doc.akka.io/docs/alpakka/current/s3.html

Check out timeouts in Akka HTTP

https://doc.akka.io/docs/akka-http/current/common/timeouts.html

You can see detailed descriptions of each of the config options here:

https://doc.akka.io/docs/akka-http/current/configuration.html

For example, here is the description of request-timeout:

# Defines the default time period within which the application has to
# produce an HttpResponse for any given HttpRequest it received.
# The timeout begins to run when the *end* of the request has been
# received, so even potentially long uploads can have a short timeout.
# Set to `infinite` to completely disable request timeout checking.
#
# Make sure this timeout is smaller than the idle-timeout, otherwise,
# the idle-timeout will kick in first and reset the TCP connection
# without a response.
#
# If this setting is not `infinite` the HTTP server layer attaches a
# `Timeout-Access` header to the request, which enables programmatic
# customization of the timeout period and timeout response for each
# request individually.
request-timeout = 20 s
  • thanks, I'll try. Does requestTimeout affect big file downloads? Will it drop any requests over the akka.http.server.request-timeout = 30s config? – cutoffurmind Aug 08 '20 at 21:13
  • @cutoffurmind the clock starts after the `request` is completely received and times out when the time is reached but no `response` has been produced. I updated the answer; see the last link and below in the answer for details – K4M Aug 08 '20 at 21:46
  • ok, trying it, also I put None for content-length here HttpEntity.Streamed(source.flatMapConcat(_.getOrElse(Source.empty)), None, Some("application/octet-stream")). Should I set it or is it fine in terms of akka streaming? – cutoffurmind Aug 10 '20 at 07:32
  • I set request-timeout to 30s and it made things worse; it started to overflow every 2-3 hours – cutoffurmind Aug 11 '20 at 04:40
  • @cutoffurmind you should provide the content-length, if known. https://www.playframework.com/documentation/2.7.0-M4/api/java/play/http/HttpEntity.Streamed.html – K4M Aug 12 '20 at 16:18
  • @cutoffurmind I would suggest to change one thing a time to troubleshoot this. If you change multiple things at once, it would be harder to diagnose – K4M Aug 12 '20 at 16:19
  • I gave up debugging it; I just rewrote it using the AWS Java SDK and it works fine now – cutoffurmind Aug 24 '20 at 15:58