I would like to upload a directory to S3 using the AWS Java SDK v2.

For example, how would I implement the following function?

fun uploadDirectory(bucket: String, prefix: String, directory: Path)

I would like the contents of directory to be replicated at s3://bucket/prefix/ on S3.

The v2 SDK documentation has an example for uploading a single object, but there doesn't seem to be an equivalent of the "Upload a Directory" example from v1.

Adam Millerchip

2 Answers

You can implement it by using the following strategy:

  1. Use Files.walk to walk the directory, identifying all of the files.
  2. Asynchronously upload the files using the SDK, via S3AsyncClient.putObject.
  3. Use CompletableFuture.allOf to combine all of the upload tasks, and wait for completion.

This strategy relies on the async client's default thread pool of 50 threads, and it has worked fine for me with directories that contain thousands of files.

The s3Prefix here is the prefix to add to each object uploaded to the bucket, equivalent to the target directory.

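import software.amazon.awssdk.services.s3.S3AsyncClient
import software.amazon.awssdk.services.s3.model.PutObjectRequest
import software.amazon.awssdk.services.s3.model.PutObjectResponse
import java.nio.file.Files
import java.nio.file.Path
import java.util.concurrent.CompletableFuture
import kotlin.io.path.isDirectory
import kotlin.io.path.isRegularFile
import kotlin.streams.asSequence

// Assumption: a client built with default settings; any configured S3AsyncClient works here.
private val s3AsyncClient: S3AsyncClient = S3AsyncClient.create()

fun uploadDirectory(s3Bucket: String, s3Prefix: String, directory: Path) {
    require(directory.isDirectory()) { "$directory is not a directory" }

    // Start one asynchronous upload per regular file; the stream is closed once all are submitted.
    Files.walk(directory).use { stream ->
        stream.asSequence()
            .filter { it.isRegularFile() }
            .map { path ->
                putObject(
                    s3Bucket = s3Bucket,
                    // Join the relative path's elements with "/" so keys look the same on every OS.
                    s3Key = "$s3Prefix/${directory.relativize(path).joinToString("/")}",
                    path = path
                )
            }
            .toList().toTypedArray()
    }.let { CompletableFuture.allOf(*it) }.join() // Blocks until all uploads finish; throws if any failed.
}

// Uploads a single file to the given bucket and key, returning a future for the response.
private fun putObject(s3Bucket: String, s3Key: String, path: Path)
    : CompletableFuture<PutObjectResponse> {
    val request = PutObjectRequest.builder()
        .bucket(s3Bucket)
        .key(s3Key)
        .build()

    return s3AsyncClient.putObject(request, path)
}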
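
S3 keys use "/" as the separator on every platform, whereas a Path turned into a string uses the platform's own separator, and a prefix may or may not arrive with a trailing slash. The following is a minimal sketch of building the key explicitly so both cases are handled. The name toS3Key is hypothetical and not part of the SDK; it only uses the Kotlin and Java standard libraries.

```kotlin
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical helper (not an AWS SDK function): builds the object key for a
// file located under `directory`, always using "/" as the separator.
fun toS3Key(s3Prefix: String, directory: Path, file: Path): String {
    // A Path is an Iterable of its name elements, so joining them with "/"
    // avoids the platform separator ("\" on Windows).
    val relativeKey = directory.relativize(file).joinToString("/")
    // Drop surrounding slashes so "backups", "backups/" and "/backups" behave the same.
    val prefix = s3Prefix.trim('/')
    return if (prefix.isEmpty()) relativeKey else "$prefix/$relativeKey"
}

fun main() {
    val directory = Paths.get("data", "reports")
    val file = directory.resolve("2021").resolve("summary.csv")
    println(toS3Key("backups/", directory, file)) // backups/2021/summary.csv
    println(toS3Key("", directory, file))         // 2021/summary.csv
}
```

A helper like this could replace the inline string template that builds s3Key, which also keeps the key logic testable without touching S3.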
Adam Millerchip

TransferManager, along with some other high-level libraries, is not yet available in v2, so you will have to use v1 instead. From the migration guide:

High-level libraries, such as the Amazon S3 Transfer Manager and the Amazon SQS Client-side Buffering, are not yet available in version 2.x. See the AWS SDK for Java 2.x changelog for a complete list of libraries.

If your application depends on these libraries, see Using both SDKs side-by-side to learn how to configure your pom.xml to use both 1.x and 2.x. Refer to the AWS SDK for Java 2.x changelog for updates about these libraries.

mightyWOZ
  • There is a [preview](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/transfer-manager.html) of the transfer manager available, but it doesn't support uploading a directory yet. However, it's possible to do it without the transfer manager using asynchronous uploads, see my answer. – Adam Millerchip Sep 04 '21 at 06:19
  • @AdamMillerchip Of course it's possible to iterate over the directory and put objects individually, but you were asking about a solution similar to the `TransferManager`, which, as I stated, is not yet available. – mightyWOZ Sep 04 '21 at 06:27
  • I asked how to write a function that uploads a directory to S3 using the AWS SDK v2, I didn't ask for a specific method. – Adam Millerchip Sep 04 '21 at 06:34
  • OK, so you knew how to upload a single object using the S3 SDK v2, but you didn't know how to iterate over a directory. If that was the case, then you have certainly learned something new about the Java SDK. I am happy for you. – mightyWOZ Sep 04 '21 at 07:36