
I have a bucket in S3 with the following structure and contents:

javaFolderA/
└── javaFolderB/
    └── javaFile.tmp
consoleFolderA/
└── consoleFolderB/
    └── consoleFile.tmp

The java* folders and file were uploaded via the Java SDK:

final File file = new File("C:\\javaFolderA\\javaFolderB\\javaFile.tmp");
client.putObject("testbucket", "javaFolderA/javaFolderB/javaFile.tmp", file);

The console* folders and file were created and uploaded from the web console (clicking the "+ Create folder" button for each folder, then uploading the file with public read permissions).

In the web console, after clicking to create a new folder, the following message is shown:

When you create a folder, S3 console creates an object with the above name appended by suffix "/" and that object is displayed as a folder in the S3 console.

So, as expected, with the folders and files above, we get 3 objects created in the bucket with the following keys:

  • consoleFolderA/
  • consoleFolderA/consoleFolderB/
  • consoleFolderA/consoleFolderB/consoleFile.tmp

The result of the SDK upload is a single object with the key javaFolderA/javaFolderB/javaFile.tmp. This makes sense, as we are only putting a single object, not three. However, it leads to inconsistencies when listing the contents of the bucket: even though there is only one actual file in each directory, listing the contents returns multiple objects in the console scenario, because the zero-byte 'folder' objects are returned alongside the file.
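To make the difference concrete, here is a minimal in-memory sketch of how a prefix-and-delimiter listing treats the four keys in this bucket. It does not call S3 or the SDK; the class `ListingSketch` and its `list` method are hypothetical names used only for illustration, and the grouping rule (keys containing the delimiter after the prefix roll up into a common prefix, everything else is returned as an object) is written from my understanding of how ListObjects behaves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, in-memory stand-in for a ListObjects call with a prefix and delimiter.
class ListingSketch {
    final List<String> objectKeys = new ArrayList<>();        // keys returned as objects
    final Set<String> commonPrefixes = new LinkedHashSet<>(); // keys rolled up into 'folders'

    static ListingSketch list(List<String> bucketKeys, String prefix, String delimiter) {
        ListingSketch result = new ListingSketch();
        for (String key : bucketKeys) {
            if (!key.startsWith(prefix)) {
                continue; // outside the requested prefix
            }
            int idx = key.indexOf(delimiter, prefix.length());
            if (idx >= 0) {
                // Delimiter found after the prefix: report only the rolled-up prefix.
                result.commonPrefixes.add(key.substring(0, idx + delimiter.length()));
            } else {
                // No further delimiter: the key itself is returned. A folder marker
                // whose key equals the prefix lands here too.
                result.objectKeys.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "consoleFolderA/",
                "consoleFolderA/consoleFolderB/",
                "consoleFolderA/consoleFolderB/consoleFile.tmp",
                "javaFolderA/javaFolderB/javaFile.tmp");
        // Console scenario: the marker object and the file are both returned.
        System.out.println(list(keys, "consoleFolderA/consoleFolderB/", "/").objectKeys.size());
        // SDK scenario: only the file is returned.
        System.out.println(list(keys, "javaFolderA/javaFolderB/", "/").objectKeys.size());
    }
}
```

Listing with an empty prefix gives the same two common prefixes (consoleFolderA/ and javaFolderA/) for both scenarios, which is why they look identical at the top level and only diverge once you list inside a 'folder'.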

My question is: why is this the case, and how can I achieve consistent behavior? There doesn't seem to be a way to "upload a directory" via the SDK (in quotes because I know there aren't actually folders/directories).

From the CLI I can verify the number of objects and their keys:

C:\Users\avojak>aws s3api list-objects --bucket testbucket
{
    "Contents": [
        {
            "LastModified": "2018-01-02T22:43:55.000Z",
            "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
            "StorageClass": "STANDARD",
            "Key": "consoleFolderA/",
            "Owner": {
                "DisplayName": "foo.bar",
                "ID": "2c401638471162eda7a3b48e41dfb9261d9022b56ce6b00c0ecf544b3e99ca93"
            },
            "Size": 0
        },
        {
            "LastModified": "2018-01-02T22:44:02.000Z",
            "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
            "StorageClass": "STANDARD",
            "Key": "consoleFolderA/consoleFolderB/",
            "Owner": {
                "DisplayName": "foo.bar",
                "ID": "2c401638471162eda7a3b48e41dfb9261d9022b56ce6b00c0ecf544b3e99ca93"
            },
            "Size": 0
        },
        {
            "LastModified": "2018-01-02T22:44:16.000Z",
            "ETag": "\"968fe74fc49094990b0b5c42fc94de19\"",
            "StorageClass": "STANDARD",
            "Key": "consoleFolderA/consoleFolderB/consoleFile.tmp",
            "Owner": {
                "DisplayName": "foo.bar",
                "ID": "2c401638471162eda7a3b48e41dfb9261d9022b56ce6b00c0ecf544b3e99ca93"
            },
            "Size": 69014
        },
        {
            "LastModified": "2018-01-02T22:53:13.000Z",
            "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
            "StorageClass": "STANDARD",
            "Key": "javaFolderA/javaFolderB/javaFile.tmp",
            "Owner": {
                "DisplayName": "foo.bar",
                "ID": "2c401638471162eda7a3b48e41dfb9261d9022b56ce6b00c0ecf544b3e99ca93"
            },
            "Size": 0
        }
    ]
}
avojak

1 Answer


If you prefer the console implementation then you need to emulate it. That means that your SDK client needs to create the intermediate 'folders', when necessary. You can do this by creating zero-sized objects whose key ends in forward-slash (if that's your 'folder' separator).
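As a rough sketch of that emulation, the helper below only computes which 'folder' marker keys an object key implies; it does not talk to S3. The class `FolderMarkers` and the method `markerKeysFor` are hypothetical names, and "/" is assumed to be the separator. With the SDK used in the question, each returned key could then be uploaded as a zero-byte object, for example through a `putObject` call that takes an empty input stream and metadata with a content length of 0 (not shown here, and worth checking against the SDK documentation).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: derives the console-style 'folder' marker keys for an object key.
class FolderMarkers {
    // Returns every prefix of objectKey that ends in "/", shallowest first.
    // A key without any "/" implies no folders and yields an empty list.
    static List<String> markerKeysFor(String objectKey) {
        List<String> markers = new ArrayList<>();
        int slash = objectKey.indexOf('/');
        while (slash >= 0) {
            markers.add(objectKey.substring(0, slash + 1));
            slash = objectKey.indexOf('/', slash + 1);
        }
        return markers;
    }

    public static void main(String[] args) {
        // Each printed key would be created as a zero-byte object before (or after)
        // uploading the file itself; the upload call is intentionally omitted.
        for (String marker : markerKeysFor("javaFolderA/javaFolderB/javaFile.tmp")) {
            System.out.println(marker);
        }
    }
}
```

For the key in the question this produces javaFolderA/ and javaFolderA/javaFolderB/, which mirrors the two zero-byte objects the console created for the console* folders.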

The AWS console behaves this way, allowing you to create 'folders', because many AWS console users are more comfortable with the notion of folders and files than they are with objects (and keys).

It's rare, in my opinion, to need to do this, however. Your SDK clients should be implemented to handle both the presence and absence of these 'folders'. More info here.

jarmod
  • Thanks for the suggestion. I agree that I shouldn't have to go that route though. Definitely a frustrating nuance to have to work around as they appear identical from the web console. In my case I was hoping to use the `getObjectSummaries` and `getCommonPrefixes` methods on the `ObjectListing` to determine files vs. 'folders'. Sounds like it won't be quite so straightforward, and I'll just have to be more careful with how I make the `listObjects` requests and interpret the results. – avojak Jan 03 '18 at 05:32
  • Or maybe you could just call ListObjects. That would give you all objects, some of which are folders. Filter the results so that zero-sized objects ending with / get pushed into a folder hashmap/dictionary, then iterate over the remaining objects, use a path utility function to break the object keys into constituent folders, then push those into the folder hashmap/dictionary. That would give you a list of files separately from a map/dict of (real and implied) folders. – jarmod Jan 03 '18 at 15:43
  • Good idea - I like that. Thanks again! – avojak Jan 03 '18 at 16:01
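
The filtering approach from the comments could look roughly like the sketch below. It works on a plain map of key to size, standing in for the results of a full ListObjects call, so nothing here touches the SDK. The class `FolderIndex`, its fields, and the choice to record whether a folder is real (backed by a marker object) or only implied are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical index that separates files from (real and implied) folders.
class FolderIndex {
    final List<String> files = new ArrayList<>();
    // Folder key -> true if a zero-byte marker object exists, false if only implied.
    final Map<String, Boolean> folders = new TreeMap<>();

    static FolderIndex build(Map<String, Long> sizesByKey) {
        FolderIndex index = new FolderIndex();
        // Pass 1: zero-sized keys ending in "/" are folder markers; the rest are files.
        for (Map.Entry<String, Long> entry : sizesByKey.entrySet()) {
            String key = entry.getKey();
            if (key.endsWith("/") && entry.getValue() == 0L) {
                index.folders.put(key, true);
            } else {
                index.files.add(key);
            }
        }
        // Pass 2: split each file key into its constituent folders and record
        // any that were not already present as real markers.
        for (String file : index.files) {
            int slash = file.indexOf('/');
            while (slash >= 0) {
                index.folders.putIfAbsent(file.substring(0, slash + 1), false);
                slash = file.indexOf('/', slash + 1);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, Long> listing = new LinkedHashMap<>();
        listing.put("consoleFolderA/", 0L);
        listing.put("consoleFolderA/consoleFolderB/", 0L);
        listing.put("consoleFolderA/consoleFolderB/consoleFile.tmp", 69014L);
        listing.put("javaFolderA/javaFolderB/javaFile.tmp", 0L);
        FolderIndex index = build(listing);
        System.out.println(index.files);
        System.out.println(index.folders);
    }
}
```

With the listing from the question, both scenarios end up with one file and two folders each; the only remaining difference is the real-versus-implied flag, which callers can ignore if they just want consistent behavior.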