
What is a good way to check, from a Scala or Java program, whether an S3 bucket contains objects whose keys match a certain pattern? For example, if I have a bucket named "CsvBucket", how can I check whether it contains an object whose key matches the pattern "processed/files/2015/8/*.csv"?

Thanks

Sai
  • I'd say this http://codereview.stackexchange.com/questions/6847/list-objects-in-a-amazon-s3-folder-without-also-listing-objects-in-sub-folders and then run the results manually through a pattern filter. – zapl Sep 04 '15 at 21:29
  • @zapl code not yet written is off-topic – Caridorc Sep 04 '15 at 21:31
  • @Caridorc I've never heard of that rule. But I'm aware that I'm not supposed to answer questions in comments. Feel free to write a nice answer with written code and earn some points. – zapl Sep 04 '15 at 21:34
  • @zapl code not yet written is off-topic _on CodeReview_ I mean – Caridorc Sep 04 '15 at 21:38
  • @Caridorc I was just referring to the code as example of listing a directory. I don't want my comment to be reviewed :) – zapl Sep 04 '15 at 21:45

2 Answers


Since S3 object keys are just strings, you can list the keys under a prefix and test each one against a regular expression. Perhaps something like this (using the JetS3t library):

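import java.util.regex.Pattern;
import org.jets3t.service.model.S3Bucket;
import org.jets3t.service.model.S3Object;

// Matches any key that ends in ".csv"
Pattern pattern = Pattern.compile(".*\\.csv");
// 'service' is an instance of S3Service
S3Bucket bucket = service.getBucket(bucketName);
// The trailing slash keeps the prefix from also matching e.g. "processed/files/2015/80/"
S3Object[] files = service.listObjects(bucket, "processed/files/2015/8/", null);
for (S3Object file : files)
{
    // Pattern has no instance method matches(String); go through a Matcher
    if (pattern.matcher(file.getKey()).matches())
    {
        // ... work with the file ...
    }
}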
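
Note that ".*\\.csv" also accepts keys in nested "sub-folders" under the prefix. If the "*" in "processed/files/2015/8/*.csv" is meant to stay within one level, a stricter regular expression can be used. The following is only a sketch using plain Java strings (no S3 calls); the class name KeyPatternCheck and the method anyKeyMatches are made up for illustration, and the keys would come from whatever listing call you use.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class KeyPatternCheck {
    // Regex equivalent of the glob "processed/files/2015/8/*.csv":
    // "[^/]*" stands in for "*" so the match cannot cross a "/" boundary.
    static final Pattern CSV_IN_2015_8 =
            Pattern.compile("processed/files/2015/8/[^/]*\\.csv");

    // Returns true as soon as one key matches the whole pattern.
    static boolean anyKeyMatches(Iterable<String> keys, Pattern pattern) {
        for (String key : keys) {
            if (pattern.matcher(key).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical keys, standing in for the result of a listing call
        List<String> keys = Arrays.asList(
                "processed/files/2015/8/sub/report.csv",
                "processed/files/2015/8/report.csv");
        System.out.println(anyKeyMatches(keys, CSV_IN_2015_8));
    }
}
```

Because matches() requires the entire key to match, a key such as "processed/files/2015/8/report.csv.bak" is rejected as well.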
neuronaut

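Another way is to page through the keys with the AWS SDK for Java, as in this example from the S3 documentation, and then test each key against your pattern: http://docs.aws.amazon.com/AmazonS3/latest/dev/ListingObjectKeysUsingJava.html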

import java.io.IOException;
import com.amazonaws.AmazonClientException;
import com.amazonaws.AmazonServiceException;
import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.ListObjectsV2Request;
import com.amazonaws.services.s3.model.ListObjectsV2Result;
import com.amazonaws.services.s3.model.S3ObjectSummary;

public class ListKeys {
private static String bucketName = "***bucket name***";

public static void main(String[] args) throws IOException {
    AmazonS3 s3client = new AmazonS3Client(new ProfileCredentialsProvider());
    try {
        System.out.println("Listing objects");
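        // withMaxKeys(2) caps each page at two keys, so any bucket with more objects exercises the paging loop.
        // Adding .withPrefix("processed/files/2015/8/") would restrict the listing to that key prefix.
        final ListObjectsV2Request req = new ListObjectsV2Request().withBucketName(bucketName).withMaxKeys(2);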
        ListObjectsV2Result result;
        do {               
           result = s3client.listObjectsV2(req);

           for (S3ObjectSummary objectSummary : 
               result.getObjectSummaries()) {
               System.out.println(" - " + objectSummary.getKey() + "  " +
                       "(size = " + objectSummary.getSize() + 
                       ")");
           }
           System.out.println("Next Continuation Token : " + result.getNextContinuationToken());
           req.setContinuationToken(result.getNextContinuationToken());
        } while (result.isTruncated());

    } catch (AmazonServiceException ase) {
        System.out.println("Caught an AmazonServiceException, " +
                "which means your request made it " +
                "to Amazon S3, but was rejected with an error response " +
                "for some reason.");
        System.out.println("Error Message:    " + ase.getMessage());
        System.out.println("HTTP Status Code: " + ase.getStatusCode());
        System.out.println("AWS Error Code:   " + ase.getErrorCode());
        System.out.println("Error Type:       " + ase.getErrorType());
        System.out.println("Request ID:       " + ase.getRequestId());
    } catch (AmazonClientException ace) {
        System.out.println("Caught an AmazonClientException, " +
                "which means the client encountered " +
                "an internal error while trying to communicate" +
                " with S3, " +
                "such as not being able to access the network.");
        System.out.println("Error Message: " + ace.getMessage());
    }
}

}
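
If all you need is a yes/no answer, the paging loop can stop at the first matching key instead of printing every object. Here is a rough sketch of that idea with the S3 call abstracted away so it runs on its own: Page, PagedKeySearch and the fetchPage function are hypothetical names, not part of the AWS SDK. With the SDK, fetchPage would presumably wrap s3client.listObjectsV2(...) and take the token from getNextContinuationToken().

```java
import java.util.List;
import java.util.function.Function;
import java.util.regex.Pattern;

final class PagedKeySearch {
    // Hypothetical stand-in for one page of a listing response.
    static final class Page {
        final List<String> keys;
        final String nextToken; // null when this is the last page

        Page(List<String> keys, String nextToken) {
            this.keys = keys;
            this.nextToken = nextToken;
        }
    }

    // fetchPage maps a continuation token (null for the first page) to a Page.
    // Returns true at the first matching key, without fetching later pages.
    static boolean exists(Function<String, Page> fetchPage, Pattern pattern) {
        String token = null;
        do {
            Page page = fetchPage.apply(token);
            for (String key : page.keys) {
                if (pattern.matcher(key).matches()) {
                    return true;
                }
            }
            token = page.nextToken;
        } while (token != null);
        return false;
    }
}
```

Stopping early matters mostly for large prefixes, where each extra page is another request to S3.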

barath