This is going to be an incomplete answer since I don't know Python or boto, but I want to comment on the underlying concept in the question.
One of the other posters was right: there is no concept of a directory in S3. There are only flat key/value pairs. Many applications pretend that certain delimiters, for example "/" or "\", indicate directory entries. Some apps go as far as putting a dummy file in place so that if the "directory" empties out, you can still see it in list results.
You don't always have to pull your entire bucket down and do the filtering locally. S3 has a concept of a delimited list, where you specify what you would deem your path delimiter ("/", "\", "|", "foobar", etc.) and S3 returns virtual results to you, similar to what you want.
http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketGET.html (look at the delimiter request parameter).
This API will get you one level of directories. So if you had in your example:
mybucket/files/pdf/abc.pdf
mybucket/files/pdf/abc2.pdf
mybucket/files/pdf/abc3.pdf
mybucket/files/pdf/abc4.pdf
mybucket/files/pdf/new/
mybucket/files/pdf/new/abc.pdf
mybucket/files/pdf/2011/
Note that mybucket is the bucket name, which is passed separately and is not part of the key, so the prefixes and results below leave it off. If you passed in a LIST with prefix "" and delimiter "/", you'd get results:
files/
If you passed in a LIST with prefix "files/" and delimiter "/", you'd get results:
files/pdf/
And if you passed in a LIST with prefix "files/pdf/" and delimiter "/", you'd get results:
files/pdf/abc.pdf
files/pdf/abc2.pdf
files/pdf/abc3.pdf
files/pdf/abc4.pdf
files/pdf/new/
files/pdf/2011/
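In the response, the rolled-up "directories" (the new/ and 2011/ entries) come back as CommonPrefixes, separate from the object keys in Contents. So if you wanted to eliminate the pdf files themselves from the result set, you would read only the common prefixes.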
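To make the roll-up concrete, here is a small pure-Python sketch that mimics a delimited LIST over an in-memory list of keys. It is only an illustration of the behaviour described above, not S3's implementation; the function name `delimited_list` and the `KEYS` list are made up for this example, and the keys are written without the bucket name because the bucket is passed separately from the key.

```python
def delimited_list(keys, prefix="", delimiter="/"):
    """Mimic an S3 delimited LIST over an in-memory collection of keys.

    Returns (contents, common_prefixes):
      contents        - keys under `prefix` with no further delimiter
      common_prefixes - keys rolled up to the first delimiter after `prefix`
    """
    contents = []
    common_prefixes = []
    # S3 lists keys in byte order; plain sorted() matches that for ASCII keys.
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter) if delimiter else -1
        if cut == -1:
            # No delimiter after the prefix: this is a plain "file".
            contents.append(key)
        else:
            # Roll everything below the first delimiter into one entry.
            rolled = prefix + rest[:cut + len(delimiter)]
            if rolled not in common_prefixes:
                common_prefixes.append(rolled)
    return contents, common_prefixes


# Hypothetical keys, modelled on the example in the question.
KEYS = [
    "files/pdf/abc.pdf",
    "files/pdf/abc2.pdf",
    "files/pdf/abc3.pdf",
    "files/pdf/abc4.pdf",
    "files/pdf/new/",
    "files/pdf/new/abc.pdf",
    "files/pdf/2011/",
]

files, dirs = delimited_list(KEYS, prefix="files/pdf/", delimiter="/")
print(dirs)  # the "subdirectories" only, pdf files excluded
```

Ignoring `files` and keeping `dirs` is the local equivalent of reading only the common prefixes from the S3 response.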
Now, how you do this in Python/boto I have no idea. Hopefully there's a way to pass the prefix and delimiter through.
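For what it's worth, a rough sketch of how this might look from Python follows. It assumes the newer boto3 package rather than the legacy boto the question mentions, and it has not been run against a real bucket. The helper name `list_subdirectories` is made up; the `list_objects_v2` paginator and the `Prefix`, `Delimiter` and `CommonPrefixes` names come from boto3's S3 client.

```python
def list_subdirectories(client, bucket, prefix="", delimiter="/"):
    """Return the one-level "directory" names under `prefix`.

    `client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    Only CommonPrefixes are read, so plain objects (the pdf files) are skipped.
    """
    paginator = client.get_paginator("list_objects_v2")
    found = []
    # The paginator follows continuation tokens, so large listings
    # (more than one page of results) are handled as well.
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter=delimiter):
        # "CommonPrefixes" is absent from a page that has no rolled-up entries.
        for entry in page.get("CommonPrefixes", []):
            found.append(entry["Prefix"])
    return found


# Usage (needs the third-party boto3 package and AWS credentials configured):
#   import boto3
#   print(list_subdirectories(boto3.client("s3"), "mybucket", "files/pdf/"))
```

Passing the client in as an argument keeps the function easy to try out with a stub object before pointing it at a real bucket.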