I'm trying to run a Hadoop job on AWS EMR, launched from my local machine with Python, on files in S3. I can't seem to access multiple files using the * wildcard. I want to process everything under the 01 folder. This command works on all files in a single folder:
python mapper_reducer.py -r emr s3://firehose/2017/01/30/20/ --output-dir=s3://job-results
This command fails with the error "no matches found: s3://firehose/2017/01/*/*/":
python mapper_reducer.py -r emr s3://firehose/2017/01/*/*/ --output-dir=s3://job-results
Is this an issue with mrjob? I tried adding a --recursive flag, with no result.
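
To be explicit about what I expect the wildcard to select, here is a minimal sketch using Python's fnmatch. The part-0000 object names are hypothetical (I made them up for illustration), and the trailing * for the file name is my assumption. This only illustrates the intent; it is not how mrjob resolves input paths.

```python
from fnmatch import fnmatchcase

# Pattern from the failing command, with a trailing * added for the object name.
PATTERN = "s3://firehose/2017/01/*/*/*"

# Hypothetical object URIs; the real file names in the bucket are different.
CANDIDATES = [
    "s3://firehose/2017/01/30/20/part-0000",
    "s3://firehose/2017/01/31/05/part-0000",
    "s3://firehose/2017/02/01/00/part-0000",
]


def expected_inputs(uris, pattern=PATTERN):
    """Return the URIs that the wildcard pattern should select."""
    return [uri for uri in uris if fnmatchcase(uri, pattern)]


if __name__ == "__main__":
    # Lists the two URIs under 2017/01/ and skips the one under 2017/02/.
    for uri in expected_inputs(CANDIDATES):
        print(uri)
```

In other words, I want every day and hour folder under 2017/01/ to be used as input, and nothing from other months.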