Does anyone know how to limit the maximum size of the S3 output files (part-r-00000, part-r-00001, etc.) that mrjob produces?

I'm compressing the output, in case that makes any difference, using the following in my .mrjob.conf file:

jobconf:
 mapred.output.compress: 'true'
 mapred.output.compression.codec: org.apache.hadoop.io.compress.GzipCodec
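
For anyone trying to reproduce this without my config file: as far as I understand, the same two settings can also be passed per run through mrjob's `--jobconf key=value` switch. Below is a minimal sketch that builds those arguments. The helper name `jobconf_args` and the script name `my_job.py` are made up for illustration, and `-r emr` is an assumption based on the output going to S3.

```python
# The same two settings as in the .mrjob.conf fragment above.
JOBCONF = {
    "mapred.output.compress": "true",
    "mapred.output.compression.codec": "org.apache.hadoop.io.compress.GzipCodec",
}


def jobconf_args(jobconf):
    """Turn a jobconf mapping into ['--jobconf', 'key=value', ...].

    Hypothetical helper for illustration; it is not part of mrjob.
    """
    args = []
    for key, value in jobconf.items():
        args += ["--jobconf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # my_job.py is a placeholder script name; -r emr is assumed.
    command = ["python", "my_job.py", "-r", "emr"] + jobconf_args(JOBCONF)
    print(" ".join(command))
```

This only reproduces the compression settings; it does not by itself limit the size of the part files.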

Thanks in advance, Conor

Digan

0 Answers