1) The counters will be in the job history logs found at:
$LOG_PATH/$CLUSTER_ID/hadoop-mapreduce/history/$YEAR/$MONTH/$DAY/$JOB_ID.jhist.gz
They are JSON events (roughly one event per line), so you may need to do some processing.
2) I would use the aws or s3cmd CLI tools to grab and process them.
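For the aws CLI route, the shape of it is something like this. The bucket, cluster id, and job id below are placeholders, and I fake a two-line history file locally so the processing step is runnable as written; real .jhist files have more framing around the events:

```shell
# Pulling a real history file down would look like (all names here are placeholders):
#   aws s3 cp s3://my-log-bucket/j-XXXXXXXXXXXXX/hadoop-mapreduce/history/2015/06/01/job_1433000000000_0001.jhist.gz .

# Fake a minimal .jhist.gz so the pipeline below can actually run; real files
# carry JSON events line by line, with the final counters on the JOB_FINISHED event.
printf '%s\n' \
  '{"type":"JOB_SUBMITTED","event":{}}' \
  '{"type":"JOB_FINISHED","event":{}}' \
  > job_1433000000000_0001.jhist
gzip -f job_1433000000000_0001.jhist

# Extract the event that carries the final counters:
gunzip -c job_1433000000000_0001.jhist.gz | grep '"type":"JOB_FINISHED"'
```

From there you can feed the JOB_FINISHED line into whatever JSON tooling you prefer to pull out the counter groups.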
You could also modify your Hadoop jobs to write the counters to a file on completion, in whatever format you like. Something like:
// Imports needed for the counter dump below:
// import java.io.PrintWriter;
// import java.net.URI;
// import org.apache.hadoop.fs.FileSystem;
// import org.apache.hadoop.fs.Path;
// import org.apache.hadoop.mapreduce.Counter;
// import org.apache.hadoop.mapreduce.CounterGroup;
// import org.apache.hadoop.mapreduce.Counters;

// ...rest of job setup...
job.waitForCompletion(true);

// Write the job's counters next to its output as CSV
FileSystem fs = FileSystem.get(URI.create(outputPath), job.getConfiguration());
PrintWriter writer = new PrintWriter(fs.create(new Path(outputPath + "/counters_output.csv")));
Counters counters = job.getCounters();
for (CounterGroup counterGroup : counters) {
    for (Counter counter : counterGroup) {
        // println (not write) so each counter lands on its own "group,name,value" row
        writer.println(counterGroup.getName() + "," + counter.getName() + "," + counter.getValue());
    }
}
// Closing the PrintWriter also flushes and closes the underlying FSDataOutputStream.
// Don't call fs.close() here: FileSystem.get() returns a cached, shared instance.
writer.close();