I am running my Hadoop jobs on a cluster consisting of multiple machines whose specifications I do not know (main memory, number of cores, storage, etc. per machine). Without using any OS-specific library (*.so files, I mean), is there any class or tool in Hadoop itself, or in some additional library, with which I could collect information like the following while the Hadoop MR jobs are being executed:
- Total number of cores / number of cores employed by the job
- Total available main memory / allocated main memory
- Total storage space on each machine / allocated storage space
I don't have the hardware information or the specs of the cluster, which is why I want to collect this kind of information programmatically in my Hadoop code, roughly along the lines of the sketch below.
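To make it concrete, this is the kind of thing I am hoping is possible: a plain-JDK sketch in a mapper's setup() that logs what the task JVM can see on its node. I realise these calls only report the JVM's own view (e.g. the task's heap limit rather than the machine's total RAM), which is part of why I'm asking whether Hadoop offers something better:

import java.io.File;
import java.io.IOException;
import java.net.InetAddress;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ProbeMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        Runtime rt = Runtime.getRuntime();
        String host = InetAddress.getLocalHost().getHostName();

        // Cores visible to this task's JVM (the node's core count, not the slots given to the job)
        int cores = rt.availableProcessors();

        // Heap limits of this one task JVM (typically set via mapred.child.java.opts), not total RAM
        long maxHeap = rt.maxMemory();
        long freeHeap = rt.freeMemory();

        // Usable space on the partition holding the task's working directory
        long usableDisk = new File(".").getUsableSpace();

        System.out.println(host + ": cores=" + cores + " maxHeap=" + maxHeap
                + " freeHeap=" + freeHeap + " usableDisk=" + usableDisk);
    }
}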
How can I achieve this? I want this kind of information for several reasons. One reason is the following error: I want to know which machine ran out of space.
12/07/17 14:28:25 INFO mapred.JobClient: Task Id : attempt_201205221754_0208_m_001087_0, Status : FAILED
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/spill2.out
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:376)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:146)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:127)
at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:121)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1247)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1155)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:582)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.
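For this particular failure, for example, I would like to log from inside the task how much space is left in each of the configured local directories, together with the hostname, so that the failing machine shows up in the task logs. A rough sketch of what I mean (the property name mapred.local.dir is the Hadoop 1.x name for the comma-separated list of spill directories; whether this is a sensible approach at all is part of my question):

import java.io.File;
import java.io.IOException;
import java.net.InetAddress;

import org.apache.hadoop.conf.Configuration;

public class LocalDirSpaceLogger {

    // Call from a task, e.g. in setup(), passing context.getConfiguration()
    public static void log(Configuration conf) throws IOException {
        String host = InetAddress.getLocalHost().getHostName();
        // mapred.local.dir holds the comma-separated list of directories used for spill files
        for (String dir : conf.get("mapred.local.dir", "").split(",")) {
            File f = new File(dir.trim());
            System.out.println(host + " " + f.getPath()
                    + ": usable=" + f.getUsableSpace()
                    + " total=" + f.getTotalSpace());
        }
    }
}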