I am quite new to big data tools like Hadoop. I want to run a publicly available cluster trace (https://github.com/google/cluster-data) on YARN or a YARN simulator.
One way to do this is to feed the trace into YARN via Gridmix.
The input that Gridmix (https://hadoop.apache.org/docs/r2.8.3/hadoop-gridmix/GridMix.html) takes is basically the output of Rumen, and Rumen (https://hadoop.apache.org/docs/r2.8.3/hadoop-rumen/Rumen.html) takes the JobHistory logs generated by a MapReduce cluster as its input.
The Google trace is not a MapReduce trace. However, I was wondering whether I could transform it into the same format that Gridmix takes as input, so that I can then use Gridmix.
Can anyone here point me to the input format of Gridmix (or the output format of Rumen)?
Or suggest another way to do what I want?
Thanks.