I am using the command below to retrieve the HDFS quota, but I don't want the fancy output. Instead, I need the output in a comma- or tab-separated format. By default it is not tab separated. Can anyone suggest how to do this?

Command:

hdfs dfs -count -q -h -v /path/to/directory

Output is like this:

    none             inf           250 G         114.9 G          518        2.8 K             45.0 G /new/directory/X

Expected Output:

none,inf,250 G,114.9 G,518,2.8 K,45.0 G,/new/directory/X

Yogesh

1 Answer

How about using sed? The key thing is to find a string that uniquely marks the field separator in the hdfs output. That could be a tab if the output were tab separated, but the sample output you posted uses spaces.
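One way to settle the tab-versus-space question is to test a line of output for a tab character. This is only a sketch: `has_tab` is a hypothetical helper name, and the sample line is copied from the question and stands in for real hdfs output.

```shell
# Sample line copied from the question, standing in for real hdfs output.
line='    none             inf           250 G         114.9 G          518        2.8 K             45.0 G /new/directory/X'

# has_tab (hypothetical helper): succeeds if stdin contains a tab character.
has_tab() {
  grep -q "$(printf '\t')"
}

if printf '%s\n' "$line" | has_tab; then
  sep=tab
else
  sep=spaces
fi
printf 'separator: %s\n' "$sep"
```

On a real cluster, the `printf '%s\n' "$line"` part would be replaced by the hdfs command itself.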

Once you have settled on that unique string, use sed to replace it with a comma. In the hdfs output, runs of two or more spaces appear to separate the fields everywhere except at the start of the line and just before the path. You could accept a leading comma and handle the path with a second sed pass.

This Stack Overflow question covers sed replacing consecutive spaces.

hdfs dfs -count -q  -h  -v /path/to/directory | sed -e "s/[[:space:]]\{2,\}/,/g" | sed -e "s/[[:space:]]\//,\//g"
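If the leading comma is unwanted, the passes can be folded into a single sed call that trims leading whitespace first. A minimal sketch, assuming the output is space-padded exactly as in the sample from the question; `to_csv` is a hypothetical helper name, and the sample line stands in for real hdfs output.

```shell
# Sample line copied from the question, standing in for real hdfs output.
line='    none             inf           250 G         114.9 G          518        2.8 K             45.0 G /new/directory/X'

# to_csv (hypothetical helper): convert one space-padded line to CSV.
to_csv() {
  # 1. drop leading whitespace so no leading comma is produced
  # 2. turn every run of two or more whitespace characters into a comma
  # 3. turn the first whitespace character that precedes "/" (the path) into a comma
  sed -e 's/^[[:space:]]*//' \
      -e 's/[[:space:]]\{2,\}/,/g' \
      -e 's/[[:space:]]\//,\//'
}

out=$(printf '%s\n' "$line" | to_csv)
printf '%s\n' "$out"
```

The third expression has no `g` flag, so only the first whitespace-then-slash pair is replaced. Single spaces inside values such as `250 G` are left alone because they are never two or more wide.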

The solution is even simpler if they are tabs.

hdfs dfs -count -q -h -v /path/to/directory | sed -e $'s/\t/,/g'

EricB