
Hi, I have a few files in HDFS, and I need to extract the files that fall within a specific time range. How can I do that using the Unix grep command?

My HDFS listing looks like this:

-rw-rw-r--   3 pscore hdpdevs      94461 2014-12-10 02:08 /data/bus/pharma/shared/purch/availability_alert/proc/2014-12-10_02-07-12-0    
-rw-rw-r--   3 pscore hdpdevs     974422 2014-12-11 02:08 /data/bus/pharma/shared/purch/availability_alert/proc/2014-12-11_02-07-10-0    
-rw-rw-r--   3 pscore hdpdevs      32854 2014-12-11 02:08 /data/bus/pharma/shared/purch/availability_alert/proc/2014-12-11_02-07-16-0    
-rw-rw-r--   3 pscore hdpdevs    1936753 2014-12-12 02:07 /data/bus/pharma/shared/purch/availability_alert/proc/2014-12-12_02-06-04-0    
-rw-rw-r--   3 pscore hdpdevs      79365 2014-12-12 02:07 /data/bus/pharma/shared/purch/availability_alert/proc/2014-12-12_02-06-11-0

I want to extract the files from 2014-12-11 09:00 to 2014-12-12 09:00. I tried `hadoop fs -ls /dabc | sed -n '/2014-12-11 09:00/ , /2014-12-12 09:00/p'`, but that doesn't work. Any help? I would like to use the grep command for this.

Gilles Quénot
Neethu Lalitha
  • Can you specify the time range you need? – Gilles Quénot Dec 12 '14 at 15:14
  • I want to extract the files from 9:00 am yesterday to 9:00 am today, i.e. 2014-12-11 09:00 to 2014-12-12 09:00. – Neethu Lalitha Dec 12 '14 at 15:15
  • Can't you use the find command to find those files? Like [this](http://unix.stackexchange.com/questions/29245/how-to-list-files-that-were-changed-in-a-certain-range-of-time) – Sas Dec 12 '14 at 15:58
  • grep is the wrong tool for this: can you [`stat`](http://man.cx/stat) the files to extract their mtime? – glenn jackman Dec 12 '14 at 16:15
  • The approach you tried would work if the date and time were the first and second columns and the listing were sorted; then sed would have worked. – AKS Dec 12 '14 at 16:16
  • I tried `find hadoop fs -ls /data/bus/pharma/shared/purch/availability_alert/proc -newer start \! -newer stop` but it shows the following error: `find: paths must precede expression: /data/bus/pharma/shared/purch/availability_alert/proc/ Usage: find [-H] [-L] [-P] [-Olevel] [-D help|tree|search|stat|rates|opt|exec] [path...] [expression]` – Neethu Lalitha Dec 12 '14 at 16:25
  • You're passing a command ("hadoop fs -ls") to the find command. Try using "find /data/bus/pharma/shared/purch/availability_alert/proc -newer start \! -newer stop". – Sas Dec 12 '14 at 17:01

1 Answer

awk '$6FS$7 >= "2014-12-11 09:00" && $6FS$7 <= "2014-12-12 09:00"'

Can I do string comparison in awk?
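As a rough sketch of how the one-liner above could be applied to the listing in the question: the date is field 6 and the time is field 7 of each `hadoop fs -ls` line, and because `YYYY-MM-DD HH:MM` strings sort lexicographically in chronological order, a plain string comparison is enough. The function name `filter_by_mtime` and its two arguments are made up for illustration, and the `NF >= 8` guard assumes the listing starts with a short header line (such as "Found N items") that should be skipped.

```shell
# Hypothetical helper (the name is illustrative, not part of any tool):
# reads `hadoop fs -ls` output on stdin and prints only the lines whose
# date ($6) and time ($7) lie within [start, end], inclusive.
filter_by_mtime() {
    awk -v start="$1" -v end="$2" '
        NF >= 8 {                   # assumption: file lines have 8 fields; shorter header lines are skipped
            ts = $6 " " $7          # e.g. "2014-12-12 02:07"
            if (ts >= start && ts <= end) print
        }'
}

# Usage, with the path taken from the question:
# hadoop fs -ls /data/bus/pharma/shared/purch/availability_alert/proc |
#     filter_by_mtime "2014-12-11 09:00" "2014-12-12 09:00"
```

With the five files shown in the question, only the two entries dated 2014-12-12 02:07 fall inside that range; both 2014-12-11 entries were written at 02:08, before the 09:00 start.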

Community
Zombo