1

I have a very large file on my Linux server (currently more than 10 GB), and content keeps being appended to the end of it.

I have another script that needs to process this file about 1000 lines at a time. What is the best way to read the lines from this file and operate on them?

I am thinking of using the sed command to cut out the lines, but is that the best approach?

Mathews Jose
  • You may want to take a look at this related question: http://stackoverflow.com/questions/42396561/monitor-a-log-file-using-tail-f/42398092#42398092 – codeforester Feb 23 '17 at 08:30
  • @codeforester Thanks for the link, but I am not sure how to implement this for a huge file using a shell script. – Mathews Jose Feb 23 '17 at 09:13
  • That's exactly my point. You are far better off using a more advanced language like Ruby, Python, or Perl for this. I have solved such problems with Perl / C in the past. Bash is definitely not the right choice. – codeforester Feb 23 '17 at 09:16

2 Answers

0

Since the file to be monitored is a plain text file (not binary), you could do a

tail -f my_big_fat_file | my_fancy_processing_script

You don't get the input in 1000-line chunks, but your processing script can accumulate the lines and start processing once it has collected enough of them.
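The accumulation step could look roughly like the following. This is only a sketch: process_in_batches and process_batch are hypothetical names, and process_batch is a placeholder that discards the lines and reports how many it received.

```shell
#!/bin/sh
# Hypothetical placeholder for the real work on one batch.
# $1 = number of lines in the batch; the lines themselves arrive on stdin.
process_batch() {
    cat >/dev/null                      # stand-in: consume the batch
    printf 'processed %s lines\n' "$1"
}

# Read lines from stdin and hand them to process_batch in groups.
# $1 = batch size (assumed default: 1000).
process_in_batches() {
    batch_size=${1:-1000}
    batch=''
    count=0
    while IFS= read -r line; do
        # Append the line plus a newline to the current batch.
        batch="${batch}${line}
"
        count=$((count + 1))
        if [ "$count" -ge "$batch_size" ]; then
            printf '%s' "$batch" | process_batch "$count"
            batch=''
            count=0
        fi
    done
    # Flush a partial batch if the input ever ends
    # (with tail -f it normally does not).
    if [ "$count" -gt 0 ]; then
        printf '%s' "$batch" | process_batch "$count"
    fi
}
```

With these functions defined in the processing script, it would be fed as tail -f my_big_fat_file | process_in_batches 1000. Note that a batch is only handed over once it is full, so the last few lines wait until enough new lines arrive.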

user1934428
  • This does not seem to work. The command exits immediately. – Mathews Jose Feb 23 '17 at 07:54
  • There are two possible reasons for this command to exit immediately (and in each case you will get an error message): (1) the file does not exist yet at the time you invoke the command, or (2) your script closes STDIN. I don't see how this command could terminate silently. – user1934428 Feb 23 '17 at 13:37
0
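# IFS= and -r keep leading whitespace and backslashes in each line intact
tail -n 1 -f growingfile.log | while IFS= read -r line; do
   # quoting prevents word splitting and glob expansion of the line
   printf '%s\n' "$line"
done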

tail -n 1 -f <file> prints the last line of the file and then keeps printing each new line as it is appended. That output is piped to the while read loop, which can then process each new line written to the file.
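The -n value decides where tail starts reading, which matters when the file already holds content that still needs processing. The sketch below shows the difference on a throwaway temp file with made-up sample lines; -f is left out so that the commands terminate, but the starting point is chosen the same way when -f is added.

```shell
#!/bin/sh
# Throwaway sample file standing in for growingfile.log (hypothetical data).
sample=$(mktemp)
printf 'line1\nline2\nline3\n' > "$sample"

# -n 1: start at the last line, so earlier content is skipped.
last_only=$(tail -n 1 "$sample")

# -n +1: start at line 1, so the existing content is included.
from_start=$(tail -n +1 "$sample")

printf 'with -n 1:\n%s\n' "$last_only"
printf 'with -n +1:\n%s\n' "$from_start"

rm -f "$sample"
```

So if the lines already in the file should be processed as well, tail -n +1 -f growingfile.log is probably the variant to pipe into the loop.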

Bruno Negrão Zica