
I am analyzing data from a log file at frequent intervals and processing it accordingly. The input log file grows indefinitely: a long-running process writes to it, and the file belongs to the root user.

I have all the necessary permissions on the log file. What I want to do is move only the file contents up to that point (take the file contents and clear the file) without disturbing the other process, preferably with a Python script.

[EDIT] i.e., I need to cut and paste all the contents from the primary log file up to that point in time and put them into another (secondary) log file. I will use this secondary log file for my data analysis. In the meantime, if the long-running process writes anything to the primary log file, it must not be lost. It is not a problem if some of the new data ends up in the secondary log file along with the older contents.

[EDIT 2] The main problem I face is clearing the file contents once they have been fetched from the primary log file. I need to ensure that nothing written to the primary log is lost while I read from it, write the contents to the secondary log, and remove those contents from the primary file.
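
To illustrate, this is the naive approach I have in mind (a minimal sketch with hypothetical paths); the window between the copy and the truncate is exactly where I am afraid of losing data:

  import shutil

  PRIMARY = "primary.log"      # hypothetical path to the root-owned log
  SECONDARY = "secondary.log"  # hypothetical destination for analysis

  with open(PRIMARY, "r+") as src, open(SECONDARY, "a") as dst:
      shutil.copyfileobj(src, dst)  # copy everything written so far
      src.truncate(0)               # clear the primary log; anything appended
                                    # between the copy and this call is lost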

I looked into logging.handlers.TimedRotatingFileHandler, but it rotates only files that the Python logging module itself writes to, so it doesn't help in this case. Any other suggestions?

Thanks

thiruvenkadam
  • Can you be more clear about the second paragraph? – Vivek Sep 12 '13 at 13:10
  • I made an update to explain better :-) Any more clarifications? – thiruvenkadam Sep 12 '13 at 13:47
  • Why don't you pipe the output of one log file to the other log file? This is a simple Linux command. Then use this second log file for any further processing. – Vivek Sep 12 '13 at 15:07
  • You mean to take the contents of the log file? In this case, I must clear the contents that are fetched from the primary file so that they won't be there for data analysis in the future. Could it be done with the shell script? – thiruvenkadam Sep 13 '13 at 09:05

1 Answer


The Linux way to tail a file is simple. Run this command on your log file as soon as the logging process starts:

  tail -f log_file_name.log >> /tmp/new_file_name.log &


[EDIT] To also remove each copied line from the original file, follow the secondary log and delete the matching lines from the primary one (note that lines containing sed metacharacters such as / will break the pattern):

  tail -f log_file_name.log >> /tmp/new_file_name.log &
  tail -f /tmp/new_file_name.log | xargs -I TailOutput sed -i '/TailOutput/d' log_file_name.log

Then you can use new_file_name.log to do whatever you want with this new file. With the first command alone, your original log file also stays intact. I understand this is getting a little twisted, but that's the best I can think of right now!
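
If you prefer Python over the shell, the same tail -f idea can be sketched roughly like this (the file names are just placeholders; it follows the primary log and appends every new line to the secondary one):

  import time

  PRIMARY = "log_file_name.log"         # the log being followed
  SECONDARY = "/tmp/new_file_name.log"  # the copy used for analysis

  with open(PRIMARY, "r") as src, open(SECONDARY, "a") as dst:
      while True:
          line = src.readline()
          if line:               # new data arrived: append it to the copy
              dst.write(line)
              dst.flush()
          else:                  # nothing new yet: poll again shortly
              time.sleep(0.5)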

Vivek
  • But this doesn't clear the data from my primary log file. This is more like using `cat log_file_name.log > /tmp/new_file_name.log`. I need something like `cat log_file_name.log > /tmp/new_file_name.log && cat /dev/null > log_file_name.log`, except that logs written after the first cat command are not lost. – thiruvenkadam Sep 13 '13 at 09:55
  • Even though it seems twisted, it works for me :-) And that's what counts :-) – thiruvenkadam Sep 16 '13 at 07:42