5

I'm wondering whether tailf can generate blocking I/O that slows down server responsiveness because of logging.

For example, assume the following setup:

A Debian 5.1 Linux server (foo), hosted on EC2 and managed via terminal.

Foo runs several applications, each writing to its own log file. For the sake of example, Apache httpd writes to /var/log/apache/access.log and Tomcat 5.5 writes to /var/log/tomcat5.5/myApp.log.

If I open an ssh connection to foo (note: Internet link, high latency, relatively slow upload) and run tail -F /var/log/apache/access.log, can I reach a situation where the kernel blocks httpd's writes to this log file, slowing httpd down because of the wait imposed on each thread?

To give some numbers, let's assume that foo logs ~200kb of log data per second which needs to be pushed over the wire to the ssh client.

Another theoretical aspect: what would happen if the /var/log file system were placed on an infinitely large RAM disk (remember: theoretically speaking), so that hard disk seek time is eliminated?

Third aspect: what would happen if I opened the ssh connection over a really slow link (let's assume that foo is traffic shaped to push only 5kb/s upload)?

Would love to hear what you guys think.

Thanks for reading, Maxim.

Maxim Veksler

3 Answers

3

I don't think there will be any blocking on I/O here. When you run "tail -f", what happens is:

  1. Your shell process, let's say bash, spawns a new 'tail' process.
  2. tail opens the file, moves the file pointer to the end, then repeatedly sleeps (1 second by default in GNU tail) and checks whether there is new data.
  3. If there is new data, tail pushes it back to bash over a Unix pipe.
  4. This data gets transmitted from the server to your machine by bash + ssh.

So as you can see, a slow internet connection won't affect step #2, which is the key for I/O performance anyway.
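Step 2 can be sketched in Python roughly as follows. This only illustrates a polling follower and is not how tail itself is implemented; the names `poll_once` and `follow` and the `interval` parameter are made up for the example.

```python
import os
import sys
import time


def poll_once(f):
    """Return whatever bytes were appended since the last read (possibly b'')."""
    return f.read()


def follow(path, interval=1.0):
    """Print data appended to path, checking every `interval` seconds."""
    # buffering=0 gives a raw file object, so each read goes to the kernel.
    with open(path, "rb", buffering=0) as f:
        f.seek(0, os.SEEK_END)  # start at the current end of the file
        while True:
            chunk = poll_once(f)
            if chunk:
                # This write may block on a slow terminal or ssh link,
                # but only this reader waits, not the process writing the log.
                sys.stdout.buffer.write(chunk)
                sys.stdout.buffer.flush()
            else:
                time.sleep(interval)
```

Calling `follow("/var/log/apache/access.log")` would loop until interrupted. The only place the loop can stall is the write to its own stdout.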

Plus, tail opens the file in 'read-only' mode and, as an educated guess, logs are opened in 'append-only' mode, so there shouldn't be much locking to worry about. If this is still a concern for you, you may want to try inotail, which is based on the Linux inotify API and avoids polling the file.

Hope this helps, Alex

0

I don't think it is likely. I believe the writes will be cached in RAM, and since they were just written, I imagine tail will be reading those pages from RAM as well. The pages will be periodically synced to disk. I would be surprised if Apache blocked waiting for the logs to be written out to disk.

Kyle Brandt
0

The correct answer is that the process reading the log and the process writing the log are not related in any way. They are separate processes and not threads. They do not share memory and they have their own file handles with their own file handle pointers. Neither affects the other in any way. The kernel won't stop a write to a file because some other program is reading it. It will do things to speed it up (write caching to RAM when the disk is busy, sharing the cache with all file descriptors that use that file, etc), but not slow it down!
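As a rough illustration of the separate file descriptors and offsets, here is a small Python sketch. It assumes a temporary file standing in for the log; the function name `independent_offsets` and the sample log line are invented for the example. A reader holds the file open without ever reading, and the writer's appends still complete while the reader's offset stays where it was.

```python
import os
import tempfile

LINE = b"GET / HTTP/1.1 200\n"  # made-up log line for the demo


def independent_offsets(lines=1000):
    """Append `lines` log lines while an idle reader holds the file open.

    Returns (reader_offset, file_size) after the writes complete.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    # Writer: append-only, like a typical log file descriptor.
    writer = os.open(path, os.O_WRONLY | os.O_APPEND)
    # Reader: a separate descriptor that never reads anything.
    reader = os.open(path, os.O_RDONLY)
    try:
        for _ in range(lines):
            os.write(writer, LINE)
        # The reader's offset is untouched by the writer's activity.
        reader_offset = os.lseek(reader, 0, os.SEEK_CUR)
        file_size = os.fstat(writer).st_size
    finally:
        os.close(reader)
        os.close(writer)
        os.remove(path)
    return reader_offset, file_size
```

With the default arguments this should return a reader offset of 0 and a file size of `1000 * len(LINE)`: the idle reader did nothing to hold the writer back.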

The other answer about how tail works is half right. It doesn't use a pipe to talk to bash. Bash is suspended waiting on tail to finish (unless run with &). Tail inherits the "stdout" file descriptor (connected to your terminal by default) from bash and writes to it directly. It would be inefficient to pipe it back to bash, do a task switch to bash to read the data, and then have bash write the output. Unix is designed to be simple and effective, stdin to stdout for most everything.
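The descriptor inheritance can be imitated in Python, with a Python parent standing in for bash and a temporary file standing in for the terminal (both are assumptions made for the demo, as is the function name). The parent hands the descriptor to the child and waits; it never copies the child's output itself.

```python
import subprocess
import sys
import tempfile


def run_child_with_inherited_stdout():
    """Start a child whose stdout is a descriptor handed over by the parent."""
    with tempfile.TemporaryFile() as out:
        # The parent passes the descriptor and then just waits, like a shell
        # waiting for a foreground command. It does not relay any data.
        subprocess.run(
            [sys.executable, "-c", "print('written by the child')"],
            stdout=out,
            check=True,
        )
        # Only now does the parent look at what the child wrote directly.
        out.seek(0)
        return out.read()
```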

Current versions of GNU tail fully support the inotify API to avoid polling. This is normally not going to make much difference for tail. It's mainly so that file managers can update directories and servers know when to re-read a configuration file (without restarting the server). You can also have tail follow log rotations (normally it keeps its file descriptor). Another useful aside is "tac", which will reverse its input lines. This allows you to have the most recent information at the top when processing a log file for web display. Finally, ccze will colorize your log files for easier viewing (ANSI or HTML).
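If you are already processing the log in a script, the line reversal that tac does can be approximated in a few lines of Python. This is just a sketch of the idea, assuming every line ends with a newline; `newest_first` is a made-up name and this is not tac's actual implementation.

```python
def newest_first(text):
    """Return the lines of text in reverse order, similar in spirit to tac."""
    lines = text.splitlines(keepends=True)
    return "".join(reversed(lines))
```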

Evan Langlois