Aim

Read all the logs from an Apache server and store them on S3.

Background

We have the following directive in httpd.conf:

ErrorLog "| /usr/bin/tee -a /var/log/httpd/error_log | /usr/bin/java -cp /usr/local/bin/CustomProducer/producer-1.0-SNAPSHOT-jar-with-dependencies.jar stdin.producer.StdInProducer /usr/local/bin/CustomProducer/Config.json >> /var/log/producer_init.log 2>&1"

This writes each log entry to the error_log file and also to standard output, where it is piped to a Java producer for Apache Kafka.

The producer eventually sends the data to the Kafka cluster and then to Amazon S3.

The error_log file is rotated by logrotate, and the rotated files are also stored on S3.

Producer Code

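this.stdinReader = new BufferedReader(new InputStreamReader(System.in));
try {
    // readLine() returns null only when stdin reaches end of stream
    while ((msg = this.stdinReader.readLine()) != null) {
        // Some processing which may introduce some delay
        // (this step builds `message` from `msg`)
        // Send message to cluster
        this.producer.send(message);
    }
} catch (IOException e) {
    // The original snippet had no catch block; printing the error here
    // sends it to producer_init.log through the 2>&1 redirect
    e.printStackTrace();
}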

Problem

When the hourly logs in the Kafka bucket are compared with those in the logrotate bucket, some log entries are intermittently missing from the Kafka bucket, with no specific pattern or time.

Is this likely due to a pipe limit or a BufferedReader limit? How can I find out?

Albatross
  • Just an idea. When it comes to a normal linux pipe, the slowest pipe consumer usually blocks producer's output, which might not be the case with Apache logging subsystem as a whole and/or ErrorLog directive in particular. – andbi Mar 16 '17 at 00:04

1 Answer

No. Not even slightly. The Reader is exactly as reliable as the underlying pipe or socket. If it's TCP it can't lose data without resetting the connection.
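
To illustrate the point, here is a minimal sketch, assuming an in-process java.io pipe with a deliberately tiny buffer as a stand-in for the real OS pipe between Apache and the producer. The class name SlowPipeReaderDemo, the method countLinesThroughPipe and the "log line N" payload are hypothetical and exist only for this demonstration. A writer thread pushes lines as fast as it can while a BufferedReader consumes them slowly; the reader checks that every line arrives, in order.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the producer in the question.
class SlowPipeReaderDemo {

    // Writes totalLines through a pipe of pipeBufferBytes and reads them back
    // slowly; returns how many lines the reader received, in order.
    static int countLinesThroughPipe(int totalLines, int pipeBufferBytes, long delayMillis) {
        try {
            PipedOutputStream out = new PipedOutputStream();
            PipedInputStream in = new PipedInputStream(out, pipeBufferBytes);

            Thread writer = new Thread(() -> {
                // Autoflush pushes every line into the pipe; when the pipe is
                // full, the write blocks until the reader makes room.
                try (PrintWriter w = new PrintWriter(
                        new OutputStreamWriter(out, StandardCharsets.UTF_8), true)) {
                    for (int i = 0; i < totalLines; i++) {
                        w.println("log line " + i);
                    }
                }
            });
            writer.start();

            int received = 0;
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // Fail loudly if a line is missing or out of order.
                    if (!line.equals("log line " + received)) {
                        throw new IllegalStateException("unexpected line: " + line);
                    }
                    received++;
                    Thread.sleep(delayMillis); // simulate slow processing/sending
                }
            }
            writer.join();
            return received;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int received = countLinesThroughPipe(200, 64, 1);
        System.out.println(received + " of 200 lines received");
    }
}
```

With a 64-byte pipe the writer is stalled most of the time, yet the count returned should equal the number of lines written. This only shows that a slow BufferedReader loop does not drop data by itself; it says nothing about what Apache does when its piped logger stalls, or about whether this.producer.send(message) actually delivers each message.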

user207421
  • So what happens if there is some delay in sending the message within the loop before it can read the next message? Will it have enough buffer to hold those messages? – Albatross Mar 15 '17 at 23:56
  • TCP has flow control. Ultimately the sender will stall, or be told to try again, depending on how it is written. – user207421 Mar 16 '17 at 00:00