Data loss while sending from Fluentd to AWS Kinesis Firehose

Question

We are using Fluentd to send logs to AWS Kinesis Firehose. Every now and then, a few records are not delivered to Kinesis Firehose. Here are our Fluentd settings:

    <system>
      log_level info
    </system>
    <source>
      @type tail
      path "/var/log/app/tracy.log*"
      pos_file "/var/tmp/tracy.log.pos"
      pos_file_compaction_interval 72h
      @log_level "error"
      tag "tracylog"
      <parse>
        @type "json"
        time_key False
      </parse>
    </source>
    <source>
      @type monitor_agent
      bind 127.0.0.1
      port 24220
    </source>
    <match tracylog>
      @type "kinesis_firehose"
      region "${awsRegion}"
      delivery_stream_name "${delivery_stream_name}"
      <instance_profile_credentials>
      </instance_profile_credentials>
      <buffer>
        # Frequency of ingestion
        flush_interval 30s
        flush_thread_count 4
        chunk_limit_size 1m
      </buffer>
    </match>

Have you checked the creation time of those missing records? I am facing a similar issue: if I have concurrent data points, only one of them is delivered to the destination. Is that your case too? — Lina, Sep 23 '21 at 12:49
Hi Lina, I fixed this issue by updating my config file. There were two root causes in my case: (1) the flush interval was very high and the chunk size was small, so a lot of chunks were waiting in the queue to be flushed; (2) my application starts up much faster than the fluentd process, so whenever a scaling event was triggered, the first few records were missed. — Dhirendra Rawal, Oct 01 '21 at 11:40

score 0 · Answer 1 · edited Oct 01 '21 at 16:37

A few changes in the config fixed my issue:

    <system>
      log_level info
    </system>
    <source>
      @type tail
      path "/var/log/app/tracy.log*"
      pos_file "/var/tmp/tracy.log.pos"
      pos_file_compaction_interval 72h
      # Read each file from its beginning, so lines written before
      # fluentd starts are still picked up
      read_from_head true
      # Track files by inode; recommended when path contains a wildcard
      follow_inodes true
      @log_level "error"
      tag "tracylog"
      <parse>
        @type "json"
        time_key False
      </parse>
    </source>
    <source>
      @type monitor_agent
      bind 127.0.0.1
      port 24220
    </source>
    <match tracylog>
      @type "kinesis_firehose"
      region "${awsRegion}"
      delivery_stream_name "${delivery_stream_name}"

      <instance_profile_credentials>
      </instance_profile_credentials>
      <buffer>
        # Flush every 2 seconds instead of every 30, with more flush threads
        flush_interval 2
        flush_thread_interval 0.1
        flush_thread_burst_interval 0.01
        flush_thread_count 8
      </buffer>
    </match>
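
To check whether chunks are still piling up in the queue (the first root cause mentioned in the comments), the monitor_agent source already present in the config can be polled. The following is only a minimal sketch: it assumes monitor_agent's `/api/plugins.json` endpoint and the `output_plugin`, `buffer_queue_length` and `retry_count` fields it reports. The function names and the `max_queue_length` threshold are hypothetical and chosen for illustration; they are not fluentd settings.

```python
import json
from urllib.request import urlopen

# Address of the monitor_agent <source> from the config above.
MONITOR_URL = "http://127.0.0.1:24220/api/plugins.json"


def fetch_plugins(url=MONITOR_URL, timeout=5):
    """Fetch and parse the plugin metrics exposed by monitor_agent."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def find_backlogged_outputs(payload, max_queue_length=10):
    """Return output plugins with a long buffer queue or any retries.

    `payload` is the parsed JSON from monitor_agent.
    `max_queue_length` is an illustrative threshold (assumption).
    """
    flagged = []
    for plugin in payload.get("plugins", []):
        # Skip inputs such as tail and monitor_agent itself.
        if plugin.get("output_plugin") is not True:
            continue
        # Non-buffered outputs may report null for these fields.
        queue_length = plugin.get("buffer_queue_length") or 0
        retry_count = plugin.get("retry_count") or 0
        if queue_length > max_queue_length or retry_count > 0:
            flagged.append({
                "type": plugin.get("type"),
                "buffer_queue_length": queue_length,
                "retry_count": retry_count,
            })
    return flagged


if __name__ == "__main__":
    # Requires a running fluentd with monitor_agent enabled.
    for item in find_backlogged_outputs(fetch_plugins()):
        print(item)
```

If the `kinesis_firehose` output keeps showing up here after the change, the buffer settings are probably still too conservative for the log volume.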
