update-1: I have made some progress on this by defining a concrete data_stream_name in the match block. The only thing left is to figure out a way to create data streams dynamically. I have updated the code sample below and marked what I added.

I have an EFK stack and I want to use data streams to roll over indices. When I use the configuration below in the output plugin, I get this error:

The client is unable to verify that the server is Elasticsearch. Some functionality may not be compatible if the server is running an unsupported product.
2022-07-27 17:02:40 +0000 [warn]: #0 failed to flush the buffer. retry_times=90 next_retry_time=2022-07-27 17:03:12 +0000 chunk="5e4cbcab11eb8e7fd1b93d4aa706fb67" error_class=Fluent::ConfigError error="Failed to create data stream: <logs-abc-def-2022.07.27> Connection refused - connect(2) for 127.0.0.1:9200 (Errno::ECONNREFUSED)"
2022-07-27 17:02:40 +0000 [warn]: #0 suppressed same stacktrace

update-1: The above error is resolved, and I am changing the title of the question to reflect the new requirement. Is there a way to create data streams dynamically, one per namespace? If I use data_stream_name logs-${$.kubernetes.namespace_name} I get the error above, but a concrete name such as logs-all-namespaces works.

<match <pattern1> <pattern2> >
    @type elasticsearch_data_stream
    @log_level info
    prefer_oj_serializer true
    log_es_400_reason true
    include_tag_key true
    tag_key tag_fluentd
    hosts "#{ENV['ELASTICSEARCH_HOSTS']}"
    user "#{ENV['ELASTICSEARCH_USERNAME']}"
    password "#{ENV['ELASTICSEARCH_PASSWORD']}"
    scheme "https"
    ssl_version "TLSv1_2"
    ssl_verify false
    reload_connections false
    reconnect_on_error true
    reload_on_failure true
    request_timeout 15s
    logstash_format false
    # logstash_prefix logs-${$.kubernetes.namespace_name}
    time_key time_docker_log
    include_timestamp true
    suppress_type_name true
    template_name "hot-warm-cold-delete-30d"
    ilm_policy_id "hot-warm-cold-delete-30d"
    data_stream_name logs-all-namespaces # update 1: changed from logs-${$.kubernetes.namespace_name} to logs-all-namespaces
    enable_ilm true
    # ilm_policy_overwrite false
    # template_overwrite true
    # template_pattern logs-${$.kubernetes.namespace_name}-*
    # index_name logs-${$.kubernetes.namespace_name}

    <buffer time, tag, $.kubernetes.namespace_name>
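        # Placeholders such as ${tag} and ${$.kubernetes.namespace_name} only
        # resolve in output parameters when they are listed here as chunk keys.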
        @type file
        timekey 10
        path /data/fluentd-buffers/kubernetes.system.buffer.es
        ## Retrying control
        retry_type exponential_backoff # Specifies how to wait for the next retry to flush buffer. Default
        retry_forever true # Plugin will ignore retry_timeout and retry_max_times options and retry flushing forever.
        retry_max_interval 30 # The maximum interval (seconds) for exponential backoff between retries while failing.
        total_limit_size 512M # The size limitation of this buffer plugin instance. Default 512M
        ## buffering params
        chunk_limit_size 64M # The max size of each chunk: events are written into a chunk
                             # until it reaches this size. Default 8MB
        chunk_limit_records 5000 # The max number of events that each chunks can store in it
        chunk_full_threshold 0.85 # The chunk-size threshold for flushing: the output
                                  # plugin flushes a chunk once its actual size reaches
                                  # chunk_limit_size * chunk_full_threshold
        # Total queued size if queue_limit_length were enabled: 64MiB/chunk * 32 chunks = 2GiB
        # queue_limit_length 32
        ## flushing params
        flush_thread_count 8 # The number of threads to flush the buffer. Default 1
        flush_interval 5s # The interval between buffer chunk flushes. Default 60
        flush_mode interval # Flushes per flush interval 
        overflow_action block # This mode stops input plugin thread until buffer full issue is resolved
    </buffer>
</match>

# Send pattern3 logs to rabbitmq
<match <pattern3>>
    @type rabbitmq
    host "#{ENV['RABBITMQ_HOST']}"
    user "#{ENV['RABBITMQ_WRITER_USERNAME']}"
    pass "#{ENV['RABBITMQ_WRITER_PASSWORD']}"
    vhost /
    format json
    exchange raw
    exchange_type direct
    exchange_durable true
    routing_key raw
    timestamp true
    heartbeat 10
    <buffer time, tag, $.kubernetes.namespace_name>
        @type file
        timekey 10
        path /data/fluentd-buffers/kubernetes.system.buffer.rabbitmq
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 4
        flush_interval 5s
        retry_forever true
        retry_max_interval 30
        chunk_limit_size 16M
        total_limit_size 512M
        chunk_full_threshold 0.85
        overflow_action block
    </buffer>
</match>

gem list output (only relevant libraries included):

elastic-transport (8.0.0)
elasticsearch (8.2.0)
elasticsearch-api (8.2.0)
fluent-plugin-elasticsearch (5.2.2)
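
For context on the first error: the "unable to verify that the server is Elasticsearch" message comes from the product check in the 8.x elasticsearch-ruby client, and it typically shows up when the client gems' major version does not match the Elasticsearch server version, so the versions above have to line up with the cluster.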

index template (hot-warm-cold-delete-30d)

{
  "template": {
    "settings": {
      "index": {
        "lifecycle": {
          "name": "hot-warm-cold-delete-30d"
        },
        "routing": {
          "allocation": {
            "include": {
              "_tier_preference": "data_hot"
            }
          }
        }
      }
    },
    "aliases": {},
    "mappings": {}
  }
}
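
For a data stream to be created at all, this body has to be part of a composable index template that declares a data stream and matches the stream name. A minimal sketch of the full request, assuming the ILM policy above and a logs-* pattern (the template name and priority here are illustrative):

PUT _index_template/hot-warm-cold-delete-30d
{
  "index_patterns": ["logs-*"],
  "data_stream": {},
  "priority": 200,
  "template": {
    "settings": {
      "index.lifecycle.name": "hot-warm-cold-delete-30d",
      "index.routing.allocation.include._tier_preference": "data_hot"
    }
  }
}

A priority above 100 keeps the template ahead of the built-in logs template.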

ILM policy (hot-warm-cold-delete-30d)

{
  "policy": {
    "phases": {
      "hot": {
        "min_age": "0ms",
        "actions": {
          "rollover": {
            "max_size": "5gb",
            "max_age": "10m"
          },
          "set_priority": {
            "priority": 100
          }
        }
      },
      "warm": {
        "min_age": "10m",
        "actions": {
          "set_priority": {
            "priority": 50
          }
        }
      },
      "cold": {
        "min_age": "2d",
        "actions": {
          "set_priority": {
            "priority": 0
          }
        }
      },
      "delete": {
        "min_age": "365d",
        "actions": {
          "delete": {
            "delete_searchable_snapshot": true
          }
        }
      }
    }
  }
}
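
Note that with data streams, ILM rolls over the stream's backing write index directly, so no index.lifecycle.rollover_alias setting is needed, unlike with plain ILM-managed indices. A quick way to check that the policy is attached and rolling over, assuming the concrete stream name used above:

GET logs-all-namespaces/_ilm/explain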

Please let me know if any more information is needed. I have been stuck on this for a week now.

Rishi

1 Answer

Regarding the dynamic placeholders, have you tried the solution mentioned in the issue about dynamic data stream names for the fluentd-es plugin?

For your use case, that would basically be:

<filter **>
  @type record_transformer
  enable_ruby
  <record>
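    # Promote the nested field to a top-level key so it can be used both as a
    # buffer chunk key and as the ${kuber_namespace} placeholder in the output.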
    kuber_namespace ${record["kubernetes"]["namespace_name"]}
  </record>
</filter>
<match <pattern1> <pattern2> >
  @type elasticsearch_data_stream
  data_stream_name logs-${kuber_namespace}
  ... 
  <buffer tag, kuber_namespace>
    .... 
  </buffer>
</match>
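
If the placeholder support from that issue works as described, each chunk keyed on kuber_namespace resolves logs-${kuber_namespace} to a concrete name, and the plugin creates the matching data stream on first write. Assuming an index template that matches logs-* (like the sketch earlier in the question), the per-namespace streams can then be listed with:

GET _data_stream/logs-*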
qaziqarta
  • even if I do this, the template won't roll over, as the template requires a rollover alias, which is different for each index – Rishi Aug 03 '22 at 17:23