I have a web server (Ubuntu) running Nginx + PHP.
It runs Filebeat, which sends Nginx logs directly to an Elasticsearch ingest node (no Logstash or anything else).
When I first installed it, I made some customizations to the pipeline that Filebeat created. Everything worked well for a month or so.

But I noticed that every Filebeat upgrade results in the creation of a new pipeline. Currently I have these:

filebeat-7.3.1-nginx-error-pipeline: {},
filebeat-7.4.1-nginx-error-pipeline: {},
filebeat-7.2.0-nginx-access-default: {},
filebeat-7.3.2-nginx-error-pipeline: {},
filebeat-7.4.1-nginx-access-default: {},
filebeat-7.3.1-nginx-access-default: {},
filebeat-7.3.2-nginx-access-default: {},
filebeat-7.2.0-nginx-error-pipeline: {}

I can create a new pipeline, but how do I configure Filebeat to use a specific pipeline?

Here is what I tried; it doesn't work:

- module: nginx
  # Access logs
  access:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    var.paths: ["/var/log/nginx/*/*access.log"]

    # Convert the timestamp to UTC
    var.convert_timezone: true

    # The Ingest Node pipeline ID associated with this input. If this is set, it
    # overwrites the pipeline option from the Elasticsearch output.
    output.elasticsearch.pipeline: 'filebeat-nginx-access-default'
    pipeline: 'filebeat-nginx-access-default'

It is still using the filebeat-7.4.1-nginx-error-pipeline pipeline.

Here are the Filebeat instructions on how to configure it (but I can't make it work): https://github.com/elastic/beats/blob/7.4/filebeat/filebeat.reference.yml#L1129-L1130

Question: how can I configure a Filebeat module to use a specific pipeline?

Update (Nov 2019): I submitted a related bug: https://github.com/elastic/beats/issues/14348

Slavik

3 Answers

1

In the Beats source code, I found that the pipeline ID is determined by the following parameters:

  • beats version
  • module name
  • module's fileset name
  • pipeline filename

The source code snippet is as follows:

// formatPipelineID generates the ID to be used for the pipeline ID in Elasticsearch
func formatPipelineID(module, fileset, path, beatVersion string) string {
    return fmt.Sprintf("filebeat-%s-%s-%s-%s", beatVersion, module, fileset, removeExt(filepath.Base(path)))
}

So you cannot assign the pipeline ID yourself; that would need official support from Elastic.

For now, the pipeline ID changes whenever any of the four parameters changes. You MUST change the pipeline ID in elasticsearch when you upgrade Beats.
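To see how the IDs in the question come out of that format string, here is a minimal, self-contained sketch. The `removeExt` body is an assumption (the quoted snippet does not show it), and the `ingest/pipeline.json` path for the error fileset is inferred from the pipeline names listed in the question rather than taken from the Beats source.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// removeExt is an assumed stand-in for the helper referenced in the quoted
// snippet: it strips the file extension from a file name.
func removeExt(path string) string {
	return strings.TrimSuffix(path, filepath.Ext(path))
}

// formatPipelineID mirrors the function quoted above.
func formatPipelineID(module, fileset, path, beatVersion string) string {
	return fmt.Sprintf("filebeat-%s-%s-%s-%s", beatVersion, module, fileset, removeExt(filepath.Base(path)))
}

func main() {
	// The version is part of the ID, so each upgrade yields a new ID.
	for _, version := range []string{"7.3.2", "7.4.1"} {
		fmt.Println(formatPipelineID("nginx", "access", "ingest/default.json", version))
		// Path assumed for illustration; see the note above.
		fmt.Println(formatPipelineID("nginx", "error", "ingest/pipeline.json", version))
	}
}
```

Because the Beats version is baked into the ID, a version-independent name such as filebeat-nginx-access-default can never be produced by this function.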

waston
  • Thank you for the source code reference. Can you clarify what you mean by "You MUST change the pipeline ID in elasticsearch"? In my experience, nothing needs to change on the Elasticsearch side: all dashboards assume that the pipeline suffixes always change, so they just keep working. So at this point, I think it would be proper to say: "it's impossible to change the pipeline ID in the Filebeat config". – Slavik Nov 12 '19 at 08:07
  • I should have said `should` rather than `MUST`. My mistake. When Filebeat starts, it first checks the status of all the modules' pipelines. If a pipeline ID does not exist, Filebeat creates the pipeline, and the ID is built as `fmt.Sprintf("filebeat-%s-%s-%s-%s", beatVersion, module, fileset, removeExt(filepath.Base(path)))`. So when you upgrade Filebeat, a new pipeline ID is created in Elasticsearch, and the number of pipelines grows every time you do this. – waston Nov 21 '19 at 02:03
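The check-then-create behaviour described in the comment above can be illustrated with a toy model. This is not Filebeat's code: `pipelineStore`, `ensurePipeline`, and `pipelineID` are hypothetical names, and the map only stands in for the pipelines stored in Elasticsearch.

```go
package main

import "fmt"

// pipelineStore is a hypothetical in-memory stand-in for the set of ingest
// pipelines stored in Elasticsearch.
type pipelineStore map[string]bool

// ensurePipeline adds the ID only if it is not present yet and reports
// whether a new pipeline was created.
func (s pipelineStore) ensurePipeline(id string) bool {
	if s[id] {
		return false
	}
	s[id] = true
	return true
}

// pipelineID builds an ID in the same shape as the format string above.
func pipelineID(version, module, fileset, name string) string {
	return fmt.Sprintf("filebeat-%s-%s-%s-%s", version, module, fileset, name)
}

func main() {
	store := pipelineStore{}
	// Simulate starting Filebeat once after each upgrade.
	for _, version := range []string{"7.2.0", "7.3.1", "7.3.2", "7.4.1"} {
		store.ensurePipeline(pipelineID(version, "nginx", "access", "default"))
		store.ensurePipeline(pipelineID(version, "nginx", "error", "pipeline"))
		fmt.Printf("after %s: %d pipelines\n", version, len(store))
	}
}
```

In this model, restarting on the same version creates nothing new, while each upgrade adds two more entries, which matches the eight pipelines listed in the question.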
1

Refer to /{filebeat-HOME}/module/nginx/access/manifest.yml; maybe you should set ingest_pipeline in /{filebeat-HOME}/modules.d/nginx.yml. The value appears to be a local file path.

annld
  • My similar workaround for now is to edit `{filebeat-HOME}/module/nginx/access/ingest/default.json` and set `filebeat.overwrite_pipelines: true`. The problem with this: that file is overwritten every time Filebeat is upgraded (`apt upgrade`), so it needs to be re-edited on every upgrade. – Slavik Nov 20 '19 at 22:06
0

The pipeline can be configured either in your input or output configuration, not in the module configuration.

Your configuration has different sections; the one you show in your question configures the nginx module. You need to open filebeat.yml, find the output section where you have configured Elasticsearch, and put the pipeline configuration there:

#-------------------------- Elasticsearch output ------------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["elk.slavikf.com:9200"]
  pipeline: filebeat-nginx-access-default

If you need to use different pipelines depending on the nature of the data, you can do so using pipeline mappings:

output.elasticsearch:
  hosts: ["elk.slavikf.com:9200"]
  pipelines:
    - pipeline: "nginx_pipeline"
      when.contains:
        type: "nginx"
    - pipeline: "apache_pipeline"
      when.contains:
        type: "apache"
Val
  • I'll try that. But it seems strange: you are suggesting defining a pipeline globally for a non-global module. What if I have 2 modules? Doesn't each module have its own pipeline? – Slavik Nov 07 '19 at 07:03
  • If you need to be able to use different pipelines depending on the nature of data you can definitely do so using [pipeline mappings](https://www.elastic.co/guide/en/beats/filebeat/current/elasticsearch-output.html#pipelines-option-es) – Val Nov 07 '19 at 07:14
  • I tried defining a global pipeline; I tried using pipeline mappings. Same result: Filebeat still uses its own pipeline, not the one I defined. I attached debug logs to the GitHub issue: https://github.com/elastic/beats/issues/14348 – Slavik Nov 07 '19 at 07:54
  • Note that `type: "nginx"` should be `service.type: "nginx"` according to the events I see in the debug log. – Val Nov 07 '19 at 08:14
  • Tried `service.type`. No difference – Slavik Nov 07 '19 at 10:03