
I use Loki with Promtail in Docker Swarm to collect container logs from 3 hosts. Promtail is deployed in global mode. After deploying the stack file, logs from all running services show up in Grafana, but after some time (several days) some of the container logs disappear. There were also some Internet connectivity issues; although all services restarted afterwards, not all container logs reappeared.

docker-stack.yml

  loki:
    image: grafana/loki:latest
    logging:
      driver: json-file
      options:
        tag: "docker/loki"
    volumes:
      - ./loki/loki-config.yaml:/etc/loki/loki-config.yaml
      - loki:/data/loki
    command: -config.file=/etc/loki/loki-config.yaml
    networks:
      - monitor-net
      - traefik
    deploy:
      placement:
        constraints:
          - node.role==manager
      labels:
        - "traefik.enable=true"
        - traefik.docker.network=default_traefik
        - traefik.http.routers.loki-http.rule=Host(`swarm.loki`)
        - traefik.http.routers.loki-http.entrypoints=http
        - traefik.http.routers.loki-http.middlewares=https-redirect

        - traefik.http.routers.loki-https.rule=Host(`swarm.loki`)
        - traefik.http.routers.loki-https.entrypoints=https
        - traefik.http.routers.loki-https.tls=true
        - traefik.http.routers.loki-https.tls.certresolver=le
        - traefik.http.services.loki.loadbalancer.server.port=3100
      restart_policy:
        condition: on-failure

  promtail:
    image: grafana/promtail:latest
    volumes:
      - /var/log:/var/log
      - /var/lib/docker/containers:/var/lib/docker/containers
      - ./promtail:/etc/promtail-config/
    command: -config.file=/etc/promtail-config/promtail-config.yaml
    networks:
      - traefik
      - monitor-net
    logging:
      driver: json-file
      options:
        tag: "docker/promtail"
    deploy:
      mode: global

promtail-config.yaml

server:
  http_listen_port: 3100
  grpc_listen_port: 0
positions:
  filename: /tmp/positions.yaml
client:
  url: http://loki:3100/api/prom/push

scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - 192.168.56.103 # each host has its own IP here
        labels:
          job: varlogs
          __path__: /var/log/*log

  - job_name: containers
    static_configs:
      - targets:
          - 192.168.56.103
      - labels:
          job: containerlogs
          hostname: vm2
          __path__: /var/lib/docker/containers/*/*log

    pipeline_stages:

      - json:
          expressions:
            stream: stream
            attrs: attrs
            tag: attrs.tag
            hostname: hostname
      - labels:
          tag:
          hostname:
          stream:

loki-config.yaml

auth_enabled: false

server:
  http_listen_port: 3100

common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2022-02-05
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
limits_config:
  ingestion_rate_mb: 15
  ingestion_burst_size_mb: 20

What is the problem and what could be the solution?

Thanks in advance.

mmynich
  • Loki should be the component that Grafana queries and Promtail pushes logs to. What retention (and persistence) do you have configured in Loki? – Chris Becke Feb 23 '22 at 11:53
  • Please provide enough code so others can better understand or reproduce the problem. – Community Feb 23 '22 at 13:04
  • @ChrisBecke I added my configs to the post, but the retention time is not set. Do you think the problem is hidden there? – mmynich Feb 23 '22 at 13:13
  • I can't say for sure. Loki has lots of indexes, caches and storage options. But, perusing the config, data is going to go missing somewhere between 24 and 168 hours if you don't explicitly control it. – Chris Becke Feb 23 '22 at 13:38
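
On the persistence point raised in the comments, one detail in the posted files may be worth checking, offered as a sketch rather than a confirmed diagnosis. The stack file mounts the `loki` named volume at `/data/loki`, while `loki-config.yaml` writes everything under `path_prefix: /loki` (`/loki/chunks`, `/loki/rules`). If that reading is right, the chunks and index would live in the container's writable layer rather than in the volume, so they could be lost whenever Swarm replaces the Loki task with a new container. A minimal variant of the `loki` service, assuming the config paths stay as posted, would mount the volume where Loki actually writes:

```yaml
  loki:
    image: grafana/loki:latest
    volumes:
      - ./loki/loki-config.yaml:/etc/loki/loki-config.yaml
      # Assumption: loki-config.yaml keeps path_prefix: /loki, so the named
      # volume is mounted at /loki instead of /data/loki.
      - loki:/loki
    command: -config.file=/etc/loki/loki-config.yaml
```

This only addresses where the data is stored. It does not set a retention period, which would still need to be configured explicitly in Loki.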

0 Answers