
I'm currently evaluating Loki and facing issues with running out of disk space due to the volume of chunks.

My instance is running in Docker containers using a docker-compose setup (Loki, Promtail, Grafana) from the official documentation (see docker-compose.yml below).

I'm more or less using the default configuration of Loki and Promtail, except for some tweaks to the retention period (I need 3 months) plus a higher ingestion rate and ingestion burst size (see configs below).

I bind-mounted a volume containing 1 TB of log files (MS Exchange logs) and set up a job in Promtail using only one label.

The resulting chunks are constantly eating up disk space, and I had to expand the VM's disk incrementally up to 1 TB.

Currently, I have 0.9 TB of chunks. Shouldn't this be far less (like 25% of the initial log size)? Over the last weekend, I stopped the Promtail container to prevent running out of disk space. Today I started Promtail again and got the following warning:

level=warn ts=2022-01-24T08:54:57.763739304Z caller=client.go:349 component=client host=loki:3100 msg="error sending batch, will retry" status=429 error="server returned HTTP status 429 Too Many Requests (429): Ingestion rate limit exceeded (limit: 12582912 bytes/sec) while attempting to ingest '2774' lines totaling '1048373' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased"

I had this warning before, and increasing ingestion_rate_mb to 12 and ingestion_burst_size_mb to 24 fixed it...
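
For reference, these are the knobs involved. The values below are only a sketch of how the limits and Promtail's retry backoff could be tuned (illustrative numbers, not what I'm actually running; my full configs follow):

# Loki, limits_config (excerpt; values are illustrative)
limits_config:
  ingestion_rate_mb: 16
  ingestion_burst_size_mb: 32

# Promtail, client section (sketch; I believe these are the documented defaults)
clients:
  - url: http://loki:3100/loki/api/v1/push
    backoff_config:
      min_period: 500ms   # initial wait before a batch is retried (e.g. after a 429)
      max_period: 5m      # upper bound for the exponential backoff
      max_retries: 10     # a batch is dropped after this many failed retries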

I'm kind of at a dead end here.

Docker Compose

version: "3"

networks:
  loki:

services:

  loki:
    image: grafana/loki:2.4.1
    container_name: loki
    restart: always
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - ${DATADIR}/loki/etc:/etc/loki:rw
    networks:
      - loki

  promtail:
    image: grafana/promtail:2.4.1
    container_name: promtail
    restart: always
    volumes:
      - /var/log/exchange:/var/log
      - ${DATADIR}/promtail/etc:/etc/promtail
    ports:
      - "1514:1514" # for syslog-ng
      - "9080:9080" # for http web interface
    command: -config.file=/etc/promtail/config.yml
    networks:
      - loki

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    restart: always
    volumes:
      - grafana_var:/var/lib/grafana
    ports:
      - "3000:3000"
    networks:
      - loki

volumes:
  grafana_var:

Loki Config

server:
  http_listen_port: 3100

common:
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h

ruler:
  alertmanager_url: http://localhost:9093

# https://grafana.com/docs/loki/latest/configuration/#limits_config
limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  ingestion_rate_mb: 12
  ingestion_burst_size_mb: 24
  per_stream_rate_limit: 12MB
chunk_store_config:
  max_look_back_period: 336h
table_manager:
  retention_deletes_enabled: true
  retention_period: 2190h
ingester:
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_encoding: snappy

Promtail Config

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
- job_name: exchange
  static_configs:
  - targets:
      - localhost
    labels:
      job: exchangelog
      __path__: /var/log/*/*/*log

– Selwyn Rogers

2 Answers


Issue was solved. The logs were stored on ZFS with compression enabled and were therefore listed as much smaller on the file system than they actually are. The chunk size was in fact accurate. My bad.

– Selwyn Rogers

  Also, I see `boltdb-shipper` but no [compactor](https://grafana.com/docs/loki/latest/operations/storage/boltdb-shipper/#compactor). (Please note that I am not a Loki setup authority, but this looks to me like an oversight when dealing with file size...) – Jan 'splite' K. May 31 '22 at 11:20
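
For illustration, a minimal compactor block for a boltdb-shipper/filesystem setup like the one above might look roughly like this (a sketch based on the linked docs; the working directory and interval are illustrative):

compactor:
  working_directory: /loki/compactor   # illustrative path inside the Loki data volume
  shared_store: filesystem             # matches the filesystem object store used above
  compaction_interval: 10m
  # The compactor can also take over chunk retention for boltdb-shipper
  # (retention_enabled: true plus retention_period under limits_config),
  # which would replace the table_manager-based retention in the question's config.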

Grafana Loki creates a chunk file per log stream roughly every 2 hours - see this article and this post at HackerNews. This means that the number of files is proportional to the number of log streams and to the data retention. The number of log streams is proportional to the number of unique sets of log fields (excluding the message and timestamp fields). A high number of chunks therefore points either to a high number of log streams or to logs spread over a long retention period. The solution is to either reduce the number of unique log streams (by removing high-cardinality labels with a large number of unique values) or to reduce the data retention.
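
To illustrate the label-cardinality point with the Promtail config from the question: the single static `job` label keeps the stream count low (roughly one stream per matched file, since Promtail also attaches a `filename` label), while promoting a per-line field to a label multiplies it. The commented-out stages below are a hypothetical anti-pattern; the `client_ip` field and regex are made up for illustration:

scrape_configs:
- job_name: exchange
  static_configs:
  - targets:
      - localhost
    labels:
      job: exchangelog            # one static value -> few streams
      __path__: /var/log/*/*/*log # each matched file still becomes its own stream (filename label)
  pipeline_stages:
    # Hypothetical anti-pattern: turning a per-request value into a label would
    # create a separate stream (and separate chunk files) per unique value.
    # - regex:
    #     expression: 'client=(?P<client_ip>\S+)'
    # - labels:
    #     client_ip: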

– valyala