
I'm trying to import a history of performance test results into Prometheus and have run into a strange issue with the official Python Prometheus client.

This code works correctly:

dt_now = datetime.datetime.now(tz=pytz.timezone('UTC'))
gobj = GaugeMetricFamily('FooMetricGood', '')
gobj.add_metric([], 123, timestamp=dt_now.timestamp())
yield gobj

And this code does not:

dt_format = '%Y-%m-%d_%H-%M-%S.%f %z'
dt_custom_str = '2021-11-11_18-12-59.000000 +0000'
dt_parsed_from_custom = datetime.datetime.strptime(dt_custom_str, dt_format)
gobj = GaugeMetricFamily('FooMetricNotWorking', '')
gobj.add_metric([], 789987, timestamp=dt_parsed_from_custom.timestamp())
yield gobj

I get warnings like this in the Prometheus log:

prometheus-prometheus-1  | ts=2021-11-11T13:41:01.895Z caller=scrape.go:1563 level=warn component="scrape manager" scrape_pool=services target=http://192.168.64.1:8080/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=1
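
One way to sanity-check the non-working snippet is to compare the parsed timestamp with the scrape time reported in the warning. The sketch below uses only the standard library. The scrape time is copied from the `ts=` field of the log line above, and the helper name `offset_from_scrape` is made up for illustration; it is not part of the Prometheus client.

```python
import datetime

DT_FORMAT = '%Y-%m-%d_%H-%M-%S.%f %z'


def offset_from_scrape(sample_str, scrape_time):
    """Return sample time minus scrape time (positive means the sample is in the future)."""
    sample_time = datetime.datetime.strptime(sample_str, DT_FORMAT)
    return sample_time - scrape_time


# Scrape time taken from the `ts=` field of the Prometheus warning.
scrape_time = datetime.datetime(2021, 11, 11, 13, 41, 1, 895000,
                                tzinfo=datetime.timezone.utc)
sample_str = '2021-11-11_18-12-59.000000 +0000'

# Epoch seconds, i.e. the value passed to add_metric(timestamp=...).
sample_epoch = datetime.datetime.strptime(sample_str, DT_FORMAT).timestamp()
offset = offset_from_scrape(sample_str, scrape_time)

print(sample_epoch)
print(offset)
```

With these inputs the sample lies roughly four and a half hours ahead of the scrape time, which would be consistent with the "too far into the future" part of the warning.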

I raised the issue on GitHub here, with full working and non-working code examples, about two weeks ago, but got no answers.

Any help would be much appreciated.

UPDATE: After the answers and comments I rechecked, and here are some more details.

If I expose two metrics stamped with the current datetime (see the code example in the GitHub link above) and one stamped with a datetime parsed from the string '2021-12-13_00-34-59.000000 +0000', all of them appear in the Prometheus Python client's web interface as:

# HELP FooMetricGood 
# TYPE FooMetricGood gauge
FooMetricGood 123.0 1639475119451
# HELP FooMetricGoodToo 
# TYPE FooMetricGoodToo gauge
FooMetricGoodToo 456.0 1639475119451
# HELP FooMetricNotWorkingNew 
# TYPE FooMetricNotWorkingNew gauge
FooMetricNotWorkingNew 789987.0 1639355699000
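
To read the timestamps in this output more easily, the sample lines can be parsed and converted to UTC datetimes. This is a minimal sketch under stated assumptions: it handles only label-free lines of the form `name value timestamp_ms` as shown above, and the function name `parse_samples` is hypothetical.

```python
import datetime

EXPOSITION = """\
# HELP FooMetricGood
# TYPE FooMetricGood gauge
FooMetricGood 123.0 1639475119451
# HELP FooMetricGoodToo
# TYPE FooMetricGoodToo gauge
FooMetricGoodToo 456.0 1639475119451
# HELP FooMetricNotWorkingNew
# TYPE FooMetricNotWorkingNew gauge
FooMetricNotWorkingNew 789987.0 1639355699000
"""

EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def parse_samples(text):
    """Map metric name -> (value, UTC datetime) for label-free sample lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and HELP/TYPE comments
        name, value, ts_ms = line.split()
        # Integer milliseconds via timedelta avoids float rounding.
        when = EPOCH + datetime.timedelta(milliseconds=int(ts_ms))
        samples[name] = (float(value), when)
    return samples


samples = parse_samples(EXPOSITION)
for name, (value, when) in samples.items():
    print(name, value, when.isoformat())
```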

But in the Prometheus server log I see:

prometheus-prometheus-1  | ts=2021-12-14T09:51:35.524Z caller=scrape.go:1611 level=debug component="scrape manager" scrape_pool=services target=http://192.168.64.1:8080/metrics msg="Out of bounds metric" series=FooMetricNotWorkingNew
prometheus-prometheus-1  | ts=2021-12-14T09:51:35.524Z caller=scrape.go:1563 level=warn component="scrape manager" scrape_pool=services target=http://192.168.64.1:8080/metrics msg="Error on ingesting samples that are too old or are too far into the future" num_dropped=1

As far as I understand, timestamps in the Prometheus Python client output are in milliseconds, so I compared them:

FooMetricGood 1639475119451
FooMetricNotWorkingNew 1639355699000

and got:

1639475119451 - 1639355699000 = 119420451 milliseconds = (119420451 / 1000 / 60 / 60) hours = 33.1723475 hours

So according to the Prometheus Python client, the current time is only about 33 hours after the bad metric's timestamp.
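
The same arithmetic as a small Python check, using the two millisecond timestamps from the client output. The helper name `age_in_hours` is made up for this illustration.

```python
MS_PER_HOUR = 60 * 60 * 1000


def age_in_hours(newer_ms, older_ms):
    """Difference between two millisecond timestamps, expressed in hours."""
    return (newer_ms - older_ms) / MS_PER_HOUR


good_ms = 1639475119451  # FooMetricGood
bad_ms = 1639355699000   # FooMetricNotWorkingNew

diff_ms = good_ms - bad_ms
hours = age_in_hours(good_ms, bad_ms)

print(diff_ms, round(hours, 4))
```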

I tried changing the date to 2021-12-14_08-34-59.000000 +0000, so that the timestamp is now only 1.2913288888888887 hours before the present time, but it still does not work.

Tamil Selvan
Dmitriy Vinokurov
  • Did you check whether the server time is up to date and whether the `ntp` service is running? From what I read, that could be the cause, but I was not able to reproduce this issue on my end even after switching off `ntp` sync. – Shod Dec 09 '21 at 13:52
  • Were you able to determine what the issue was based on my answer and references below? – pygeek Dec 14 '21 at 05:28

2 Answers


I was able to successfully execute your code and query the metric in Prometheus: https://replit.com/@pygeek1/ComplicatedAcidicAxis#main.py

I suspect that there is an issue with the Prometheus server. Ensure that the server time is set correctly. A similar issue, caused by the server time constantly changing, was reported over a year ago: https://github.com/prometheus/prometheus/issues/6554

pygeek
  • Your replit screen shows the Prometheus client, which works fine for me too. My trouble is with the Prometheus server, and I've added more details to the question. – Dmitriy Vinokurov Dec 14 '21 at 09:56
  • Thank you for your help. Although your answer does not contain a direct solution, it helped me find the real cause of the trouble, so I gave you the bounty. – Dmitriy Vinokurov Dec 14 '21 at 11:34

It seems that I've found the source of the trouble.

I found these Prometheus server settings:

  scrape_interval: 1m
  scrape_timeout: 10s

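and here is a description of them (the quoted text uses 5 minutes and 30 seconds as its example values, not the values from my settings):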

Every 5 minutes (scrape_interval) Prometheus will get the metrics from the given URL. It will try 30 seconds (scrape_timeout) to get the metrics if it can't scrape in this time it will time out.

So even though my metrics are only one hour old, that is still too old for the server.

The trouble now is that scrape_timeout cannot be set higher than scrape_interval. So for one-month-old data, it seems I would need to wait one month for the server to gather these metrics.

I'm tired of Prometheus, and most sources I've read say that it is very hard to import history into it, so I've switched to VictoriaMetrics. Using the same Prometheus Python client and increasing the VictoriaMetrics option -search.cacheTimestampOffset, for example to 10080m0s (1 week), I easily imported a test data set to prove that it is possible.

Dmitriy Vinokurov