6

I am trying to publish number of events my App is processing. This is the code I have at the receiving end:

public void process(List<String> batch) {
    logger.info ("Processing batch of size " + batch.size())
    metrics.incrementCounter(MetricsType.CONSUMER_TOTAL_PROCESSED, batch.size)
}

The class Metrics is:

public class Metrics {
    private static final Map<MetricsType, Counter> COUNTER_MAP = new ConcurrentHashMap<>();

    public Metrics(
        @Autowired MeterRegistry meterRegistry
    ) {
        COUNTER_MAP.put(
            MetricsType.CONSUMER_TOTAL_PROCESSED,
            Counter.builder("CONSUMER_TOTAL_PROCESSED").register(meterRegistry)
        );

        COUNTER_MAP.put(
            MetricsType.CONSUMER_DUPLICATE_PROCESSED,
            Counter.builder("CONSUMER_DUPLICATE_PROCESSED").register(meterRegistry)
        );
    }

    public void increment(MetricsType metricsType, int size) {
        COUNTER_MAP.get(metricsType).increment(size);
    }
}

The enum MetricsType contains all type of counters.

The process method is be invoked by 16 threads at any time. The issue I am facing is the logger which prints the count and the total count reported in grafana are way off.

Do I have to synchonize everytime I am incrementing the counter?


Edit - What I mean by the counts are off is, if there are two logs with size 200, then grafana should report total counter 400. I am validating this by taking a time range of 2 hours, I extract all the sizes from logs and add them.


If you stumble upon this and see a difference between two sources please check what is your maximum number of data points in Grafana, that was the actual issue why I thought counters may not be threadsafe.

NewUser
  • 3,729
  • 10
  • 57
  • 79

3 Answers3

8

The counters are thread safe, internally they use some atomic data structure for counting. Given your code I would suspect that what the logger prints is quite different from what you see in grafana because the logger prints the size of the current batch while the counter prints the sum of all batch sizes.

Jan Thomä
  • 13,296
  • 6
  • 55
  • 83
  • So what does Grafana show? Less or more? How often do you scrape data from your app? Could it be a timing issue? – Jan Thomä Mar 12 '20 at 12:11
  • Can you please add a reference to your claim about thread-safety? And even if you copy-paste actual source code, this would be an implementation detail and anyone programming against this version of the source code how it is today is making a bold assumption. We need a reference to actual docs, preferably JavaDoc. – Martin Andersson Mar 16 '20 at 16:20
  • Well I actually just checked the implementation. You are right that the Javadoc does not give such a guarantee. Then again I would argue that a metrics library which is not thread-safe would not be very useful. – Jan Thomä Mar 16 '20 at 20:45
  • I understand your point, but I would argue any library without documentation is not very useful. Usually if someone can't document and then write clean code then he probably started writing spaghetti code and discovers at some point in time in the future that the code is such a mess it can't even be documented. So not having proper docs is usually a tell-tale sign of what kind of "quality" you pull into your own project. Thread-safety is SUPER important - especially for a metrics library. That this is not documented anywhere is very sad. They need to take a look at their priorities. – Martin Andersson Jun 18 '20 at 14:06
  • Actually I just came back to this problem again. Because now I am working with dynamic meters that needs to be removed. So I need to know what memory visibility guarantees do I have, I clearly don't want to remove a meter before it has been scraped, and so on. Again, sad. – Martin Andersson Jun 18 '20 at 14:07
  • Well I understand your concerns. It's not like you have a lot of options here though, so sometimes you will just need to make do with what you have. Then again it's an open source project so if you have the resources and inclination you could send them some pull request and improve the package. – Jan Thomä Jun 22 '20 at 11:50
0

Counters are monotonically incrementing. So if a previous batch has occurred, then a new batch would add to the previous total.

What back end are you using? If you are using Prometheus you might wrap your query in an increase function so you can visualize how much the counter is increasing.

(To answer the original question though: Yes, counters are thread safe)

checketts
  • 14,167
  • 10
  • 53
  • 82
0

There are two levels of concurrency

  1. concurrent access to the meter registry that holds the counter reference. This is thread safe
    private final Map<Meter.Id, Meter> meterMap = new ConcurrentHashMap();
  1. concurrent access to the counter itself. Since in micrometer, the backing counter is from the registered framework (e.g. say netflix atlas). One would "safely" assume counter implementation would be thread safe because it would otherwise be useless. But for sake of argument, we can see that one such exampe, the SimpleMeterRegistry implementation of a cumulative counter provided by micrometer.io is thread-safe, as it uses JDK DoubleAdder as its backing implementation
public class CumulativeCounter extends AbstractMeter implements Counter {
    private final DoubleAdder value = new DoubleAdder();

And again, the answer depends on the registry, even though we can safely assume all published registry guarantees counter-level thread-safely. Or, to get a definite answer, we need to a definite question

Are counters from Netflix Atlas registry thread safe in Micrometer
ROTOGG
  • 1,106
  • 1
  • 7
  • 17
  • Not pointless to have a non thread safe. You could have a threading model where a single thread and an event loop is used, and thus synchronization is not needed and yet consumes most of the performance gains on this model. – Juan Carrey Jun 02 '23 at 06:31