
Personal acknowledgement: I read on javacodegeeks: "... SimpleAsyncTaskExecutor is ok for toy projects but for anything larger than that it's a bit risky since it does not limit concurrent threads and does not reuse threads. So to be safe, we will also add a task executor bean..." and on Baeldung a very simple example of how to add our own task executor. But I can't find any guidance explaining the consequences, or worthwhile cases where it applies.

Personal desire: I am working hard to provide a corporate architecture for publishing our microservices' logs to Kafka topics. The statement "risky because it does not limit concurrent threads and does not reuse them" seems reasonable to me, especially for my log-based case.

I am running the code below successfully on my local desktop, but I am wondering whether I am providing a custom task executor properly.

My question: taking into account that I am already using KafkaTemplate (i.e. synchronized, singleton, and thread-safe by default, at least for producing/sending messages, as far as I understand it), does the configuration below really go in the right direction to reuse threads and avoid the accidental thread creation that SimpleAsyncTaskExecutor would cause?

Producer config

@EnableAsync
@Configuration
public class KafkaProducerConfig {

    private static final Logger LOGGER = LoggerFactory.getLogger(KafkaProducerConfig.class);

    @Value("${kafka.brokers}")
    private String servers;

    @Bean
    public Executor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(2);
        executor.setMaxPoolSize(2);
        executor.setQueueCapacity(500);
        executor.setThreadNamePrefix("KafkaMsgExecutor-");
        executor.initialize();
        return executor;
    }

    @Bean
    public Map<String, Object> producerConfigs() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, servers);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, JsonSerializer.class); // a serializer, not JsonDeserializer, belongs on the producer side
        return props;
    }

}
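For completeness, the KafkaTemplate autowired into the Producer below is typically wired from these properties via a ProducerFactory bean in the same configuration class. A minimal sketch of that wiring (assuming spring-kafka defaults; these two beans are not shown in the original config):

```java
@Bean
public ProducerFactory<String, String> producerFactory() {
    // Builds producers from the properties defined in producerConfigs()
    return new DefaultKafkaProducerFactory<>(producerConfigs());
}

@Bean
public KafkaTemplate<String, String> kafkaTemplate() {
    // The template the Producer service autowires
    return new KafkaTemplate<>(producerFactory());
}
```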

Producer

@Service
public class Producer {

    private static final Logger LOGGER = LoggerFactory.getLogger(Producer.class);

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    @Async
    public void send(String topic, String message) {
        ListenableFuture<SendResult<String, String>> future = kafkaTemplate.send(topic, message);
        future.addCallback(new ListenableFutureCallback<SendResult<String, String>>() {

            @Override
            public void onSuccess(final SendResult<String, String> message) {
                LOGGER.info("sent message= " + message + " with offset= " + message.getRecordMetadata().offset());
            }

            @Override
            public void onFailure(final Throwable throwable) {
                LOGGER.error("unable to send message= " + message, throwable);
            }
        });
    }
}

for demo purposes:

@SpringBootApplication
public class KafkaDemoApplication  implements CommandLineRunner {

    public static void main(String[] args) {
        SpringApplication.run(KafkaDemoApplication.class, args);

    }

    @Autowired
    private Producer p;

    @Override
    public void run(String... strings) throws Exception {
        p.send("test", "any demo message");
    }

}
Jim C

1 Answer


This is the default implementation of SimpleAsyncTaskExecutor's doExecute():

protected void doExecute(Runnable task) {
    Thread thread = (this.threadFactory != null ? this.threadFactory.newThread(task) : createThread(task));
    thread.start();
}

A new thread is created for every task, and thread creation in Java is not cheap: (Reference)

Thread objects use a significant amount of memory, and in a large-scale application, allocating and deallocating many thread objects creates a significant memory management overhead.

=> Repeatedly executing tasks with this task executor will negatively affect application performance (moreover, this executor by default does not limit the number of concurrent tasks).

That's why you're advised to use a thread pool implementation: the thread-creation overhead is still there, but it is significantly reduced because threads are reused instead of created, fired, and forgotten.
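The contrast can be shown with the JDK's own java.util.concurrent classes (a plain-Java sketch, independent of Spring and Kafka): a fixed pool runs many tasks on a small set of reused threads, while the thread-per-task approach of doExecute() above creates one thread per submission.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolVsPerTask {
    public static void main(String[] args) throws InterruptedException {
        int tasks = 100;
        Set<String> poolThreads = ConcurrentHashMap.newKeySet();
        Set<String> perTaskThreads = ConcurrentHashMap.newKeySet();

        // Pooled: at most 2 threads are ever created, then reused for all 100 tasks.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> { poolThreads.add(Thread.currentThread().getName()); done.countDown(); });
        }
        done.await();
        pool.shutdown();

        // Thread-per-task (what SimpleAsyncTaskExecutor's doExecute does): 100 threads.
        CountDownLatch done2 = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            new Thread(() -> { perTaskThreads.add(Thread.currentThread().getName()); done2.countDown(); }).start();
        }
        done2.await();

        System.out.println("pooled distinct threads: " + poolThreads.size());      // 2
        System.out.println("per-task distinct threads: " + perTaskThreads.size()); // 100
    }
}
```

The pool reuses its two worker threads across all 100 submissions; the thread-per-task loop pays the allocation cost 100 times.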

When configuring ThreadPoolTaskExecutor, two notable parameters should be set properly according to your application's load:

  1. private int maxPoolSize = Integer.MAX_VALUE;

     This is the maximum number of threads in the pool.

  2. private int queueCapacity = Integer.MAX_VALUE;

     This is the maximum number of queued tasks. Keeping the default effectively makes the queue unbounded, so tasks piling up faster than they are drained can eventually exhaust memory (OutOfMemoryError).

Using the default values (Integer.MAX_VALUE) may exhaust resources or crash your server.
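What bounding buys you can be sketched with the JDK's ThreadPoolExecutor (which backs Spring's ThreadPoolTaskExecutor): once all maxPoolSize threads are busy and the queue is full, further submissions are rejected (the default AbortPolicy) instead of piling up in memory without limit.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedQueueDemo {
    public static void main(String[] args) {
        // 2 core / 2 max threads, queue of 3 -> capacity for 5 in-flight tasks total.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(3));

        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try { release.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        };

        int accepted = 0;
        int rejected = 0;
        for (int i = 0; i < 10; i++) {
            try {
                executor.execute(blocking);
                accepted++;
            } catch (RejectedExecutionException e) {
                rejected++; // pool saturated: 2 running + 3 queued
            }
        }
        System.out.println("accepted=" + accepted + " rejected=" + rejected); // accepted=5 rejected=5

        release.countDown();
        executor.shutdown();
    }
}
```

With an unbounded queue, the last 5 submissions would have been silently queued instead, and under sustained overload that queue is exactly where memory goes.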

You can improve throughput by increasing the maximum pool size with setMaxPoolSize(). To reduce warm-up when load increases, set the core pool size to a higher value with setCorePoolSize(); note that the extra threads between corePoolSize and maxPoolSize are only started once the queue is full.
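That interaction between corePoolSize, the queue, and maxPoolSize can be demonstrated with the underlying JDK executor: non-core threads are only started once the queue is full. A plain-Java sketch:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CoreVsMaxDemo {
    public static void main(String[] args) {
        // corePoolSize=1, maxPoolSize=3, bounded queue of capacity 2.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 3, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>(2));

        CountDownLatch release = new CountDownLatch(1);
        Runnable blocking = () -> {
            try { release.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        };

        executor.execute(blocking);  // task 1: starts the core thread and blocks it
        executor.execute(blocking);  // task 2: queued
        executor.execute(blocking);  // task 3: queued -> queue is now full
        int beforeSaturation = executor.getPoolSize();

        executor.execute(blocking);  // task 4: queue full -> a non-core thread is started
        executor.execute(blocking);  // task 5: queue still full -> another non-core thread
        int afterSaturation = executor.getPoolSize();

        System.out.println(beforeSaturation); // 1: only the core thread so far
        System.out.println(afterSaturation);  // 3: grown to maxPoolSize under load

        release.countDown();
        executor.shutdown();
    }
}
```

So a larger corePoolSize keeps threads ready up front, while maxPoolSize is only reached after the queue fills; sizing only maxPoolSize does not help until the queue is already saturated.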

Mạnh Quyết Nguyễn
  • Am I in the right direction with the "@Bean public Executor taskExecutor()..." above? Your comments about maxPoolSize and queueCapacity are very worthwhile. But do you spot any missing extra configuration running your eyes over the code? Do you see any weird concepts or conflicts in my statement above: "I am already using KafkaTemplate (i.e. synchronized, singleton and thread-safe by default, at least for producing/sending messages)"? In my company today we have 50 million customers using our mobile application, which consumes several microservices, and we expect 150 requests per second, each call needing to be logged – Jim C Feb 27 '20 at 14:12
  • The executor pool is not related to KafkaTemplate anyway. You are already using the async version of KafkaTemplate (non-blocking), so performance should be fast; 150 RPS is small. Look at this [result](https://stackoverflow.com/questions/50060086/spring-kafka-template-producer-performance) – Mạnh Quyết Nguyễn Feb 28 '20 at 04:49
  • Thanks for your feedback that 150 requests per second is small. Please just suggest another number you would consider high. Regarding "the executor pool is not related to KafkaTemplate anyway", I didn't get your point. KafkaTemplate definitely depends on an executor, doesn't it? I understand that KafkaTemplate's async version is fast, but Mạnh's comment makes sense to me: "allocating and deallocating many thread objects creates a significant memory management overhead... this executor by default does not limit the number of concurrent tasks". It seems you think I am going in the wrong direction. Can you be clearer? – Jim C Feb 28 '20 at 14:05
  • According to https://stackoverflow.com/a/48145863/4148175, "the implementation of the producer is async. Messages are stored in an internal queue to wait to be sent by an inner thread, which improves efficiency with potential batching". Isn't such an inner thread the one provided by the executor? If so, I believe having a pool will help in some negative scenarios (e.g. a slow connection between the microservice container and the Kafka container, an unexpected boom in messages, and so on). Am I right when I say "such an inner thread is exactly the one provided by the executor pool"? – Jim C Feb 28 '20 at 14:11