
I am trying to set up a Kafka cluster on the OpenShift platform using this guide: https://developers.redhat.com/blog/2018/10/29/how-to-run-kafka-on-openshift-the-enterprise-kubernetes-with-amq-streams/

I have my ZooKeeper and Kafka clusters running (the pods are all up), and when running my application I set bootstrap-servers to the route of the my-cluster-kafka-external bootstrap. But when I try to send a message to Kafka I get this error:

21:32:40.548 [http-nio-8080-exec-1] ERROR o.s.k.s.LoggingProducerListener () - Exception thrown when sending a message with key='key' and payload='Event(id=null, number=30446C77213B40000004tgst15, itemId=, serialNumber=0,  locat...' to topic tag-topic:
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.

The topic was created successfully, and the application runs fine against a local Kafka on my computer. So what am I doing wrong? Why can't I reach Kafka and send messages?

Here is my Kafka producer config in spring-kafka:

    @Value("${kafka.bootstrap-servers}")
    private String bootstrapServers;    

    @Bean
    public Map<String, Object> producerConfigs() {
        Map<String, Object> props = new HashMap<>();

        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "........kafka.EventSerializer");
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);

        return props;
    }


    @Bean
    public ProducerFactory<String, Event> producerFactory() {
        return new DefaultKafkaProducerFactory<>(producerConfigs());
    }

EDIT: I set the logging level to debug and found this:

23:59:27.412 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] DEBUG o.a.k.c.NetworkClient () - [Consumer clientId=consumer-1, groupId=id] Initialize connection to node my-cluster-kafka-bootstrap-kafka-test............... (id: -1 rack: null) for sending metadata request
23:59:27.412 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] DEBUG o.a.k.c.NetworkClient () - [Consumer clientId=consumer-1, groupId=id] Initiating connection to node my-cluster-kafka-bootstrap-kafka-test............ (id: -1 rack: null)
23:59:28.010 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] DEBUG o.a.k.c.n.Selector () - [Consumer clientId=consumer-1, groupId=id] Created socket with SO_RCVBUF = 65536, SO_SNDBUF = 131072, SO_TIMEOUT = 0 to node -1
23:59:28.010 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] DEBUG o.a.k.c.NetworkClient () - [Consumer clientId=consumer-1, groupId=id] Completed connection to node -1. Fetching API versions.
23:59:28.010 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] DEBUG o.a.k.c.NetworkClient () - [Consumer clientId=consumer-1, groupId=id] Initiating API versions fetch from node -1.
23:59:28.510 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] DEBUG o.a.k.c.n.Selector () - [Consumer clientId=consumer-1, groupId=id] Connection with my-cluster-kafka-bootstrap-kafka-test........../52.215.40.40 disconnected
java.io.EOFException: null
    at org.apache.kafka.common.network.NetworkReceive.readFromReadableChannel(NetworkReceive.java:124) ~[kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:93) ~[kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:235) ~[kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:196) ~[kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:547) ~[kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:483) [kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.common.network.Selector.poll(Selector.java:412) [kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:460) [kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:258) [kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:230) [kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:221) [kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:153) [kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:228) [kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:205) [kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:284) [kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1146) [kafka-clients-1.0.2.jar:?]
    at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1111) [kafka-clients-1.0.2.jar:?]
    at org.springframework.kafka.listener.KafkaMessageListenerContainer$ListenerConsumer.run(KafkaMessageListenerContainer.java:700) [spring-kafka-2.1.10.RELEASE.jar:2.1.10.RELEASE]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514) [?:?]
    at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264) [?:?]
    at java.util.concurrent.FutureTask.run(FutureTask.java) [?:?]
    at java.lang.Thread.run(Thread.java:844) [?:?]
23:59:28.510 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] DEBUG o.a.k.c.NetworkClient () - [Consumer clientId=consumer-1, groupId=id] Node -1 disconnected.
23:59:28.510 [org.springframework.kafka.KafkaListenerEndpointContainer#0-0-C-1] DEBUG o.a.k.c.NetworkClient () - [Consumer clientId=consumer-1, groupId=id] Give up sending metadata request since no node is available

Could this have something to do with the connections.max.idle.ms property of the broker? Someone here had a similar problem.

I tried using kafka-console-producer by running this command:

    bin\windows\kafka-console-producer --broker-list https://my-cluster-kafka-bootstrap-kafka-test.domain.com:443 --topic tag-topic --producer.config config/producer.properties

and with this configuration in the producer.properties:

    compression.type=none
    security.protocol=SSL
    ssl.truststore.location=C:\\Tools\\kafka_2.12-2.2.0\\config\\store.jks
    ssl.truststore.password=password
    ssl.keystore.location=C:\\Tools\\kafka_2.12-2.2.0\\config\\store.jks
    ssl.keystore.password=password
    ssl.key.password=password

but I get a response saying that the connection was terminated while authenticating:

[2019-05-21 16:15:58,444] WARN [Producer clientId=console-producer] Connection to node 1 (my-cluster-kafka-1-kafka-test.domain.com/52.xxx.xx.40:443) terminated during authentication. This may happen due to any of the following reasons: (1) Authentication failed due to invalid credentials with brokers older than 1.0.0, (2) Firewall blocking Kafka TLS traffic (eg it may only allow HTTPS traffic), (3) Transient network issue. (org.apache.kafka.clients.NetworkClient)

Could the certificate generated by OpenShift somehow be wrong?

Matt

1 Answer


Access through routes is possible only via TLS, using the CA certificate generated by Strimzi, which you have to extract as described in the article. Then you have to create a trust store, import that certificate into it, and provide it to the client application. I don't see such configuration in your producer.
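
For reference, a minimal sketch of what that extraction and trust store creation could look like, assuming the Kafka cluster is named my-cluster (matching the route names above) so the Strimzi-generated secret is my-cluster-cluster-ca-cert; the alias, file names and password below are placeholders:

    # Extract the cluster CA certificate generated by Strimzi
    # (secret name assumes the Kafka cluster is called "my-cluster")
    oc extract secret/my-cluster-cluster-ca-cert --keys=ca.crt --to=- > ca.crt

    # Import it into a JKS truststore for the client (alias and password are placeholders)
    keytool -import -trustcacerts -alias strimzi-cluster-ca -file ca.crt -keystore truststore.jks -storepass password -noprompt

The resulting truststore.jks is what the client then references via ssl.truststore.location and ssl.truststore.password, together with security.protocol=SSL.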

ppatierno
  • Thank you for the answer. I generated the keystore.jks and added the following properties: `props.put("security.protocol", "SSL"); props.put("ssl.keystore.location", "app/src/main/resources/keystore.jks"); props.put("ssl.keystore.password", "password"); props.put("ssl.truststore.location", "app/src/main/resources/keystore.jks"); props.put("ssl.truststore.password", "password");` And now I get `Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target` Any ideas on this? – Matt May 14 '19 at 20:42
  • It sounds like it is not able to find the certificate in the truststore, or the truststore itself. Can you try with an absolute path to the truststore, just for testing? – ppatierno May 14 '19 at 20:54
  • Yes, the same error occurs with the absolute path given to the truststore – Matt May 14 '19 at 21:29
  • I don't think it's a Strimzi-related problem, but rather some missing configuration on the client side. You should try using one of the Kafka tools (kafka-console-consumer/producer), setting them up with SSL access from your laptop, to check that it works or to confirm that the problem is elsewhere. We use routes every day with no problems and we know a lot of developers doing the same. – ppatierno May 15 '19 at 09:39
  • Okay, so just for the sake of it I tried creating a completely new blank project with spring-kafka and one REST controller to try and connect to the route (with Strimzi). I get a `java.io.EOFException: EOF during handshake, handshake status is NEED_UNWRAP` exception. Here is the code of the controller and part of the logs under it: https://pastebin.com/K7CY43YC Is there anything that comes to mind that still needs adjusting for this to work? – Matt May 16 '19 at 12:33
  • Hi Matt, I've never heard of this kind of problem. As I mentioned, we have to be sure that everything is working fine in terms of certificates using the Kafka tools. Let's do that first and then we can dig into what the Spring problem could be. – ppatierno May 20 '19 at 07:40
  • Hi, thanks for still trying to help me. So I tried using the console-producer to produce messages to Kafka using SSL, and I get a response that tells me the connection was terminated during authentication. Since the message showing the command, config and response would be too long, I edited my original question and added what happened. Do you think that OpenShift in some way generates a bad certificate? It really beats me what could possibly be wrong here. – Matt May 21 '19 at 14:25