
I am trying to write a simple test for an abstraction over the Kafka Scala client in Kafka 0.8.2. It basically just writes a message to Kafka, and I then try to read it back. However, I had problems with it failing intermittently, so I boiled the test code down to the code below. This test sometimes (rarely) passes and sometimes fails. What am I doing wrong?

package mykafkatest

import java.net.ServerSocket
import java.nio.file.Files
import java.util.{UUID, Properties}

import kafka.consumer.{Whitelist, ConsumerConfig, Consumer}
import kafka.producer.{ProducerConfig, Producer, KeyedMessage}
import kafka.serializer.StringDecoder
import kafka.server.KafkaConfig
import kafka.server.KafkaServerStartable
import org.apache.curator.test.TestingServer

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

class KafkaSenderTest extends org.scalatest.FunSpecLike with org.scalatest.ShouldMatchers with org.scalatest.BeforeAndAfterAll {

  import scala.concurrent.ExecutionContext.Implicits.global
  val zkServer = new TestingServer()

  // Grab a free broker port by binding an ephemeral socket and releasing it.
  val socket = new ServerSocket(0)
  val port = socket.getLocalPort.toString
  socket.close()
  // Throwaway directory for the broker's log segments.
  val tmpDir = Files.createTempDirectory("kafka-test-logs")

  val serverProps = new Properties
  serverProps.put("broker.id", port)
  serverProps.put("log.dirs", tmpDir.toAbsolutePath.toString)
  serverProps.put("host.name", "localhost")
  serverProps.put("zookeeper.connect", zkServer.getConnectString)
  serverProps.put("port", port)

  val config = new KafkaConfig(serverProps)
  val kafkaServer = new KafkaServerStartable(config)

  override def beforeAll ={
    kafkaServer.startup()
  }

  override def afterAll = {
    kafkaServer.shutdown()
  }

  it("should put messages on a kafka queue") {
    println("zkServer: " + zkServer.getConnectString)
    println("broker port: " + port)

    val consumerProps = new Properties()
    consumerProps.put("group.id", UUID.randomUUID().toString)
    consumerProps.put("zookeeper.connect", zkServer.getConnectString)

    val consumerConnector = Consumer.create(new ConsumerConfig(consumerProps))
    val topic = "some-topic"
    val filterSpec = new Whitelist(topic)
    val stream = consumerConnector.createMessageStreamsByFilter(filterSpec, 1, new StringDecoder, new StringDecoder).head

    val producerProps = new Properties()
    producerProps.put("metadata.broker.list","localhost:"+port)

    val sender = new Producer[Array[Byte], Array[Byte]](new ProducerConfig(producerProps))
    val keyedMessage = new KeyedMessage[Array[Byte], Array[Byte]](topic, "awesome message".getBytes("UTF-8"))
    sender.send(keyedMessage)

    // Consume one message, failing the test if nothing arrives within 5 seconds.
    val msg = Await.result(Future { stream.take(1) }, 5 seconds)
    msg.headOption should not be(empty)

  }
}

EDIT: I have created a new project with the following build.sbt and the above code as a test class.

name := "mykafkatest"

version := "1.0"

scalaVersion := "2.11.5"


libraryDependencies ++= Seq(
  "org.apache.kafka" %% "kafka" % "0.8.2.0",

  "org.scalatest" %% "scalatest" % "2.2.2" % "test",
  "org.apache.curator" % "curator-test" % "2.7.0" % "test"
)

And the test seems to pass more often, but it still fails intermittently...

Emil L
  • What's the error message? 1. Is it possible `val stream` is using String while the producer is using Array[Byte]? 2. Does it help if you put `Thread.sleep(5)` after `sender.send`? – digit plumber Jun 24 '15 at 06:50
  • @cppinitiator When there is an error it is from `Await.result(Future { stream.take(1) }, 5 seconds)` giving a `TimeoutException`. So sometimes this test passes (and it does so quickly), but most often it fails... Adding the Thread.sleep(5) after send doesn't help. If it was an error due to String/Array[Byte] I don't think the test would ever pass. – Emil L Jun 24 '15 at 20:08
  • It seems you are using StringDecoder, but sender takes Array[Byte]. How about changing both `val sender` and `val keyedMessage` to [String, String]? – digit plumber Jun 25 '15 at 05:08
  • @cppinitiator I tried it, but it doesn't help. Besides, I imagine I would get a consistent error if this were due to not being able to decode the message. The fact that it sometimes succeeds and sometimes fails for the same `"awesome message"` indicates to me that it is some kind of race condition. I just don't know how to isolate it... – Emil L Jun 25 '15 at 21:42

2 Answers


You may have a race condition: the consumer can finish its initialization after the message has already been sent, and then ignore that message, since it starts at the largest offset by default.

Try adding

consumerProps.put("auto.offset.reset", "smallest")

to your consumer properties.
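
For reference, here is a minimal sketch of the consumer setup from the question with this fix in place (everything except the added line is unchanged from the test above):

val consumerProps = new Properties()
consumerProps.put("group.id", UUID.randomUUID().toString)
consumerProps.put("zookeeper.connect", zkServer.getConnectString)
// With a fresh group id there is never a committed offset, so this setting
// decides where consumption starts: "smallest" begins at the earliest
// available offset, so a message produced before the consumer finishes
// initializing is still delivered.
consumerProps.put("auto.offset.reset", "smallest")
val consumerConnector = Consumer.create(new ConsumerConfig(consumerProps))

Because the test generates a new random group id on every run, `auto.offset.reset` applies every time, which is why the fix is reliable rather than a one-off.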

C4stor
  • This seems to work. For my benefit: if I add a `Thread.sleep(XXX)` after the creation of the `stream`, would that have the same result? I tried that, but it seemed to cause the test to fail consistently – Emil L Jul 21 '15 at 22:47
  • It would have the same result with a random probability (increasing with the length of the sleep), which is not as good. Also, sleeping in a test increases the time needed to run it. What you can do, if you really don't want to use the smallest offset, is send "warmup" messages until you successfully receive one, and then send a fixed batch of message(s) on which you do your actual testing (see the sketch after these comments). But it seems a lot more complicated ^^ – C4stor Jul 22 '15 at 07:37
  • It is `earliest`, not `smallest`: `consumerProps.put("auto.offset.reset", "earliest")` – EmeraldTablet Jun 29 '18 at 16:28
  • Hi! The question is for Kafka 0.8.2.0, for which "smallest" is the correct answer. It was deprecated in 0.9.0+, I believe :-) – C4stor Jun 29 '18 at 18:46
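
For illustration, here is a rough sketch of the warmup idea from the comments above, reusing the `stream`, `sender`, and `topic` from the question. The `consumer.timeout.ms` setting and the loop shape are assumptions, not a tested recipe; without that setting, `next()` blocks indefinitely instead of throwing.

// Assumes the consumer was created with e.g.
// consumerProps.put("consumer.timeout.ms", "500")
// so that next() throws ConsumerTimeoutException rather than blocking forever.
val warmup = new KeyedMessage[Array[Byte], Array[Byte]](topic, "warmup".getBytes("UTF-8"))
val it = stream.iterator()
var connected = false
while (!connected) {
  sender.send(warmup)
  try { it.next(); connected = true }                          // got a message: consumer is attached
  catch { case _: kafka.consumer.ConsumerTimeoutException => } // not attached yet, retry
}
// The consumer is now demonstrably receiving, so send the real test message.
sender.send(keyedMessage)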

I think this is some sort of message-buffering issue. If you send 200 messages, this works (for me):

(1 to 200).foreach(i => sender.send(keyedMessage))

Sending 199 messages fails. I tried changing configs around but couldn't find any magic to make a single message work, though I'm sure there is some set of configs that would.
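
If client-side buffering is indeed the culprit, one hedged guess (not verified against this test) is to pin the old producer's delivery mode explicitly. In 0.8.2 the relevant keys are `producer.type` and, for async mode, `batch.num.messages`, whose default of 200 lines up suspiciously well with the threshold above:

// Hypothetical tweak: force synchronous sends so nothing lingers in a client-side batch.
producerProps.put("producer.type", "sync")
// Or, if staying async, flush after every message instead of every 200:
// producerProps.put("batch.num.messages", "1")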

Noah