
I'm working on a NestJS project (hybrid application) and we use KafkaJS to exchange data between our microservices. In some cases the order in which that data is sent matters, because the receiving service cannot process the second message without the first one. Our production order object has a property which is an array of objects (orderedArray), correctly ordered by one of its properties (count). It looks like this:

{
    "productionOrderId": '...',
    "property1": '...',
    "property2": '...',
    "orderedArray": [
        {
            "id": '...',
            "timestamp": '...',
            "count": 1
        },
        {
            "id": '...',
            "timestamp": '...',
            "count": 2
        },
        {
            "id": '...',
            "timestamp": '...',
            "count": 3
        }
    ]
}

In one specific feature, we receive a production order creation request (REST) from our frontend and, after some validation, save it to our database. The last step is to send the production order to another service, which we do using Kafka. Here is the real issue: we are saving it correctly to our database, and by debugging the application I found that, right up until we call the KafkaJS method to send messages, the array is still in the numeric order we saved, so I'm 100% sure we are sending it in the correct order. When we receive that order in the other services, it's in a different order; to be specific, the last two items of the array have swapped positions, like this (note the count field):

    "orderedArray": [
        {
            "id": '...',
            "timestamp": '...',
            "count": 1
        },
        {
            "id": '...',
            "timestamp": '...',
            "count": 3
        },
        {
            "id": '...',
            "timestamp": '...',
            "count": 2
        }
    ]

That happens with any number of records, but only with the last two items of the array, so:

  • 1 2 3 becomes 1 3 2
  • 1 2 becomes 2 1
  • 1 2 3 4 5 6 7 becomes 1 2 3 4 5 7 6
  • 1 2 3 ... 99 100 becomes 1 2 3 ... 100 99
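As a sanity check on the serialization step (outside the project, a minimal demonstration): `JSON.stringify`/`JSON.parse` preserve array order, so the swap cannot be introduced by building the payload itself; it has to happen somewhere after the message is handed to the client.

```typescript
// JSON serialization preserves array order, so stringifying the payload
// before handing it to KafkaJS cannot reorder orderedArray.
const payload = {
  orderedArray: [{ count: 1 }, { count: 2 }, { count: 3 }],
};

const roundTripped = JSON.parse(JSON.stringify(payload));
const counts = roundTripped.orderedArray.map((e: { count: number }) => e.count);
// counts is [1, 2, 3] -- same order as built
```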

This is how we build our message (don't mind the typing; I'm going to refactor this once I figure this problem out):

    // KafkaService
    this.kafkaClient.emit(message.topic, message.messages);

    // OrderService
    for (const order of productionOrders) {
      this.sendMessage(
        kafkaUtilities.buildOrderedMessage(
          'my-topic-here', Array.of(order),
          'my-key-here')
      );
    }

  // KafkaUtilities
  private createMessage(topic: string, data: Array<unknown>, key?: string): ProducerRecord {
    const content = {
      value: JSON.stringify(data),
    } as any

    if (key) {
      this.timestamp += 1000

      content.key = key
      content.timestamp = this.timestamp.toString()
    }

    return { topic, messages: content }
  }

I'm adding one second to the timestamp for each record we need to send. Right now, the only thing that makes ordering work is adding a "sleep" function inside my for loop; it just needs to be there, and it works even with 0.5 ms:

    function sleep(ms: number) {
      return new Promise((resolve) => {
        setTimeout(resolve, ms);
      });
    }
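One hypothesis worth testing (an assumption, not something the question confirms): the loop above never awaits the send, so successive produce requests can be in flight concurrently, and requests that complete out of order would explain both the reordering and why even a 0.5 ms sleep "fixes" it. The simulation below (hypothetical `send`/`sendSequentially` names, plain promises, no Kafka involved) shows the difference between fire-and-forget sends and awaited sends:

```typescript
// Simulation: why awaiting each send restores ordering.
// Each "send" resolves after a variable delay, like a network round trip.
function send(count: number, arrived: number[]): Promise<void> {
  const delay = count % 2 === 0 ? 1 : 5; // deterministic, uneven latencies
  return new Promise((resolve) =>
    setTimeout(() => {
      arrived.push(count);
      resolve();
    }, delay),
  );
}

// Fire-and-forget: all sends are in flight at once; faster ones arrive first.
async function sendConcurrently(counts: number[]): Promise<number[]> {
  const arrived: number[] = [];
  await Promise.all(counts.map((c) => send(c, arrived)));
  return arrived;
}

// Awaited: the next send only starts after the previous one completes --
// which is the effect the 0.5 ms sleep approximates.
async function sendSequentially(counts: number[]): Promise<number[]> {
  const arrived: number[] = [];
  for (const c of counts) {
    await send(c, arrived);
  }
  return arrived;
}
```

If this is the cause, awaiting the result of each send in the loop (for example by converting the Observable returned by `emit` to a promise and awaiting it) would be a less fragile fix than the sleep.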

Our Kafka configs:

   options: {
      client: {
        brokers: ['my-broker'],
        connectionTimeout: 4000,
        logLevel: logLevel[env.KAFKA_ENABLE_LOG ? 'DEBUG' : 'NOTHING'],
        sasl: getSasl(enableSecurity),
        ssl: enableSecurity,
        requestTimeout: 90000,
      },
      consumer: {
        groupId: 'my-id',
        heartbeatInterval: 3000,
        metadataMaxAge: 180000,
        sessionTimeout: 60000,
        retry: {
          initialRetryTime: 30000,
          retries: 578,
          multiplier: 2,
          maxRetryTime: 300000,
          factor: 0,
        },
      },
      producer: {
        metadataMaxAge: 180000,
      },
    },
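For what it's worth, KafkaJS documents producer options aimed at strict per-partition ordering: an idempotent producer combined with a single in-flight request. In the options above that would sit next to `metadataMaxAge` (a sketch only; the question notes that `maxInFlightRequests: 1` on its own was already tried without success):

```typescript
// Sketch: KafkaJS producer options for strict ordering.
// idempotent: true forces acks=-1 and retries that cannot reorder writes;
// maxInFlightRequests: 1 allows only one outstanding request at a time.
producer: {
  metadataMaxAge: 180000,
  idempotent: true,
  maxInFlightRequests: 1,
},
```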

We've spent some time trying to figure out what is causing the issue, but we're not certain yet. Here's what I've tried so far (some of it might not make much sense without context):

  • Setting maxInFlightRequests to 1
  • Setting key and partition to messages (tried the same for every message and a sequential one for each message)
  • Creating a new repository and a new topic (still had the same issue)
  • Trying using the same Kafka instance we use on our cloud environment

What I'm about to try (and will update this question asap):

  • Downgrading KafkaJS
  • Testing with a pure NodeJS project

Our project information:

  • NestJS version: 8.2.6
  • KafkaJS version: 1.15.0

On our cloud environment we use Azure Event Hubs as a Kafka provider, but it also happens locally with Kafka.

My guess is that this is a KafkaJS-related issue, but at this point I'm only guessing.

Lucas

1 Answer


Message ordering in Kafka is guaranteed only within a partition: a consumer will consume messages in the same order as they were produced within the same partition (conversely, if two messages are produced to two different partitions, ordering is not guaranteed).

So the first thing to check is the number of partitions in your Kafka topic. If you have more than one, and the partition key is not the same for every message, then it's normal for messages to arrive out of order when consumed.

If ordering is a hard requirement in your context, then you could:

  1. Reduce the number of partitions to 1. But that's probably a bad idea here, since you would lose scalability.
  2. Or, if your business requirement is only to order messages within the same production order, use the productionOrderId as the partition key when producing messages (from your code, I'm assuming you are currently using the message id as the partition key, but I'm not a KafkaJS expert). That way, all messages that belong to the same production order go to the same partition and get consumed in order.
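Suggestion 2 could look like the sketch below (hypothetical helper, loosely mirroring the question's KafkaUtilities; the `ProducerRecord` interface is a minimal local stand-in for the type normally imported from `kafkajs`):

```typescript
// Minimal local shape of a KafkaJS ProducerRecord (normally imported from 'kafkajs').
interface ProducerRecord {
  topic: string;
  messages: { key?: string; value: string }[];
}

// Key each message by productionOrderId so Kafka's default partitioner routes
// every message for the same production order to the same partition.
function buildOrderedMessage(
  topic: string,
  order: { productionOrderId: string },
): ProducerRecord {
  return {
    topic,
    messages: [
      {
        key: order.productionOrderId, // same key -> same partition -> ordered delivery
        value: JSON.stringify([order]),
      },
    ],
  };
}
```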
Alex
  • As I mentioned in my question, I've tried sending all messages with the same key, as well as with the same key AND with the same and different partitions. Locally I'm running one consumer, so for testing purposes I'm always sending to partition 0. I'm aware of partitions and keys; I'm fairly convinced this is an issue in KafkaJS or NestJS, though I'm not sure. – Lucas Jul 25 '22 at 20:51