Questions tagged [exactly-once]

39 questions
1
vote
1 answer

Snowflake connector duplicate records

According to the docs: "Both Kafka and the Kafka connector are fault-tolerant. Messages are neither duplicated nor silently dropped. Messages are delivered exactly once, or an error message will be generated." Yet in SF we have 2 records that have the…
1
vote
1 answer

Can I rely on an in-memory Java collection in a Kafka Streams app for buffering events by fine-tuning punctuate and the commit interval?

A custom processor buffers events in a simple java.util.List in process() - this buffer is not a state store. Every 30 seconds of WALL_CLOCK_TIME, punctuate() sorts this list and flushes it to the sink. Assume a single-partition source and sink.…
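
For context, a minimal sketch of the kind of processor this question describes, using the Kafka Streams 3.x Processor API (class name, types and the 30-second interval are illustrative). Because the buffer is plain heap memory rather than a state store, whatever has not been flushed and committed at the moment of a crash is either lost or reprocessed, which is the trade-off the question is really about.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

import org.apache.kafka.streams.processor.PunctuationType;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;

// Buffers records in a plain in-memory list (not a state store) and flushes
// them, sorted by timestamp, every 30 seconds of wall-clock time.
public class BufferingProcessor implements Processor<String, String, String, String> {

    private final List<Record<String, String>> buffer = new ArrayList<>();

    @Override
    public void init(ProcessorContext<String, String> context) {
        // Wall-clock punctuation fires on a timer, whether or not records arrive.
        context.schedule(Duration.ofSeconds(30), PunctuationType.WALL_CLOCK_TIME, timestamp -> {
            buffer.sort(Comparator.comparingLong(Record::timestamp));
            buffer.forEach(context::forward);
            buffer.clear();
        });
    }

    @Override
    public void process(Record<String, String> record) {
        buffer.add(record); // held only on the heap until the next punctuation
    }
}
```
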
1
vote
1 answer

Exactly-once: who stores the historical data, Flink or the data source?

I know that Apache Flink can provide exactly-once semantics, which relies on the checkpoint mechanism and a replayable data source. As I understand it, if a Flink operator hits an error, it needs to run its last operation again,…
Yves
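
Flink itself does not keep the historical input data: checkpoints store operator state plus the source positions (for Kafka, the offsets), and on recovery the source is asked to replay from those positions. A minimal sketch of enabling that mechanism (interval and mode are illustrative):

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSetup {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Snapshot all operator state plus source positions every 60 s; on failure
        // Flink restores the latest snapshot and the source replays from there.
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        // ... define sources, operators and sinks, then:
        // env.execute("checkpointed-job");
    }
}
```
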
1
vote
0 answers

Testing Exactly-Once with Apache Beam + SparkRunner in a local environment

I am running the Beam pipeline below, which reads from a local Kafka INPUT_TOPIC and writes into another local Kafka OUTPUT_TOPIC. I created a publisher to feed INPUT_TOPIC (manually) and a consumer to check what I am getting on the OUTPUT_TOPIC, but…
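
For reference, a sketch of roughly this read-from-INPUT_TOPIC / write-to-OUTPUT_TOPIC pipeline using KafkaIO's exactly-once sink; the broker address and sink group id are placeholders, and whether withEOS() is actually honoured depends on the runner, which is what a local SparkRunner test is probing.

```java
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

// Reads committed records from INPUT_TOPIC and writes them to OUTPUT_TOPIC
// through KafkaIO's transactional exactly-once sink.
public class EosPipeline {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        p.apply(KafkaIO.<String, String>read()
                .withBootstrapServers("localhost:9092")
                .withTopic("INPUT_TOPIC")
                .withKeyDeserializer(StringDeserializer.class)
                .withValueDeserializer(StringDeserializer.class)
                .withReadCommitted()            // ignore records from aborted transactions
                .withoutMetadata())
         .apply(KafkaIO.<String, String>write()
                .withBootstrapServers("localhost:9092")
                .withTopic("OUTPUT_TOPIC")
                .withKeySerializer(StringSerializer.class)
                .withValueSerializer(StringSerializer.class)
                .withEOS(1, "eos-sink-group")); // exactly-once sink, if the runner supports it

        p.run().waitUntilFinish();
    }
}
```
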
1
vote
1 answer

Kafka - Losing messages even though the app is configured for exactly-once and highest durability

There are cases (very rare, but they happen) where I receive duplicates, even though everything is configured for high durability and we use an exactly-once configuration. Please check below the application context and test scenario that causes this…
Cristi
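
For comparison while reading this question, a sketch of the client-side settings usually meant by "exactly once and highest durability"; broker/topic settings such as replication.factor and min.insync.replicas are assumed, and the ids and addresses are placeholders.

```java
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.producer.ProducerConfig;

// Client-side knobs usually implied by "exactly once and highest durability".
public class EosClientConfig {

    public static Properties producerProps() {
        Properties p = new Properties();
        p.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        p.put(ProducerConfig.ACKS_CONFIG, "all");                   // wait for all in-sync replicas
        p.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");    // retries cannot create duplicates
        p.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "my-app-tx"); // enables transactions / zombie fencing
        return p;
    }

    public static Properties consumerProps() {
        Properties p = new Properties();
        p.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        p.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed"); // hide aborted transactions
        p.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");       // commit offsets with the transaction
        return p;
    }
}
```
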
0
votes
1 answer

Flink Kafka Connector with DeliveryGuarantee.EXACTLY_ONCE producing duplicate messages

My versions: flink.version 1.15.2, scala.binary.version 2.12, java.version 1.11. My code: public static void main(String[] args) throws Exception { StreamExecutionEnvironment env =…
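
A minimal sketch of an exactly-once KafkaSink as of Flink 1.15 (broker, topic and prefix are placeholders). Two things that commonly explain "duplicates" with this setup: checkpointing has to be enabled, and consumers of the output topic must read with isolation.level=read_committed, otherwise they also see records from transactions that were later aborted.

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;

public class ExactlyOnceKafkaSink {

    // Builds a transactional, exactly-once Kafka sink (Flink 1.15 connector API).
    public static KafkaSink<String> build() {
        return KafkaSink.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("output-topic")
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                // Requires checkpointing to be enabled; records only become visible
                // to read_committed consumers when a checkpoint completes.
                .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                .setTransactionalIdPrefix("my-app-tx") // must be unique per application
                .build();
    }
}
```
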
0
votes
0 answers

Guarantee exactly-once in Spark Structured Streaming foreachBatch

I need to guarantee an exactly-once series of writes in a foreachBatch. For example, I have: a stream with two writes to HBase and one to HDFS; two writes to HDFS in different folders. I want to write everything only when I'm sure that every operation will succeed,…
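
Spark does not provide an atomic transaction across multiple sinks inside foreachBatch; the usual approach is to make each write idempotent with respect to the batchId that foreachBatch hands you, so a replayed batch overwrites its own earlier attempt. A sketch under that assumption (paths, topic and broker are placeholders, and the HBase write is only hinted at in a comment):

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ForeachBatchIdempotent {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder().appName("foreachBatch-eos").getOrCreate();

        Dataset<Row> stream = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "input-topic")
                .load();

        stream.writeStream()
                .option("checkpointLocation", "hdfs:///checkpoints/foreachBatch-eos")
                .foreachBatch((Dataset<Row> batch, Long batchId) -> {
                    // Scope every sink's output to the batchId so a replayed batch
                    // overwrites its own earlier, possibly partial, attempt instead
                    // of appending duplicates. An HBase write would likewise need to
                    // be idempotent (e.g. deterministic row keys).
                    batch.write().mode("overwrite")
                         .parquet("hdfs:///out/sink-a/batchId=" + batchId);
                    batch.write().mode("overwrite")
                         .parquet("hdfs:///out/sink-b/batchId=" + batchId);
                })
                .start()
                .awaitTermination();
    }
}
```
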
0
votes
1 answer

Kafka Streams exactly_once_v2 produces duplicates after application restart

We have a Kafka Streams application with processing.guarantee=exactly_once_v2. Kafka version: 3.2.0, Kafka Streams version: 3.0.1, Confluent version 7.0.1. Other configurations necessary for exactly-once processing are also…
Ivet
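
For reference, the core of an exactly_once_v2 configuration (application id and bootstrap servers are placeholders). One frequent source of "duplicates after restart" reports is checking the output with a consumer left at the default isolation.level=read_uncommitted, which also returns records from transactions that were aborted during the crash.

```java
import java.util.Properties;

import org.apache.kafka.streams.StreamsConfig;

public class EosV2Config {

    public static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-eos-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Transactional processing; brokers 2.5+ are required, and replication.factor >= 3
        // with min.insync.replicas >= 2 is the usual production recommendation.
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
        return props;
    }
}
```
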
0
votes
1 answer

Does a Flink streaming job maintain its keyed value state between job runs?

Our use case is that we want to use Flink streaming for a de-duplicator job, which reads its data from a source (Kafka topic) and writes unique records into an HDFS file sink. The Kafka topic could have duplicate data, which can be identified by using a composite…
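
A minimal sketch of such a de-duplicator as a KeyedProcessFunction, assuming the stream has already been keyed by the composite key (names and types are illustrative). The ValueState is part of Flink's checkpointed keyed state, so it survives failover within a running job; surviving a deliberate stop and restart of the job additionally requires resuming from a savepoint or retained checkpoint.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Emits a record only the first time its key is seen; the keyed ValueState is
// included in checkpoints, so it is restored after a failover.
public class Deduplicator extends KeyedProcessFunction<String, String, String> {

    private transient ValueState<Boolean> seen;

    @Override
    public void open(Configuration parameters) {
        seen = getRuntimeContext().getState(
                new ValueStateDescriptor<>("seen", Boolean.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
        if (seen.value() == null) {   // first occurrence of this composite key
            seen.update(true);
            out.collect(value);       // only unique records reach the HDFS sink
        }
    }
}
```
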
0
votes
0 answers

How to deal with an exactly-once HTTP request

I am doing a POC to implement a system like this: a user has credit; a server will make an HTTP request on the user's behalf; when the response status returns 200, mark it as completed and subtract the user's credit. The HTTP request could be a very critical…
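
One common pattern for this kind of "charge only on success, and only once" requirement is an idempotency key recorded before the call. A sketch under that assumption: the ConcurrentHashMap stands in for a durable store, subtractCredit() is a hypothetical placeholder, and the Idempotency-Key header only helps if the remote service understands it.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class ChargeOnSuccess {

    enum Status { PENDING, COMPLETED }

    // Stand-in for a durable store (in practice a database table).
    private final Map<String, Status> attempts = new ConcurrentHashMap<>();
    private final HttpClient client = HttpClient.newHttpClient();

    public void callAndCharge(String userId, String url) throws Exception {
        String key = UUID.randomUUID().toString();
        attempts.put(key, Status.PENDING);            // record intent before calling

        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Idempotency-Key", key)       // lets a cooperating server dedupe retries
                .GET()
                .build();

        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        // The PENDING -> COMPLETED transition succeeds only once per key,
        // so the credit is deducted at most once for this attempt.
        if (response.statusCode() == 200
                && attempts.replace(key, Status.PENDING, Status.COMPLETED)) {
            subtractCredit(userId);
        }
    }

    private void subtractCredit(String userId) {
        // hypothetical placeholder: decrement the user's credit, ideally in the same
        // database transaction that marks the attempt COMPLETED
    }
}
```
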
0
votes
0 answers

Is Spark Structured Streaming exactly-once when partitioning by system time?

Let's say I have a Kafka topic without any duplicate messages. If I consumed this topic with Spark Structured Streaming, added a column with currentTime(), partitioned by this time column, and saved the records to S3, would there be a risk of…
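
Spelling the described pipeline out makes the concern visible: current_timestamp() is a processing-time value, so a micro-batch that is retried after a failure may evaluate it to a different value than the first attempt did, and that interaction with the partitioned S3 output is what the question is probing. A sketch of the setup with placeholder topic, bucket and checkpoint paths:

```java
import static org.apache.spark.sql.functions.current_timestamp;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class PartitionBySystemTime {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder().appName("partition-by-system-time").getOrCreate();

        Dataset<Row> input = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "clean-topic")
                .load()
                // processing-time column: re-evaluated if the batch is re-attempted
                .withColumn("ingest_time", current_timestamp());

        input.writeStream()
                .format("parquet")
                .option("path", "s3a://my-bucket/events/")
                .option("checkpointLocation", "s3a://my-bucket/checkpoints/events/")
                .partitionBy("ingest_time")
                .start()
                .awaitTermination();
    }
}
```
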
0
votes
0 answers

How to implement exactly-once on Apache Kafka using Python?

I am trying to implement an exactly-once setup using the confluent_kafka library, but I couldn't succeed. Here is my producer code: from confluent_kafka import Producer, SerializingProducer import socket, json, random, time conf =…
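
The producer side of exactly-once boils down to the transactional flow below. The question uses the Python confluent_kafka client, which exposes the same steps as init_transactions() / begin_transaction() / produce() / commit_transaction(); the sketch is in Java to match the other examples here, and the ids and topic are placeholders.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

// Minimal transactional produce: records only become visible to read_committed
// consumers once the transaction is committed.
public class TransactionalProduce {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "demo-tx-id"); // stable id per producer instance

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();     // fences older instances with the same id
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("demo-topic", "key", "value"));
            producer.commitTransaction();
        }
    }
}
```
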
0
votes
1 answer

Invalid transition attempted from state COMMITTING_TRANSACTION to state ABORTING_TRANSACTION in Producer

I am trying to achieve exactly-once functionality but am getting a KafkaException with the message "org.apache.kafka.common.KafkaException: TransactionalId db13196c-6974-48b0-9835-aed40cec4ca4: Invalid transition attempted from state…
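
This error generally means abortTransaction() was invoked while the transaction was already in the middle of committing, for example from a catch-all or finally block that always aborts. For reference, the exception-handling structure recommended in the KafkaProducer Javadoc, with placeholder id, topic and serializers:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.AuthorizationException;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;
import org.apache.kafka.common.errors.ProducerFencedException;

public class SafeTransactionLoop {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "demo-tx-id");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.initTransactions();
        try {
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("demo-topic", "key", "value"));
            producer.commitTransaction();
        } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
            producer.close();            // fatal: this producer instance cannot continue
        } catch (KafkaException e) {
            producer.abortTransaction(); // recoverable: abort and start a new transaction
        }
        producer.close();
    }
}
```
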
0
votes
1 answer

Kafka EOS retry flag

I have a Kafka cluster and a Spring Boot application that is configured for EOS. The application consumes from topic A, performs some business logic, then produces to topic B. The issue I am facing is that if EOS fails to write to topic B, it retries and all…
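
A sketch of the consume-process-produce listener such an application usually contains, with the EOS-related Spring Boot properties it assumes listed in the comment (topic names, group id and the business logic are placeholders). With container-managed Kafka transactions, a failed attempt is aborted and redelivered, and its output on topic B is only visible to consumers reading with read_uncommitted.

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Component;

// Assumed (not shown) application properties:
//   spring.kafka.producer.transaction-id-prefix=tx-       -> transactional producer
//   spring.kafka.consumer.isolation-level=read_committed  -> hide aborted attempts
//   spring.kafka.consumer.enable-auto-commit=false
// With a Kafka transaction manager on the listener container, the send to topic B
// and the consumer offsets for topic A are committed in one transaction.
@Component
public class TopicAtoBProcessor {

    private final KafkaTemplate<String, String> kafkaTemplate;

    public TopicAtoBProcessor(KafkaTemplate<String, String> kafkaTemplate) {
        this.kafkaTemplate = kafkaTemplate;
    }

    @KafkaListener(topics = "topic-A", groupId = "eos-app")
    public void onMessage(String value) {
        String transformed = value.toUpperCase();   // stand-in for the business logic
        kafkaTemplate.send("topic-B", transformed); // participates in the container's transaction
    }
}
```
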
0
votes
1 answer

ksqlDB Exactly-Once Processing Guarantee

I was testing the exactly-once semantics on a ksqlDB server by very ungracefully shutting down the running Docker process or letting the Docker container run out of memory. In both cases I receive duplicates, which is definitely not the guaranteed…
Nikki
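
One thing worth ruling out when duplicates appear only after a hard kill: ksqlDB's exactly-once processing is built on Kafka Streams and Kafka transactions, so a verification consumer left at the default isolation.level=read_uncommitted will also return records from transactions that the crash aborted. A sketch of a read_committed checker (broker, group and topic names are placeholders):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Verification consumer that only sees records from committed transactions, so
// output aborted during an ungraceful shutdown does not show up as duplicates.
public class ReadCommittedChecker {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "eos-checker");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.ISOLATION_LEVEL_CONFIG, "read_committed");
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("ksql-output-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("%s -> %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```
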