Questions tagged [exactly-once]
39 questions
1
vote
1 answer
Snowflake connector duplicate records
According to the docs:
Both Kafka and the Kafka connector are fault-tolerant.
Messages are neither duplicated nor silently dropped.
Messages are delivered exactly once, or an error message will be generated
We have in SF 2 records that have the…

Jonathan David
- 95
- 8
1
vote
1 answer
Can I rely on an in-memory Java collection in a Kafka stream for buffering events by fine-tuning punctuate and the commit interval?
A custom processor buffers events in a simple java.util.List in process() - this buffer is not a state store.
Every 30 seconds (WALL_CLOCK_TIME), punctuate() sorts this list and flushes it to the sink. Assume a single-partition source and sink.…

Vinodhini Chockalingam
- 306
- 2
- 17
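A minimal sketch of the pattern this question describes, using the Kafka Streams Processor API (org.apache.kafka.streams.processor.api); the class name, types, and the 30-second interval are illustrative, and the buffer is deliberately a plain List rather than a state store:

import org.apache.kafka.streams.processor.PunctuationType;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;

import java.time.Duration;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class BufferingProcessor implements Processor<String, String, String, String> {

    private final List<Record<String, String>> buffer = new ArrayList<>();

    @Override
    public void init(ProcessorContext<String, String> context) {
        // Sort and flush the in-memory buffer every 30 seconds of wall-clock time.
        context.schedule(Duration.ofSeconds(30), PunctuationType.WALL_CLOCK_TIME, ts -> {
            buffer.sort(Comparator.comparingLong(Record::timestamp));
            buffer.forEach(r -> context.forward(r));
            buffer.clear();
        });
    }

    @Override
    public void process(Record<String, String> record) {
        // Not a state store: anything buffered here is lost on a crash or rebalance.
        buffer.add(record);
    }
}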
1
vote
1 answer
Exactly-once: who stores the historical data, Flink or the data source?
I know that Apache Flink has exactly-once capability, which relies on the checkpoint mechanism and a re-sendable data source.
As I understand it, if an operator in Flink hits an error, it needs to run its last operation again,…

Yves
- 11,597
- 17
- 83
- 180
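A minimal sketch of the checkpoint-based setup this question is about, using Flink's DataStream API; the interval is illustrative. A checkpoint stores operator state plus source offsets, so on recovery the re-sendable source (e.g. Kafka) replays the data after the last snapshot rather than Flink keeping the history itself:

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointedJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Snapshot operator state and source offsets every 10 seconds; on failure the
        // job restarts from the latest snapshot and the source replays from its offsets.
        env.enableCheckpointing(10_000, CheckpointingMode.EXACTLY_ONCE);

        env.fromElements(1, 2, 3).print();
        env.execute("exactly-once checkpointing sketch");
    }
}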
1
vote
0 answers
Testing Exactly-Once with Apache Beam + SparkRunner in local environment
I am running the Beam pipeline below, which reads from a local Kafka INPUT_TOPIC and writes into another local Kafka OUTPUT_TOPIC. I created a publisher to feed INPUT_TOPIC (manually) and a consumer to check what I am getting on the OUTPUT_TOPIC, but…

Praveen Viswanathan
- 31
- 2
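A minimal sketch of a read-from-Kafka / write-to-Kafka Beam pipeline of the kind described, assuming Beam's KafkaIO; the bootstrap servers and the withEOS settings are illustrative, and whether the exactly-once sink mode is honored depends on the runner:

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class KafkaToKafka {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

        p.apply(KafkaIO.<String, String>read()
                .withBootstrapServers("localhost:9092")
                .withTopic("INPUT_TOPIC")
                .withKeyDeserializer(StringDeserializer.class)
                .withValueDeserializer(StringDeserializer.class)
                .withoutMetadata())
         .apply(KafkaIO.<String, String>write()
                .withBootstrapServers("localhost:9092")
                .withTopic("OUTPUT_TOPIC")
                .withKeySerializer(StringSerializer.class)
                .withValueSerializer(StringSerializer.class)
                .withEOS(1, "eos-sink-group"));   // KafkaIO's exactly-once sink mode

        p.run().waitUntilFinish();
    }
}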
1
vote
1 answer
Kafka - Losing messages even if the app is configured for exactly-once and highest durability
There are cases (very rare, but they happen) when I receive duplicates, even though everything is configured for high durability and we use the exactly-once configuration.
Please check below the application context and the test scenario that causes this…

Cristi
- 180
- 15
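For reference, a minimal sketch of the client settings usually meant by "highest durability plus exactly-once" with the plain Java clients; the values shown are the commonly cited ones, not the question's actual configuration:

import java.util.Properties;

public class EosClientConfigs {
    // Producer side: idempotence, full-ISR acks, and a transactional id.
    public static Properties producerProps() {
        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092");
        p.put("enable.idempotence", "true");
        p.put("acks", "all");
        p.put("transactional.id", "my-app-tx-1");
        return p;
    }

    // Consumer side: only read committed transactional records, commit offsets manually.
    public static Properties consumerProps() {
        Properties c = new Properties();
        c.put("bootstrap.servers", "localhost:9092");
        c.put("group.id", "my-app");
        c.put("isolation.level", "read_committed");
        c.put("enable.auto.commit", "false");
        return c;
    }
}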
0
votes
1 answer
Flink Kafka Connector with DeliveryGuarantee.EXACTLY_ONCE producing duplicate messages
My Versions:
flink.version 1.15.2
scala.binary.version 2.12
java.version 1.11
My Code:
public static void main(String[] args) throws Exception { StreamExecutionEnvironment env =…

Kanad Mehta
- 1
- 3
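A minimal sketch of the KafkaSink configuration the title refers to (Flink 1.15 API); the topic name and transactional-id prefix are illustrative. Consumers verifying the output only see committed records if they use isolation.level=read_committed:

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;

public class ExactlyOnceSink {
    public static KafkaSink<String> build() {
        return KafkaSink.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("output-topic")
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                // Records become visible to read_committed consumers only when the
                // Flink checkpoint that covers them completes.
                .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                // Each application instance needs its own unique prefix; the producer's
                // transaction.timeout.ms must stay below the broker's transaction.max.timeout.ms.
                .setTransactionalIdPrefix("my-flink-app")
                .build();
    }
}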
0
votes
0 answers
Guarantee exactly-once Spark Structured Streaming foreachBatch
I need to guarantee an exactly-once series of writes in a foreachBatch.
For example, I have:
a stream with two writes to HBase and one to HDFS
two writes to HDFS in different folders
I want to write everything only when I'm sure that every operation will succeed,…

D. belvedere
- 45
- 2
- 8
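A minimal sketch of a foreachBatch that tries to make a multi-sink write replay-safe by keying everything on batchId; the paths are illustrative, and isBatchDone/markBatchDone are hypothetical placeholders for whatever commit marker the job uses, not Spark APIs:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.streaming.StreamingQuery;

public class ForeachBatchSketch {
    static StreamingQuery start(Dataset<Row> stream) throws Exception {
        return stream.writeStream()
            .foreachBatch((Dataset<Row> batch, Long batchId) -> {
                if (isBatchDone(batchId)) {      // hypothetical: skip an already-written batch on replay
                    return;
                }
                batch.persist();                 // reuse the same data for every sink
                batch.write().mode("overwrite").parquet("/data/sink-a/batch=" + batchId);
                batch.write().mode("overwrite").parquet("/data/sink-b/batch=" + batchId);
                markBatchDone(batchId);          // hypothetical commit marker written last
                batch.unpersist();
            })
            .option("checkpointLocation", "/tmp/checkpoints/foreachbatch-sketch")
            .start();
    }

    private static boolean isBatchDone(long batchId) { return false; } // placeholder
    private static void markBatchDone(long batchId) { }                // placeholder
}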
0
votes
1 answer
Kafka Streams exactly_once_v2 produces duplicates after application restart
We have a Kafka Streams application with processing.guarantee=exactly_once_v2.
Kafka version: 3.2.0
Kafka Streams version: 3.0.1
Confluent version: 7.0.1
The other configurations necessary for exactly-once processing are also…

Ivet
- 1
- 1
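For reference, a minimal sketch of the Streams configuration the question describes (application id and bootstrap servers are illustrative); consumers used to verify the output topic also need isolation.level=read_committed, otherwise records from transactions aborted during a restart are read as duplicates:

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class EosStreamsConfig {
    public static Properties props() {
        Properties p = new Properties();
        p.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-eos-app");
        p.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // exactly_once_v2 (Kafka Streams 3.x) supersedes the older exactly_once setting.
        p.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE_V2);
        return p;
    }
}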
0
votes
1 answer
Does a Flink streaming job maintain its keyed value state between job runs?
Our use case is to use Flink streaming for a de-duplication job, which reads its data from a source (Kafka topic) and writes unique records into an HDFS file sink.
The Kafka topic could contain duplicate data, which can be identified by using a composite…

S Mishra
- 57
- 10
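A minimal sketch of the keyed de-duplicator described, assuming the stream is keyed by the composite key; the keyed ValueState survives between runs only when the new run is started from a checkpoint or savepoint of the previous one:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class DeduplicateFn extends KeyedProcessFunction<String, String, String> {

    private transient ValueState<Boolean> seen;

    @Override
    public void open(Configuration parameters) {
        seen = getRuntimeContext().getState(new ValueStateDescriptor<>("seen", Boolean.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
        if (seen.value() == null) {   // first record observed for this key
            seen.update(true);
            out.collect(value);       // forward only unique records to the HDFS sink
        }
    }
}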
0
votes
0 answers
How to deal with an exactly-once HTTP request
I am doing a POC to implement a system like this:
User: has credit
A server will make an HTTP request on the user's behalf; when the response status is 200, mark it as completed and subtract the user's credit
The HTTP request could be a very critical…

end_lesslove2012
- 1
- 2
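A minimal sketch of the flow described, using the JDK's built-in HttpClient; markCompleted and subtractCredit are hypothetical placeholders for the POC's storage, and the comment marks the spot where a crash turns a retry into a second critical request:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreditCall {
    private final HttpClient client = HttpClient.newHttpClient();

    void callOnBehalfOf(String userId, String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

        if (response.statusCode() == 200) {
            // A crash between the send() above and the bookkeeping below means the
            // critical request happened but was never marked completed.
            markCompleted(userId);
            subtractCredit(userId);
        }
    }

    private void markCompleted(String userId) { /* hypothetical persistence */ }
    private void subtractCredit(String userId) { /* hypothetical persistence */ }
}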
0
votes
0 answers
Is Spark Structured Streaming exactly-once when partitioning by system time?
Let's say I have a Kafka topic without any duplicate messages.
If I consumed this topic with Spark Structured Streaming, added a column with currentTime(), partitioned by this time column, and saved the records to S3, would there be a risk of…

Konrad Paniec
- 11
- 1
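A minimal sketch of the setup being asked about: read from Kafka, stamp rows with the processing time, and partition the S3 output by a column derived from it; topic, bucket, and column names are illustrative:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.current_timestamp;
import static org.apache.spark.sql.functions.to_date;

public class PartitionByProcessingTime {
    public static StreamingQuery start(SparkSession spark) throws Exception {
        Dataset<Row> input = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "input-topic")
                .load();

        return input
                .withColumn("ingest_ts", current_timestamp())        // assigned at processing time
                .withColumn("ingest_date", to_date(col("ingest_ts")))
                .writeStream()
                .format("parquet")
                .partitionBy("ingest_date")
                .option("path", "s3a://my-bucket/output/")
                .option("checkpointLocation", "s3a://my-bucket/checkpoints/pbpt/")
                .start();
    }
}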
0
votes
0 answers
How to implement exactly-once on Apache Kafka using Python?
I am trying to implement an exactly-once concept using the confluent_kafka library, but I couldn't succeed. Here is my producer code:
from confluent_kafka import Producer, SerializingProducer
import socket, json, random, time
conf =…

Yunus Emrah Uluçay
- 129
- 9
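The question's code uses the Python confluent_kafka client, whose Producer exposes init_transactions / begin_transaction / commit_transaction; as a point of reference, the same transactional produce flow in the Java client looks roughly like this sketch (topic and transactional.id are illustrative):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TransactionalProduce {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("transactional.id", "poc-producer-1");   // also enables idempotence
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("output-topic", "key", "value"));
            producer.commitTransaction();   // visible exactly once to read_committed consumers
        }
    }
}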
0
votes
1 answer
Invalid transition attempted from state COMMITTING_TRANSACTION to state ABORTING_TRANSACTION in Producer
I am trying to achieve exactly-once functionality but am getting a KafkaException with the message "org.apache.kafka.common.KafkaException: TransactionalId db13196c-6974-48b0-9835-aed40cec4ca4: Invalid transition attempted from state…

Harshit Vijayvargia
- 173
- 2
- 12
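For context, the producer enforces a transaction state machine, and abortTransaction() is only legal from certain states; the KafkaProducer javadoc's suggested handling, sketched below, aborts only on non-fatal KafkaExceptions and closes the producer on fatal ones:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.AuthorizationException;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;
import org.apache.kafka.common.errors.ProducerFencedException;

public class TxErrorHandling {
    static void produceInTransaction(KafkaProducer<String, String> producer,
                                     ProducerRecord<String, String> record) {
        try {
            producer.beginTransaction();
            producer.send(record);
            producer.commitTransaction();
        } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
            producer.close();               // fatal: do not try to abort
        } catch (KafkaException e) {
            producer.abortTransaction();    // recoverable: abort, then retry with a new transaction
        }
    }
}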
0
votes
1 answer
Kafka EOS retry flag
I have a Kafka cluster and a Spring Boot application that is configured for EOS. The application consumes from topic A, performs some business logic, then produces to topic B.
The issue I am facing: if EOS fails to write to topic B, it retries and all…

Aleshan
- 31
- 6
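For reference, the read-process-write transaction that EOS setups (including Spring's) wrap looks roughly like this sketch with the plain Java clients; topic names, the transformation, and the group id are illustrative. Committing the input offsets inside the same transaction is what ties the write's retry behaviour to the consumer position:

import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.TopicPartition;

public class ConsumeTransformProduce {
    static void pollOnce(KafkaConsumer<String, String> consumer,
                         KafkaProducer<String, String> producer,
                         String groupId) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
        if (records.isEmpty()) return;

        producer.beginTransaction();
        Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
        for (ConsumerRecord<String, String> r : records) {
            producer.send(new ProducerRecord<>("topic-B", r.key(), r.value().toUpperCase()));
            offsets.put(new TopicPartition(r.topic(), r.partition()),
                        new OffsetAndMetadata(r.offset() + 1));
        }
        // Input offsets are committed atomically with the output records, so an aborted
        // write also rewinds the consumer position for the retry.
        producer.sendOffsetsToTransaction(offsets, groupId);
        producer.commitTransaction();
    }
}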
0
votes
1 answer
ksqlDB Exactly-Once Processing Guarantee
I was testing the exactly-once semantics on a ksqlDB server by very ungracefully shutting down the running Docker process or letting the Docker container run out of memory. In both cases I receive duplicates, which is definitely not the guaranteed…

Nikki
- 404
- 4
- 14