I have a project that uses Apache Samza, and I have a problem with duplicate data.

This is my checkpoint configuration:

task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
task.checkpoint.system=kafka
task.checkpoint.replication.factor=2
task.commit.ms=20000

The documentation says this about task.commit.ms:

If task.checkpoint.factory is configured, this property determines how often a checkpoint is written. The value is the time between checkpoints, in milliseconds. The frequency of checkpointing affects failure recovery: if a container fails unexpectedly (e.g. due to crash or machine failure) and is restarted, it resumes processing at the last checkpoint. Any messages processed since the last checkpoint on the failed container are processed again. Checkpointing more frequently reduces the number of messages that may be processed twice, but also uses more resources.
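To make the trade-off in that paragraph concrete, here is a rough back-of-the-envelope sketch. It is not Samza code: the function names and the throughput of 1000 messages per second are hypothetical, and it assumes a steady message rate with a crash right before the next checkpoint (the worst case).

```python
def max_replayed_messages(commit_ms: int, msgs_per_sec: float) -> float:
    """Rough upper bound on messages processed again after a crash.

    Assumes a steady rate and a crash just before the next checkpoint,
    so everything handled during one full commit interval is replayed.
    """
    return commit_ms * msgs_per_sec / 1000


def checkpoints_per_hour(commit_ms: int) -> float:
    """Number of checkpoint writes per hour for a given commit interval."""
    return 3_600_000 / commit_ms


if __name__ == "__main__":
    rate = 1000  # hypothetical throughput, messages per second
    for commit_ms in (20000, 5000, 250, 1):
        print(
            f"task.commit.ms={commit_ms}: "
            f"up to ~{max_replayed_messages(commit_ms, rate):.0f} replayed messages, "
            f"{checkpoints_per_hour(commit_ms):.0f} checkpoint writes per hour"
        )
```

A shorter interval shrinks the replay window proportionally, but the number of checkpoint writes grows by the same factor, which is the "uses more resources" part of the quoted documentation.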

So can I change task.commit.ms from 20000 to 250 ms, or even 1 ms? Would that be fine, or a very bad idea? I have a very good cluster.

I need to change this because the Samza worker crashes 1-3 times each week. For now, my temporary workaround is to commit the offset every time.


Documentation ref :

Apache-Samza

Apache-Samza-Configuration

MaximeF
  • Why does a program crash every week 1-3 times? Put lead around that computer – Bálint Aug 09 '16 at 18:15
  • I know the problem but it's not important. – MaximeF Aug 09 '16 at 18:55
  • https://xkcd.com/1495/ Closely related – Bálint Aug 09 '16 at 19:00
  • The problem is a connection issue with the server: my cluster in the US has x nodes, and I have a Samza worker that connects to another cluster in Europe. The sysadmin told me "I don't know where the problem is...", so for me it's very important to fix the duplicate data right now. – MaximeF Aug 09 '16 at 19:07
  • You should just set it to 100 ms. Checkpointing takes time away from the computation. – Bálint Aug 09 '16 at 19:08
  • @MaximeF have you been running successfully at 100ms? – perkss Apr 04 '21 at 08:55

1 Answer

My solution, which I know won't solve every problem, was to set task.commit.ms to the same value as task.shutdown.ms, i.e. 5000.

Atlas-Samza-Configuration Shutdown

MaximeF