  • Nebula version: v3.5.0
  • Deployment method: distributed
  • Installation method: Docker
  • Production environment: Y
  • Hardware information
    • Disk: SATA
    • CPU/Memory: 64C 128G

We submit a Spark Streaming program via spark-submit to consume data from a Kafka topic. Suppose the program stops suddenly after it has consumed up to offset 100, and while it is stopped another 100 messages are produced to the topic. When the Spark Streaming job is restarted, from which offset does Exchange resume consuming Kafka? According to our tracking, consumption resumes from offset 200, which means the data at offsets 100-200 is lost.

How should I configure it so that consumption resumes from offset 100?
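I'm not sure which option Exchange exposes for this, but the behavior you describe (jumping to the latest offset on restart) is what a Kafka consumer does when it has no committed offset to resume from. With the plain spark-streaming-kafka-0-10 API, the usual fix is a stable `group.id`, `enable.auto.commit=false`, and committing offsets manually only after each batch succeeds. A minimal sketch follows; the broker address, group id, and topic name are placeholders, and the NebulaGraph write step is elided:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object ResumeFromCommittedOffsets {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("resume-from-committed-offsets")
    val ssc  = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "kafka-broker:9092",   // placeholder broker address
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      // A stable group.id is what lets the consumer find its committed
      // offsets again after a restart.
      "group.id" -> "nebula-exchange-consumer",     // placeholder group id
      // Only applies when the group has no committed offset yet; it does
      // not override committed offsets.
      "auto.offset.reset" -> "earliest",
      // Disable auto-commit so offsets are committed only after the batch
      // has actually been processed.
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent,
      Subscribe[String, String](Seq("my-topic"), kafkaParams)) // placeholder topic

    stream.foreachRDD { rdd =>
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      // ... write this batch to NebulaGraph here ...
      // Commit only after the write succeeds, so a crash before this point
      // replays the batch on restart instead of skipping it.
      stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

This gives at-least-once semantics: if the job dies between the write and the commit, the batch is reprocessed rather than lost.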
