Let's say I have a Kafka topic without any duplicate messages.
If I consume this topic with Spark Structured Streaming, add a column with `current_timestamp()`, partition by this time column, and save the records to S3, is there a risk of creating duplicates in S3 in case of failures?
Or is Spark smart enough to deliver these messages to S3 exactly once?
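For concreteness, here is a minimal sketch of the job I have in mind (assuming PySpark; the broker, topic, and bucket names are placeholders, and the Kafka connector package is on the classpath):

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import current_timestamp, to_date, col

spark = SparkSession.builder.appName("kafka-to-s3").getOrCreate()

stream = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "events")                     # placeholder topic
    .load()
    # Processing-time column added on the Spark side, not from the message
    .withColumn("ingest_ts", current_timestamp())
    # Coarser key derived from the timestamp so S3 partitions stay manageable
    .withColumn("ingest_date", to_date(col("ingest_ts")))
)

query = (
    stream.writeStream
    .format("parquet")
    .option("path", "s3a://my-bucket/events/")            # placeholder bucket
    .option("checkpointLocation", "s3a://my-bucket/chk/") # required for recovery
    .partitionBy("ingest_date")
    .start()
)

query.awaitTermination()
```

I include a `checkpointLocation` since it's required for failure recovery, which is exactly the scenario I'm asking about.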