I am working on a Spark Structured Streaming project and I am facing an issue with checkpointing.
Our HDFS data has a 25-day retention policy and is partitioned by day, and we delete expired files from HDFS on a daily basis. However, the checkpoint files of my streaming job record every file name since the job first started. If I clean up the checkpoint directory, the job has to start again and reprocess all 25 days of data. So I need to drop checkpoint entries according to my retention policy, but the latest .compact file in the checkpoint still stores all the file names from the beginning. Please help me resolve this issue.