I want to load the first 10 XML files in each iteration from a directory containing 100 files, and move each XML file that has already been read to another directory.
What I have tried so far in PySpark:
li = ["/mnt/dev/tmp/xml/100_file/M800143.xml", "/mnt/dev/tmp/xml/100_file/M8001422.xml"]
df1 = spark.read.format("com.databricks.spark.xml").option("rowTag", "Quality").load(li)
df1.show()
But I am getting an error: IllegalArgumentException: 'path' must be specified for XML data.
Is there any way to read the files after storing their full paths in a list? Or please suggest another approach.
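For reference, here is a rough sketch of one approach I have seen suggested: since the spark-xml reader can reject a Python list, the paths can be joined into a single comma-separated string before calling load, and the already-read files moved with standard library tools. Everything below is an untested assumption on my side: the helper names (next_batch, move_batch), the batch size, and the demo directories are all hypothetical, and on Databricks mounts you would likely use dbutils.fs.mv instead of shutil.

```python
import shutil
import tempfile
from pathlib import Path

BATCH_SIZE = 10  # assumed batch size from the question (10 files per iteration)

def next_batch(src_dir, batch_size=BATCH_SIZE):
    """Return up to batch_size XML file paths from src_dir, sorted by name."""
    return sorted(Path(src_dir).glob("*.xml"))[:batch_size]

def move_batch(files, dest_dir):
    """Move already-read files out of the source directory."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for f in files:
        shutil.move(str(f), str(dest / f.name))

# Demo with throwaway directories standing in for the real /mnt paths:
src = Path(tempfile.mkdtemp())
done = Path(tempfile.mkdtemp())
for i in range(25):
    (src / f"M{i:07d}.xml").write_text("<Quality/>")

batch = next_batch(src)  # first 10 files by name
# Hypothetical Spark call, mirroring the question's reader options;
# joining the paths into one string avoids passing a list to load():
# df1 = (spark.read.format("com.databricks.spark.xml")
#          .option("rowTag", "Quality")
#          .load(",".join(str(p) for p in batch)))
move_batch(batch, done)
print(len(list(src.glob("*.xml"))), len(list(done.glob("*.xml"))))  # 15 10
```

The sorted-then-slice step is what gives a stable "first 10" per iteration; after the move, the next call to next_batch naturally picks up the following 10 files.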