I have a PySpark DataFrame (`data`). I need to split the DataFrame by multiple columns and save each part as CSV into its own folder, where the folder names are based on the partition columns.
PATH = '/../' + data['Col1'] + data['Col2'] + data['Col3'] + '/'
data.write.partitionBy(['Col1','Col2']).csv(PATH)
I have code like this, but I know it has a lot of errors. First, I want to split by multiple columns; then I want the folders to be created with the same names as the column names. Can anyone please tell me how to rectify the code?