I am new to PySpark.
I have read a parquet file. I only want to keep columns that have atleast 10 values
I have used describe to get the count of not-null records for each column
How do I now extract the column names that have less than 10 values and then drop those columns before writing to a new file
df = spark.read.parquet(file)
col_count = df.describe().filter($"summary" == "count")