How can we convert PySpark DataFrame column values to a list in DLT?
I tried using collect(), toPandas(), collect_list(), and toLocalIterator() to convert the DataFrame df_year (which has a single column containing only year values) into a Python list, but none of these work inside a DLT pipeline. The output is always an empty list. These are the two attempts:
# Attempt 1: collect() the rows on the driver and build a list
years = [row.YEAR for row in df_year.collect() if row.YEAR is not None]
# Attempt 2: convert to pandas and take the column values as a list
import pandas as pd
years = df_year.select("YEAR").toPandas()["YEAR"].tolist()
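For context, this is a minimal sketch of how the pipeline is set up. The source table name "my_catalog.my_schema.sales", the column "order_date", and the view/table names are placeholders I made up to illustrate the shape of the pipeline; only the failing collect() pattern is the actual code in question. The spark session object is the one DLT provides at runtime.

import dlt
from pyspark.sql import functions as F

@dlt.view()
def years_view():
    # One column named YEAR, holding distinct year values
    # (source table and date column are hypothetical examples).
    return (
        spark.read.table("my_catalog.my_schema.sales")
        .select(F.year("order_date").alias("YEAR"))
        .distinct()
    )

@dlt.table()
def yearly_summary():
    df_year = dlt.read("years_view")
    # This is the driver-side conversion that comes back as an empty list
    # when it runs inside the DLT pipeline:
    years = [row.YEAR for row in df_year.collect() if row.YEAR is not None]
    # The list is meant to drive downstream logic, e.g. filtering by year.
    return dlt.read("years_view").filter(F.col("YEAR").isin(years))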