I would like to convert one columns with dict values to expand columns with values as follows:
+-------+--------------------------------------------+
| Idx| value |
+-------+--------------------------------------------+
| 123|{'country_code': 'gb','postal_area': 'CR'} |
| 456|{'country_code': 'cn','postal_area': 'RS'} |
| 789|{'country_code': 'cl','postal_area': 'QS'} |
+-------+--------------------------------------------+
then i would like to get something like this:
display(df)
+-------+-------------------------------+
| Idx| country_code | postal_area |
+-------+-------------------------------+
| 123| gb | CR |
| 456| cn | RS |
| 789| cl | QS |
+-------+-------------------------------+
i just Try to do only for one line something like this:
#PySpark code
sc = spark.sparkContext
dict_lst = {'country_code': 'gb','postal_area': 'CR'}
rdd = sc.parallelize([json.dumps(dict_lst)])
df = spark.read.json(rdd)
display(df)
and i've got:
+-------------+-------------+
|country_code | postal_area |
+-------------+-------------+
| bg | CR |
+-------------+-------------+
so, here maybe i have part of the solution. now i would like to know hoy can i concat df with dataframe Result