I'm just starting my adventure with Dask, and I'm learning on an example dataset in JSON format. I know that this is not the easiest data format in the world for a beginner :)
I loaded the data into a dataframe with dd.read_json, and that step goes well. The problem occurs when I call, for example, compute() or len() on the dataframe.
I get this error:
ValueError: Metadata mismatch found in `from_delayed`.
Partition type: `DataFrame`
+----------+-------+----------+
| Column   | Found | Expected |
+----------+-------+----------+
| column1  | -     | object   |
| column2  | -     | object   |
+----------+-------+----------+
I've tried different things, but nothing has helped, and I don't know how to handle this error.
Please help, I would be very grateful!