I am trying to concatenate the following two Dask DataFrames:

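import pandas as pd
import dask.dataframe as dd

# The first row of each list holds the column names; the remaining rows are data.
df_temp = [
    ['A','B','C','D','E','F'],
    [1, 4, 8, 1, 3, 5],
    [6, 6, 2, 2, 0, 0],
    [9, 4, 5, 0, 6, 35],
    [0, 1, 7, 10, 9, 4],
    [0, 7, 2, 6, 1, 2]
    ]

df_all = [
    ['C','D','E','F','G','H'],
    [1, 4, 8, 1, 3, 5],
    [6, 6, 2, 2, 0, 0],
    [9, 4, 5, 0, 6, 35],
    [0, 1, 7, 10, 9, 4],
    [0, 7, 2, 6, 1, 2]
    ]

# Turn the raw lists into Dask DataFrames before concatenating them.
df_temp = dd.from_pandas(pd.DataFrame(df_temp[1:], columns=df_temp[0]), npartitions=1)
df_all = dd.from_pandas(pd.DataFrame(df_all[1:], columns=df_all[0]), npartitions=1)
df_all = dd.concat([df_temp, df_all])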

What am I missing here? I am getting the following error:

  File ~\AppData\Local\anaconda3\Lib\site-packages\spyder_kernels\py3compat.py:356 in compat_exec
    exec(code, globals, locals)

  File e:\projects\2023\long term projection analysis\fg & fgi pyval\run outputs\aggregated outputs\cf aggregation production\aggregation_v4 monthly fgi.py:77
    df_all = df_all.append(df_temp)

  File ~\AppData\Local\anaconda3\Lib\site-packages\dask\dataframe\core.py:5805 in append
    return super().append(other, interleave_partitions=interleave_partitions)

  File ~\AppData\Local\anaconda3\Lib\site-packages\dask\dataframe\core.py:3501 in append
    return concat(

  File ~\AppData\Local\anaconda3\Lib\site-packages\dask\dataframe\multi.py:1329 in concat
    return stack_partitions(

  File ~\AppData\Local\anaconda3\Lib\site-packages\dask\dataframe\multi.py:1104 in stack_partitions
    needs_astype = [

  File ~\AppData\Local\anaconda3\Lib\site-packages\dask\dataframe\multi.py:1107 in <listcomp>
    if df[col].dtype != meta[col].dtype

  File ~\AppData\Local\anaconda3\Lib\site-packages\dask\dataframe\core.py:4904 in __getitem__
    raise NotImplementedError(key)

I was not getting this error on a different computer. I am now using a VM with a different environment and I am not able to run this code.

N27
  • Could you also give the Dask versions you are using? – Guillaume EB Aug 03 '23 at 11:41
  • @GuillaumeEB I have 2.11.0 on my personal laptop, and my VM reports 2023.6.0. I am confused because those version numbers look so different. – N27 Aug 30 '23 at 14:22
  • @GuillaumeEB It seems that in older versions of Dask, or in pandas, concatenating two dataframes filled the non-overlapping columns with NaN. In the newest version (assuming 2023.6.0 is newer), it just throws a key error. I cannot find a workaround anywhere, and I need to join about 20 frames with rolling columns that partially overlap. – N27 Aug 31 '23 at 18:34
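
To make the behaviour described in the last comment concrete, here is a minimal sketch in plain pandas (not Dask) of how columns missing from one frame are filled with NaN on concatenation. It also shows one possible way to align every frame to the union of columns explicitly before concatenating. The names pdf_temp, pdf_all and all_columns are made up for illustration, and the data is a shortened copy of the frames in the question. Whether the same alignment step avoids the NotImplementedError in Dask 2023.6.0 has not been verified here.

```python
import pandas as pd

# Shortened copies of the two frames from the question (illustrative names).
columns_temp = ['A', 'B', 'C', 'D', 'E', 'F']
columns_all = ['C', 'D', 'E', 'F', 'G', 'H']
rows = [
    [1, 4, 8, 1, 3, 5],
    [6, 6, 2, 2, 0, 0],
]
pdf_temp = pd.DataFrame(rows, columns=columns_temp)
pdf_all = pd.DataFrame(rows, columns=columns_all)

# pandas keeps the union of columns and fills the gaps with NaN.
combined = pd.concat([pdf_temp, pdf_all], ignore_index=True)

# Alternative: give every frame the same columns up front, then concatenate.
all_columns = sorted(set(columns_temp) | set(columns_all))
aligned = [pdf.reindex(columns=all_columns) for pdf in (pdf_temp, pdf_all)]
combined_aligned = pd.concat(aligned, ignore_index=True)

print(combined)
print(combined_aligned)
```

With about 20 frames, the same idea would apply by collecting the union of all column names first and reindexing each frame to it, again assuming the per-frame alignment is acceptable for the data.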
