
I have two CSV files that I am trying to merge. The first CSV file, 'Test_base1', looks like:

OBJECTID,STATEFP,COUNTYFP,TRACTCE,GEOID 

1,12,105,10300,15000US121050103002 

2,12,103,24804,15000US121030248041

3,12,105,10800,15000US121050108001 

The second CSV file, 'Test_file23_1', looks like:

GEOID,B23003e1,B23003m1,B23003e2,B23003m2,B23003e3

15000US121050103002,69,81,13,21,13

15000US121030248041,657,248,62,79,0

15000US120010004001,410,143,261,126,47  

When I merge them on the 'GEOID' field, I get an error: KeyError: 'GEOID'. But 'GEOID' is present in the column names of both files. What is the problem with my code, and what is the solution?

Update: It's not a coding error. I had a chat with @Zev, and we figured out that the code itself is fine. If I copy the data from each file into a new blank file and save it, the merge works, but we do not know the cause. Zev tried with the same files and got no error on his side, while I still get the same error on mine. Files: 1. https://www.dropbox.com/s/rg706ck6bdda4sa/Test_Base1.csv?dl=0 2. https://www.dropbox.com/s/yf49f2p7btkc56r/Test_File23_1.csv?dl=0

    import pandas as pd, numpy as np
    from functools import reduce

    # Read both CSV files into DataFrames
    df1 = pd.read_csv("Test_Base1.csv")
    df2 = pd.read_csv("Test_File23_1.csv")
    dfs = [df1, df2]

    # Outer-merge all DataFrames on the shared 'GEOID' column
    df_merged = reduce(lambda left, right: pd.merge(left, right, on=['GEOID'], how='outer'), dfs)
    df_merged.to_csv('test1.csv', index=False)

The exact error: [screenshot]
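The update above notes that copying the data into a new file makes the error go away. That hints that a column name may contain characters that are invisible in an editor, such as trailing whitespace or a byte-order mark. This is only a guess; the thread does not confirm the cause. A minimal sketch of how one might check for this and work around it is below. The in-memory CSV text (with a deliberate trailing space after `GEOID`) and the helper `clean_columns` are hypothetical, made up for illustration, and are not the real files.

```python
import io

import pandas as pd


def clean_columns(df):
    """Return a copy of df with BOM characters and surrounding
    whitespace removed from every column name (hypothetical helper)."""
    cleaned = df.copy()
    cleaned.columns = [str(c).replace("\ufeff", "").strip() for c in cleaned.columns]
    return cleaned


# Hypothetical stand-in data: note the trailing space after GEOID in the header.
base_csv = "OBJECTID,STATEFP,COUNTYFP,TRACTCE,GEOID \n1,12,105,10300,15000US121050103002\n"
acs_csv = "GEOID,B23003e1\n15000US121050103002,69\n"

df1 = pd.read_csv(io.StringIO(base_csv))
df2 = pd.read_csv(io.StringIO(acs_csv))

# repr() exposes characters that a plain print would hide.
print([repr(c) for c in df1.columns])

# With the stray space in the header, merging on 'GEOID' fails with KeyError.
try:
    pd.merge(df1, df2, on=["GEOID"], how="outer")
    merge_failed = False
except KeyError:
    merge_failed = True

# After normalizing the column names, the same merge goes through.
merged = pd.merge(clean_columns(df1), clean_columns(df2), on=["GEOID"], how="outer")
print(merged)
```

If the `repr` output for the real files shows only plain `'GEOID'` in both, this explanation does not apply and the cause lies elsewhere.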

Latika Agarwal
Tashaho
  • Please post your exact error. Also, can you post the text (open the file in notepad or Text Editor) of just the first 3 lines of each file? I wasn't able to reproduce your error. The files (I just added a few of the relevant headings and one row of data) successfully combined for me. – Zev Jun 14 '18 at 03:52
  • added screenshot – Tashaho Jun 14 '18 at 04:03
  • @Zev added the screenshot of error and pasted files. – Tashaho Jun 14 '18 at 04:12
  • Still works for me. I get a file like `OBJECTID,STATEFP,COUNTYFP,TRACTCE,GEOID,B23003e1,B23003m1,B23003e2,B23003m2,B23003e3` followed by `1.0,12.0,105.0,10300.0,15000US121050103002,69.0,81.0,13.0,21.0,13.0` and so on. – Zev Jun 14 '18 at 04:14
  • Are you getting that error from this exact code? – Zev Jun 14 '18 at 04:17
  • the exact code: "import pandas as pd, numpy as np from functools import reduce import os os.getcwd() os.chdir('C:/Users/shtanim/Desktop/Popgen2.0/Hernando_county/ACS_data_Hernando/') df1=pd.read_csv("Test_Base1.csv") df2=pd.read_csv("Test_File23_1.csv") dfs = [ df1, df2] df_merged = reduce(lambda left,right: pd.merge(left,right,on=['GEOID'],how='outer'), dfs) df_merged.to_csv('test1.csv', index=False)" – Tashaho Jun 14 '18 at 04:20
  • Everything is the same; I only added the directory change. The rest of the script is identical. – Tashaho Jun 14 '18 at 04:21
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/173108/discussion-between-zev-and-tashaho). – Zev Jun 14 '18 at 04:22
  • looks like `df1.merge(df2,on='GEOID')` works fine. what is your desired output? – nimrodz Jun 14 '18 at 04:29
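The comments above report that the merge succeeds on clean copies of the data. A small sketch of that, using the sample rows quoted in the question, is below. The rows are loaded from in-memory strings rather than the original files, which is an assumption made so the example is self-contained. It compares the default inner merge with the outer merge from the question and uses `indicator=True` to show which side each row came from.

```python
import io

import pandas as pd

# Sample rows copied from the question, held in memory instead of on disk.
base_csv = """OBJECTID,STATEFP,COUNTYFP,TRACTCE,GEOID
1,12,105,10300,15000US121050103002
2,12,103,24804,15000US121030248041
3,12,105,10800,15000US121050108001
"""
acs_csv = """GEOID,B23003e1,B23003m1,B23003e2,B23003m2,B23003e3
15000US121050103002,69,81,13,21,13
15000US121030248041,657,248,62,79,0
15000US120010004001,410,143,261,126,47
"""

df1 = pd.read_csv(io.StringIO(base_csv))
df2 = pd.read_csv(io.StringIO(acs_csv))

# Default (inner) merge keeps only GEOIDs present in both frames.
inner = df1.merge(df2, on="GEOID")

# Outer merge keeps every GEOID; the _merge column records where each row came from.
outer = df1.merge(df2, on="GEOID", how="outer", indicator=True)

print(inner)
print(outer["_merge"].value_counts())
print(outer.dtypes)
```

Because the outer merge fills unmatched rows with NaN, integer columns such as `OBJECTID` are upcast to float. That is consistent with the `1.0,12.0,...` values in the output Zev quotes.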

0 Answers