
I have two dataframes holding information about two groups of stations: one for the 15 small stations and another for the 5 main stations.

Small Station Information(15*3):

       SmallStation_ID  longitude  latitude
0           dongsi_aq    116.417    39.929
1          tiantan_aq    116.407    39.886
2         guanyuan_aq    116.339    39.929
3    wanshouxigong_aq    116.352    39.878
4     aotizhongxin_aq    116.397    39.982
5     nongzhanguan_aq    116.461    39.937
6           wanliu_aq    116.287    39.987
7       beibuxinqu_aq    116.174    40.090
8        zhiwuyuan_aq    116.207    40.002
9   fengtaihuayuan_aq    116.279    39.863
10         yungang_aq    116.146    39.824
11         gucheng_aq    116.184    39.914
12        fangshan_aq    116.136    39.742
13          daxing_aq    116.404    39.718
14        yizhuang_aq    116.506    39.795

Main Station Information(5*9):

    MainStation_id   longitude   latitude  temperature  pressure  humidity  wind_direction  wind_speed      weather
0       shunyi_meo  116.615278  40.126667         -1.7    1028.7        15           215.0         1.6  Sunny/clear
1       hadian_meo  116.290556  39.986944         -1.6    1026.1        14           231.0         2.5  Sunny/clear
2      yanqing_meo  115.968889  40.449444         -8.8     970.8        35           305.0         0.8         Haze
3        miyun_meo  116.864167  40.377500         -6.6    1023.3        28        999017.0         0.2         Haze
4      huairou_meo  116.626944  40.357778         -5.2    1022.8        27            30.0         0.8  Sunny/clear

I want to assign each small station to its closest main station, using the Euclidean distance sqrt((x1-x2)^2 + (y1-y2)^2), where x and y are longitude and latitude, respectively. Then I want to merge the two dataframes so that each small station gets the weather information of its nearest main station. The header of the final dataframe should look like this:

 SmallStation_ID  distance  temperature  pressure  humidity  wind_direction  wind_speed      weather

It is a 15*8 dataframe.
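
To make the distance step concrete, here is a minimal sketch for a single small station, using only the coordinates listed in the tables above. The helper name `euclidean` and the tuple layout are illustrative assumptions, not part of the original data or code.

```python
import math

# Coordinates copied from the tables in the question.
small = ("dongsi_aq", 116.417, 39.929)
mains = [
    ("shunyi_meo", 116.615278, 40.126667),
    ("hadian_meo", 116.290556, 39.986944),
    ("yanqing_meo", 115.968889, 40.449444),
    ("miyun_meo", 116.864167, 40.377500),
    ("huairou_meo", 116.626944, 40.357778),
]

def euclidean(lon1, lat1, lon2, lat2):
    """Hypothetical helper: sqrt((x1-x2)^2 + (y1-y2)^2), measured in degrees."""
    return math.hypot(lon1 - lon2, lat1 - lat2)

# Distance from the small station to every main station.
distances = {name: euclidean(small[1], small[2], lon, lat) for name, lon, lat in mains}

# The nearest main station is the one with the smallest distance.
nearest = min(distances, key=distances.get)
print(nearest, distances[nearest])
```

Repeating this for all 15 small stations gives the station to merge on and the value for the `distance` column.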

Hope I made this question clear. Thanks!

Jiayu Zhang

1 Answer


Alright... I solved this myself.

I am not very familiar with iteration, and I hope someone can show me an easier-to-understand way using pandas functions.

Small Station is a dataframe called aqstation.

Main Station is a dataframe called meostation.

import pandas as pd

nearest = []
# All I want to do is merge the main stations' weather information into the small stations.

Calculate the Euclidean distance, then assign each small station to its closest main station.

for i in range(len(aqstation)):
    # Distance from small station i to every main station (a Series indexed like meostation).
    dist = ((aqstation['longitude'][i] - meostation['longitude'])**2
            + (aqstation['latitude'][i] - meostation['latitude'])**2)**0.5
    # idxmin() returns the index label of the closest main station.
    nearest.append(meostation['station_id'][dist.idxmin()])

aqstation['station_id'] = nearest


del aqstation['longitude']
del aqstation['latitude']
del meostation['longitude']
del meostation['latitude']

Merge the two dataframes.

aqstation = pd.merge(aqstation, meostation, how='left', on='station_id')
print(aqstation.head(10))

          Station ID       station_id  temperature  pressure  humidity  wind_direction  wind_speed      weather
0          dongsi_aq     chaoyang_meo         -0.7    1027.9        13           239.0         2.7  Sunny/clear
1         tiantan_aq      beijing_meo         -2.5    1028.5        16           225.0         2.4         Haze
2        guanyuan_aq       hadian_meo         -1.6    1026.1        14           231.0         2.5  Sunny/clear
3   wanshouxigong_aq      fengtai_meo         -1.4    1025.2        16           210.0         1.4  Sunny/clear
4    aotizhongxin_aq       hadian_meo         -1.6    1026.1        14           231.0         2.5  Sunny/clear
5    nongzhanguan_aq     chaoyang_meo         -0.7    1027.9        13           239.0         2.7  Sunny/clear
6          wanliu_aq       hadian_meo         -1.6    1026.1        14           231.0         2.5  Sunny/clear
7      beibuxinqu_aq    pingchang_meo         -3.0    1022.5        17           108.0         1.1  Sunny/clear
8       zhiwuyuan_aq  shijingshan_meo         -1.8    1024.0        12           201.0         2.5  Sunny/clear
9  fengtaihuayuan_aq      fengtai_meo         -1.4    1025.2        16           210.0         1.4  Sunny/clear

My code is very lengthy. Hope someone can make it simpler.
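
One possible shorter route, sketched here under assumptions rather than offered as the definitive answer, is to compute all pairwise distances at once with NumPy broadcasting and then merge. The two small frames below are stand-ins built from a few rows of the question's tables (with `temperature` as the only weather column), and the column names `SmallStation_ID` and `station_id` are assumed to match the real frames.

```python
import numpy as np
import pandas as pd

# Stand-in frames; the coordinates and temperatures come from the question's tables.
aqstation = pd.DataFrame({
    'SmallStation_ID': ['dongsi_aq', 'wanliu_aq', 'beibuxinqu_aq'],
    'longitude': [116.417, 116.287, 116.174],
    'latitude': [39.929, 39.987, 40.090],
})
meostation = pd.DataFrame({
    'station_id': ['shunyi_meo', 'hadian_meo', 'yanqing_meo'],
    'longitude': [116.615278, 116.290556, 115.968889],
    'latitude': [40.126667, 39.986944, 40.449444],
    'temperature': [-1.7, -1.6, -8.8],
})

# Pairwise coordinate differences via broadcasting: shape (n_small, n_main).
dx = aqstation['longitude'].to_numpy()[:, None] - meostation['longitude'].to_numpy()[None, :]
dy = aqstation['latitude'].to_numpy()[:, None] - meostation['latitude'].to_numpy()[None, :]
dist = np.hypot(dx, dy)

# Position of the closest main station for each small station.
nearest = dist.argmin(axis=1)

result = aqstation[['SmallStation_ID']].copy()
result['station_id'] = meostation['station_id'].to_numpy()[nearest]
result['distance'] = dist[np.arange(len(aqstation)), nearest]

# Attach the weather columns of the matched main station.
weather = meostation.drop(columns=['longitude', 'latitude'])
result = result.merge(weather, on='station_id', how='left')
print(result)
```

This keeps the `distance` column asked for in the question. The distance is in degrees of longitude/latitude, exactly as in the formula above, so it is only a rough proxy for distance on the ground.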

Jiayu Zhang