I have a large cell grid. Each cell has an ID (p1), a cell value (p3) and coordinates in actual measures (X, Y). This is what the first 10 rows/cells look like:
   p1   p2    p3    X  Y
0   0  0.0   0.0    0  0
1   1  0.0   0.0  100  0
2   2  0.0  12.0  200  0
3   3  0.0   0.0  300  0
4   4  0.0  70.0  400  0
5   5  0.0  40.0  500  0
6   6  0.0  20.0  600  0
7   7  0.0   0.0  700  0
8   8  0.0   0.0  800  0
9   9  0.0   0.0  900  0
The neighbouring cells of cell i in p1 can be determined as (i-500+1, i-500-1, i-1, i+1, i+500+1, i+500-1).
For example, the cell with p1 = 5 has the neighbours 4, 6, 504, 505 and 506 (these are the row IDs, p1, from the table above).
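The neighbour rule above can be sketched as a small helper. This is only an illustration under stated assumptions: the IDs are taken to be row-major with 500 cells per row, the number of rows (n_rows) is a made-up value, and the cells directly above and below (i-500, i+500) are included because the example lists 505 as a neighbour of 5. The name neighbour_ids is hypothetical.

```python
def neighbour_ids(i, width=500, n_rows=500):
    """Return the IDs of the cells adjacent to cell i (up to 8 of them).

    Assumes row-major numbering: ID = row * width + col.
    n_rows is a hypothetical grid height, used only to clip at the edges.
    """
    row, col = divmod(i, width)
    ids = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            r, c = row + dr, col + dc
            # keep only neighbours that lie inside the grid
            if 0 <= r < n_rows and 0 <= c < width:
                ids.append(r * width + c)
    return sorted(ids)


print(neighbour_ids(5))  # [4, 6, 504, 505, 506]
```

Working with (row, col) instead of raw offsets avoids wrapping around at the left and right edges of the grid, which plain i-1 / i+1 arithmetic would do.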
What I am trying to do is:
For a chosen value/row i in p1, I would like to find all neighbours within a chosen distance from i and sum all their p3 values.
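One possible reading of "distance" is the Euclidean distance between cell coordinates, in the same units as X and Y. The sketch below follows that assumption and uses the first 10 rows shown above as toy data; the function name sum_within_distance is hypothetical, and the chosen cell itself is excluded from the sum.

```python
import numpy as np
import pandas as pd

# Toy data: the first 10 rows from the table in the question.
df = pd.DataFrame({
    "p1": range(10),
    "p2": [0.0] * 10,
    "p3": [0.0, 0.0, 12.0, 0.0, 70.0, 40.0, 20.0, 0.0, 0.0, 0.0],
    "X": range(0, 1000, 100),
    "Y": [0] * 10,
})


def sum_within_distance(df, i, distance):
    """Sum p3 over all cells whose (X, Y) lies within `distance` of cell i."""
    centre = df.loc[df["p1"] == i].iloc[0]
    # Euclidean distance from every cell to the chosen cell
    d = np.hypot(df["X"] - centre["X"], df["Y"] - centre["Y"])
    # keep cells inside the radius, excluding the chosen cell itself
    mask = (d <= distance) & (df["p1"] != i)
    return df.loc[mask, "p3"].sum()


print(sum_within_distance(df, 5, 100))  # 90.0  (cells 4 and 6)
print(sum_within_distance(df, 5, 300))  # 102.0 (cells 2, 3, 4, 6, 7, 8)
```

This scans every row for each query, which is simple but may be slow on a very large grid; restricting the search to nearby IDs first would be one way to speed it up.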
I tried to apply this solution (link), but I don't know how to incorporate the distance parameter. The cell value can be taken with df.iloc, but the steps before this are a bit tricky for me.
Can you give me any advice?
EDIT:
Using the solution from Thomas, with a DataFrame called CO:
p3
0 45
1 580
2 12000
3 12531
4 22456
I'd like to add another column that uses the values from the p3 column:
CO['new'] = format(sum_neighbors(data, CO['p3']))
But it doesn't work. If I pass a number instead of a reference to the column CO['p3'], it works like a charm. But how can I use the values from the p3 column automatically in the format function?
SOLVED: It worked with:
CO['new'] = CO.apply(lambda row: sum_neighbors(data, row.p3), axis=1)
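For reference, a minimal sketch of why the row-wise version works: apply with axis=1 calls the function once per row, so row.p3 is a single value rather than the whole column. The sum_neighbors below is only a hypothetical stand-in (it adds the two entries next to position i in a list), not the function from Thomas's answer, and the data values are made up.

```python
import pandas as pd


def sum_neighbors(data, i):
    # Hypothetical stand-in: sum the entries directly before and after position i.
    return sum(data[j] for j in (i - 1, i + 1) if 0 <= j < len(data))


data = [0.0, 0.0, 12.0, 0.0, 70.0, 40.0, 20.0]
CO = pd.DataFrame({"p3": [1, 3, 5]})

# axis=1 passes one row at a time, so row.p3 is a scalar
CO["new"] = CO.apply(lambda row: sum_neighbors(data, row.p3), axis=1)
print(CO["new"].tolist())  # [12.0, 82.0, 90.0]
```

When only one column is needed, CO["p3"].map(lambda i: sum_neighbors(data, i)) would be an equivalent and slightly shorter form.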