How to count the frequency of elements located in small grid cells?

Question

I have lots of three-column data in files, such as:

longitude  latitude   count
20.12      50.45       1
35.78      24.26       1
20.48      50.16       2
...         ...       ...

The map (longitude and latitude) is split into many grid cells of size 0.5*0.5, for example:

longitude: [0, 0.5), [0.5, 1.0), ... , [179.5, 180.0)
latitude : [-90, -89.5), [-89.5, -89.0), ... , [89.5, 90.0]

Every cell on the map therefore measures 0.5*0.5.

For example, the 1st and 3rd records above both fall in the cell with longitude [20.0, 20.5) and latitude [50.0, 50.5), so the count for that cell is 1+2=3.

So, how can I write a program, with awk or another scripting language, to get the count for each grid cell from the data files? And how can I plot the result?

What have you tried? Also, what do you mean by _"How to plot the result"_? — Jim Garrison, Nov 05 '12 at 04:54

amaurea · Answer 1 · 2012-11-04T18:28:19.310

I think this will do what you want:

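awk 'function floor(x,   y){
    # int() truncates toward zero, so step down by one for negative non-integers
    y=int(x); return y>x?y-1:y
}{
    ilon=floor($1/0.5)    # longitude cell index
    ilat=floor($2/0.5)    # latitude cell index
    hist[ilat,ilon]+=$3   # accumulate the count column
}END{
    # one output row per latitude step, one column per longitude step
    for(ilat=-180;ilat<=180;ilat++){
        for(ilon=-360;ilon<=360;ilon++)
            printf(" %4d", hist[ilat,ilon])
        printf("\n")
    }
}'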

Note: I hardcoded the limits of lon [-180:180] and lat [-90:90], as well as the step size, which is why the loops run over ±360 and ±180. To be more general, you would calculate the integer limits of your array from your current step size (I imagine you might want to use steps other than 0.5) and from the lat/lon range.
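The generalisation described in this note could look roughly like the sketch below. The wrapper name grid_counts, its argument order, and the awk variable names (step, lonmin, lonmax, latmin, latmax) are made up for illustration. The binning logic is the same as in the script above.

```shell
# Hypothetical helper: grid_counts STEP LONMIN LONMAX LATMIN LATMAX < data
grid_counts() {
    awk -v step="${1:-0.5}" \
        -v lonmin="${2:--180}" -v lonmax="${3:-180}" \
        -v latmin="${4:--90}"  -v latmax="${5:-90}" '
    function floor(x,   y) { y = int(x); return y > x ? y - 1 : y }
    {
        hist[floor($2 / step), floor($1 / step)] += $3
    }
    END {
        # Integer index limits follow from the ranges and the step.
        lat0 = floor(latmin / step); lat1 = floor(latmax / step)
        lon0 = floor(lonmin / step); lon1 = floor(lonmax / step)
        for (ilat = lat0; ilat <= lat1; ilat++) {
            for (ilon = lon0; ilon <= lon1; ilon++)
                printf(" %4d", hist[ilat, ilon])
            printf("\n")
        }
    }'
}
```

Called without arguments, it falls back to the 0.5 step and the full lon/lat range used above. One caveat: a step that is not exactly representable in binary floating point (0.1, for instance) can push values that sit exactly on a cell edge into the neighbouring cell.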

Note 2: The lack of useful predefined functions in awk shows here, with me needing to define floor myself, of all things. I wonder why the choice was made to exclude most of the C math functions.

Note 3: In case this isn't clear, the output of this will be a large matrix of hit counts for each cell, with one row for every 0.5 step in latitude, and one column for each such step in longitude.
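If the full matrix is more than you need, a sparse listing of only the non-empty cells may be easier to hand to a plotting tool. The following is a sketch under that assumption, not part of the script above. The name grid_cells is hypothetical, and each output line gives the lower longitude edge, the lower latitude edge, and the summed count of one cell.

```shell
# Hypothetical helper: grid_cells STEP < data
grid_cells() {
    awk -v step="${1:-0.5}" '
    function floor(x,   y) { y = int(x); return y > x ? y - 1 : y }
    {
        hist[floor($1 / step), floor($2 / step)] += $3
    }
    END {
        # for-in order is unspecified in awk, hence the sort afterwards.
        for (key in hist) {
            split(key, idx, SUBSEP)
            printf("%g %g %d\n", idx[1] * step, idx[2] * step, hist[key])
        }
    }' | sort -n
}
```

For the three sample records in the question this yields one line for the cell starting at (20, 50) with count 3 and one for the cell starting at (35.5, 24) with count 1.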
