-1

I've been struggling with some matrix calculation in sas. I have two tables, which both contain Longitude and Latitude information

Tasks table

enter image description here

Engineers table:

enter image description here

My output goal is the distance between tasks and engineers

Distances:

enter image description here

Due to the additional cost for SAS/IML, I don't have it on premises. Does someone have any experience how to do that?

P.S. The code in R is easy as follows

   distances <- matrix(nrow = NROW(tasks), ncol = NROW(cses))
   for (i in 1:NROW(tasks)){
     for(j in 1:NROW(cses)){
       distances[i,j] <- distm (c(tasks$CUST_LNG_X[i], tasks$CUST_LAT_Y[i]), c(cses$LNG_X[j], cses$LAT_Y[j]), fun = distHaversine)

     }
   }
ZYJ
  • 57
  • 6
  • Please do not post images in the future, include your data like the users below did as a data step. – Reeza Oct 19 '17 at 22:05

3 Answers3

1

Well there is the GEODIST function which does just that.

Assuming the tables called tasks and engineers with the structure you present in your post (for future reference, we rather have those as text instead of images), you can obtain the third table with the following:

proc sql;
create table engineers_2 as
select compress(id||'_y') as id
      ,geo_y as geo
from engineers
union
select compress(id||'_x') as id
      ,geo_x as geo
from engineers
order by id
;
quit;

proc transpose data=engineers_2 out=t_engineers(drop=_name_);
var geo;
id id;
run;

proc sql;
create table want as
select *
from tasks
    ,t_engineers
;
quit;

This will give you a table that will have the following columns

task_id geo_y geo_x E_1_x E_1_y E_2_x E_2_y E_3_x E_3_y

where geo_y and geo_x are the coordinates of the task task_id and E_n_x and E_n_y are the coordinates of engineer n.

It is then just a matter of processing the dataset line by line, feeding the variables in the geodist function.

user2877959
  • 1,792
  • 1
  • 9
  • 14
1

Maybe I'm missing something but a cross join should also work fine with the GEODIST function and then a transpose to make it wide. It doesn't matter how many points are in each file, though I suppose if it got big enough SQL may error out due to memory.

proc sql;
create table want as
select task_id as col1, id as col2, geodist(a.x, a.y, b.x, b.y) as distance
from tasks as a , engineers as b;
quit;

proc transpose data=want out=want_wide ;
by col1;
id col2;
var distance;
run;
Reeza
  • 20,510
  • 4
  • 21
  • 38
  • This. It was late when I answered and as soon as I hit the bed I realized it was much (much) simpler than that. – user2877959 Oct 20 '17 at 04:56
0

This is very similar to user2877959.

data tasks;
input Task_ID $ Y X;
datalines;
T_1 41.65 -74.08
T_2 32.48 -86.46
;
run;

%let n=2; /*This is the number of engineer points*/

data engineers;
input ID $ Y X;
datalines;
E_1 40.92 -81.31
E_2 39.98 -74.87
;
run;

I differ by putting creating the X_ and Y_ are suffixes. This will make things easier. I'm doing this in a data step and then transposing. Using SQL works too.

data engineers;
set engineers;
X_NAME = catt("X_",id);
Y_NAME = catt("Y_",id);
run;

proc transpose data=engineers(keep=x_name x) out=e_x;
id x_name;
run;

proc transpose data=engineers(keep=y_name y) out=e_y;
id y_name;
run;

/*This puts the engineer data into 1 record in a single table*/
data engineers;
merge e_y e_x;
drop _name_;
run;

proc sql noprint;
create table want as
select a.*,
       b.*
    from tasks a,
         engineers b;
quit;

The last step is to calculate the distances. We can use DATA STEP arrays to help with the processing.

data want;
set want;
array E_[&n];
array X_E_[&n];
array Y_E_[&n];

do i=1 to &n;
    E_[i] = geodist(x,y,x_e_[i],y_e_[i]);
end;

drop X: Y: i;
run;
DomPazz
  • 12,415
  • 17
  • 23