
I have the following data:

client_id <- c(1,2,3,1,2,3)
product_id <- c(10,10,10,20,20,20)
connected <- c(1,1,0,1,0,0)
clientID_productID <- paste0(client_id,";",product_id) 
df <- data.frame(client_id, product_id,connected,clientID_productID)

  client_id product_id connected clientID_productID
1         1         10         1               1;10
2         2         10         1               2;10
3         3         10         0               3;10
4         1         20         1               1;20
5         2         20         0               2;20
6         3         20         0               3;20

The goal is to produce a relational matrix:

  client_id product_id clientID_productID client_pro_1_10 client_pro_2_10 client_pro_3_10 client_pro_1_20 client_pro_2_20 client_pro_3_20
1         1         10               1;10               0               1               0               0               0               0
2         2         10               2;10               1               0               0               0               0               0
3         3         10               3;10               0               0               0               0               0               0
4         1         20               1;20               0               0               0               0               0               0
5         2         20               2;20               0               0               0               0               0               0
6         3         20               3;20               0               0               0               0               0               0

In other words, when product_id equals 10, clients 1 and 2 are connected. Importantly, I do not want client 1 to be connected with herself. When product_id=20, I have only one client, meaning that there is no connection, so I should have only zeros.

To be more specific, all that I am trying to create is a square matrix of relations, with all the combinations of client/product in the columns. A client can only be connected with another if they bought the same product.

I have searched extensively and experimented with code from other answers. Unlike the problems already answered, I want to keep client 3 in my table even though she never bought any product, to show that she has no relationship with any other client. Right now I can create the matrix by stacking the relationships by product (How to create relational matrix in R?), but I am struggling to find a way to build it without stacking them.

I apologize if the question is not specific enough, or too specific. Thank you anyway; Stack Overflow is a lifesaver for beginners.

Miranda
  • Right now you are just asking for us to rewrite a textbook/manual with a bespoke tutorial & do your (home)work & you have shown no research or other effort. Dumps of requirements are not on-topic questions. Please see [ask], hits googling 'stackexchange homework' & the voting arrow mouseover texts. Show what relevant parts you can do & explain re the first place you are stuck. Please in code questions give a [mre]. That includes clear specification & explanation & the least code you can give that is code that you show is OK extended by code that you show is not OK. (Debugging fundamental.) – philipxy Oct 20 '19 at 00:26

1 Answer


I believe I figured it out.

It is for sure not the most elegant answer, though.

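library(dplyr)     # inner_join()
library(reshape2)  # dcast()

client_id <- c(1,2,3,1,2,3)
product_id <- c(10,10,10,20,20,20)
connected <- c(1,1,0,1,0,0)
clientID_productID <- paste0(client_id,";",product_id)
df <- data.frame(client_id, product_id,connected,clientID_productID)

# Self-join: pair each row with every row that has the same product and the
# same connected flag (recent dplyr versions warn about a many-to-many join
# here, which is expected)
df2 <- inner_join(df[1:3], df[1:3], by = c("product_id", "connected"))

# Label both sides of each pair as "client|product"
df2$Source <- paste0(df2$client_id.x,"|",df2$product_id)
df2$Target <- paste0(df2$client_id.y,"|",df2$product_id)
df2 <- df2[order(df2$product_id),]

# Remember the desired row/column order (dcast sorts the labels alphabetically)
indices <- unique(as.character(df2$Source))

# Reshape to wide format: one row per Source, one column per Target
mtx <- as.matrix(dcast(df2, Source ~ Target, value.var="connected", fill=0))
rownames(mtx) <- mtx[,"Source"]
mtx <- mtx[,-1]
# A client is never connected with herself
diag(mtx) <- 0

# Restore the original ordering of rows and columns
mtx <- as.data.frame(mtx)
mtx <- mtx[indices, indices]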

I got the result I wanted:

     1|10 2|10 3|10 1|20 2|20 3|20
1|10    0    1    0    0    0    0
2|10    1    0    0    0    0    0
3|10    0    0    0    0    0    0
1|20    0    0    0    0    0    0
2|20    0    0    0    0    0    0
3|20    0    0    0    0    0    0
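
For comparison, here is a minimal base R sketch that builds the same matrix without dplyr or reshape2. It is only a sketch, and it assumes that connected == 1 means the client bought the product and that two clients are related when both bought the same product. The names labels, same_product, both_bought and rel are made up for this illustration.

```r
# Same toy data as in the question.
client_id  <- c(1, 2, 3, 1, 2, 3)
product_id <- c(10, 10, 10, 20, 20, 20)
connected  <- c(1, 1, 0, 1, 0, 0)

# Row/column labels in the original row order, e.g. "1|10".
labels <- paste0(client_id, "|", product_id)

# Pairwise tests over all row combinations:
# TRUE where both rows refer to the same product ...
same_product <- outer(product_id, product_id, "==")
# ... and TRUE where both rows have connected == 1 (assumption: 1 = bought).
both_bought <- outer(connected, connected, "*") == 1

# Combine the two conditions, convert TRUE/FALSE to 1/0,
# and clear the diagonal so no client is connected with herself.
rel <- (same_product & both_bought) * 1
diag(rel) <- 0
dimnames(rel) <- list(labels, labels)

rel
```

Unlike the reshape-based version, rel is a numeric matrix and already follows the row order of the input, so no reordering step is needed. Wrap it in as.data.frame() if a data frame is preferred.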
Miranda