I am trying to clean up data for social network analysis, and as a newcomer to coding, I'm having trouble writing a complex conditional.
First, we have the dataframe bookinfo, where the columns of interest are date, receiver, and bookId:
>head(bookinfo)
date receiver bookId readingStatus
1 2017-04-21 03cff9d7-5712-410c-a4bf-f04ceede644b asin:0062228013 ALREADY_READ
2 2017-04-18 03cff9d7-5712-410c-a4bf-f04ceede644b asin:1442449616 ALREADY_READ
3 2017-04-24 03cff9d7-5712-410c-a4bf-f04ceede644b asin:0545851904 ALREADY_READ
4 2017-04-18 03cff9d7-5712-410c-a4bf-f04ceede644b asin:0545384176 ALREADY_READ
5 2017-06-02 03cff9d7-5712-410c-a4bf-f04ceede644b asin:0763643491 ALREADY_READ
6 2017-04-24 03cff9d7-5712-410c-a4bf-f04ceede644b asin:0545851890 ALREADY_READ
Then, we have the dataframe rec, where the columns of interest are date, sender, receiver, and bookId:
>head(rec)
date sender receiver messageType bookId
1 4/21/17 7a28156e-950e-47b7-a4aa-241fa9cfcf1a f8b027a3-89eb-475a-83e0-eb94e24eaab4 RECOMMENDS_A_BOOK asin:0986444138
2 4/21/17 fb4eefd3-03e9-40c3-bc9e-af85ea88d827 f8b027a3-89eb-475a-83e0-eb94e24eaab4 RECOMMENDS_A_BOOK asin:1434297314
3 4/21/17 dc319e95-0e3e-461e-b02c-abab4414c741 f8b027a3-89eb-475a-83e0-eb94e24eaab4 RECOMMENDS_A_BOOK asin:1484746694
4 4/18/17 118c57b6-e946-453f-88b2-6ae1282e62ab f8b027a3-89eb-475a-83e0-eb94e24eaab4 RECOMMENDS_A_BOOK asin:1514241587
5 4/21/17 dd0de21d-889d-4bf1-9ebb-af50b6660815 f8b027a3-89eb-475a-83e0-eb94e24eaab4 RECOMMENDS_A_BOOK asin:0986444138
6 4/21/17 f85d06ea-d534-42de-a714-6dc6358d1e29 f8b027a3-89eb-475a-83e0-eb94e24eaab4 RECOMMENDS_A_BOOK asin:1484746694
In the dataframe rec, I want to create a new column, Tie. The conditional would be as follows:
Tie = 1 if
- In rec: sender, receiver, and bookId are in the same row, AND
- In bookinfo: that same receiver and that same bookId are in the same row, AND the date there is later than the date of the referenced row in rec.
- Note that the row order of rec and bookinfo is not necessarily consistent. Whereas sender+receiver+bookId may be row 3 in rec, receiver+bookId may be row 10 in bookinfo.
Otherwise, Tie = 0.
The intuition is that if the Receiver shows activity with the book AFTER the date of receiving a recommendation of that book from the Sender, then they have a tie. (If they show activity before that date, it is unrelated to the Sender.)
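To make the rule concrete, here is a minimal sketch in base R of what I think the logic amounts to. The small data frames below are made-up stand-ins for bookinfo and rec (the ids and dates are invented), and the sketch assumes the id columns are character rather than factor, and that the two date columns use the formats shown in the head() output above.

```r
# Toy stand-ins for the two data frames (all values are invented).
bookinfo <- data.frame(
  date     = c("2017-04-25", "2017-04-10", "2017-05-01"),
  receiver = c("r1", "r1", "r2"),
  bookId   = c("asin:A", "asin:B", "asin:A"),
  stringsAsFactors = FALSE
)
rec <- data.frame(
  date     = c("4/21/17", "4/21/17", "4/18/17", "4/21/17"),
  sender   = c("s1", "s2", "s3", "s4"),
  receiver = c("r1", "r1", "r2", "r3"),
  bookId   = c("asin:A", "asin:B", "asin:A", "asin:A"),
  stringsAsFactors = FALSE
)

# The two frames store dates in different formats, so parse each one
# explicitly before comparing them.
bookinfo$date <- as.Date(bookinfo$date, format = "%Y-%m-%d")
rec$date      <- as.Date(rec$date, format = "%m/%d/%y")

# For each recommendation, look for any bookinfo row with the same receiver
# and bookId whose date is strictly later than the recommendation date.
# Row order in the two frames does not matter here.
rec$Tie <- vapply(seq_len(nrow(rec)), function(i) {
  later <- bookinfo$receiver == rec$receiver[i] &
           bookinfo$bookId   == rec$bookId[i] &
           bookinfo$date     >  rec$date[i]
  as.integer(any(later, na.rm = TRUE))
}, integer(1))

print(rec[, c("sender", "receiver", "bookId", "Tie")])
```

With this toy data, the first and third recommendations get Tie = 1 (the receiver's activity comes after the recommendation), the second gets 0 (activity came before), and the fourth gets 0 (no matching row in bookinfo). The row-by-row loop is easy to read but may be slow on large data; a merge() on receiver and bookId followed by a date comparison is one possible alternative.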
Thanks in advance for any help and for your time!