I have a dataframe of about 2000 rows and 3 columns. In essence, I want to reshape this dataframe to be wider than longer. This is an example of my current data:
ID | Procedure | Date |
---|---|---|
D55 | Sedation | 01/01/2001 |
D55 | Excision | 01/01/2001 |
D55 | Biopsy | 01/01/2001 |
A66 | Sedation | 02/02/2001 |
A66 | Excision | 02/02/2001 |
T44 | Sedation | 03/03/2001 |
T44 | Biopsy | 03/03/2001 |
T44 | Sedation | 04/04/2001 |
T44 | Excision | 04/04/2001 |
G88 | Sedation | 05/05/2001 |
G88 | Biopsy | 05/05/2001 |
G88 | Sedation | 06/06/2001 |
G88 | Excision | 06/06/2001 |
G88 | Sedation | 07/07/2001 |
G88 | Re-excision | 07/07/2001 |
I want the each row to be one line for the ID, so I'd want to create something like this:
ID | Date 1 | Procedure(s) | Date 2 | Procedure(s) | Date 3 | Procedure(s) |
---|---|---|---|---|---|---|
D55 | 01/01/2001 | Sedation, Excision, Biopsy | ||||
A66 | 02/02/2001 | Sedation, Excision | ||||
T44 | 03/03/2001 | Sedation, Biopsy | 04/04/2001 | Sedation, Excision | ||
G88 | 05/05/2001 | Sedation, Biopsy | 06/06/2001 | Sedation, Excision | 07/07/2001 | Sedation, Re-excision |
The majority of IDs all have the same date, but different procedures documented. There are a handful that came in for further procedures on subsequent dates. I can't see any that came in for more than 3 different dates, but a way to count the dates documented per ID would be useful.
I've tried using cast and dcast so far, but I'm not really getting anywhere. I'm very new to R, so any help would be greatly appreciated! Thanks for reading.