I am trying to spread a data frame, but I am not quite familiar with spread() and gather().
Below is a sample of my data. It has 9 rows, all with the same Application.Number. I would like to end up with one row per Application.Number-Decision combination. The remaining variables (date_generated, date_decided, time_to_decision and text) either have to be repeated for each Application.Number-Decision combination, or the last value should be taken. The data is already sorted by Application.Number and date_generated.
structure(list(Application.Number = c(80749L, 80749L, 80749L,
80749L, 80749L, 80749L, 80749L, 80749L, 80749L), Decision = c("Invalid",
"Invalid", "Invalid", "Invalid", "Invalid", "Invalid", "Approved",
"Approved", "Approved"), date_generated = structure(c(1521810060,
1521810060, 1523523840, 1523536500, 1524036720, 1524136380, 1524137460,
1524137460, 1524137460), class = c("POSIXct", "POSIXt"), tzone = ""),
date_decided = structure(c(1522155960, 1522155660, 1523534400,
1523600520, 1524127140, 1524136740, 1524211800, 1524211740,
1524211200), class = c("POSIXct", "POSIXt"), tzone = ""),
time_to_decision = c(4.00347222222222, 4, 0.122222222222222,
0.740972222222222, 1.04652777777778, 0.00416666666666667,
0.860416666666667, 0.859722222222222, 0.853472222222222),
text = c("rIUQRmOkyZ", "ZxdYUr16NR", "8IIipoleOV", "nLuIgToxcT",
"xYFksrws87", "N2oECMtgQo", "RKcrBcBFI2", "jaH438byVt", "80ggA2hZr7"
)), row.names = 15880:15888, class = "data.frame")
EDIT: I have decided that the output should be just one row, with all rows pivoted around Application.Number.
I ended up making a separate data frame with the duplicates and joining it back to the unique rows.
There must be a better way to do it.