I have a data.frame that has a date index which is replicated in the first column and has a number of different columns and corresponding rows containing data. Because the rows containing data (which are indexed to the date) can vary depending on if corresponding data had been collected sometimes there can be a row (date) with many blanks and only values for several columns. I'd like to collapse these rows into an month and year.
I.e. A row can show data for yesterday but not today for all columns because it may not have been collected yet. Just looks aesthetically messy and would rather just say "June-2020" and collapse them to remove the NA's.
Here is the data.frame's dput:
structure(list(Date = structure(c(18292, 18320, 18321, 18351,
18352, 18382, 18413, 18427, 18428), tzone = "UTC", tclass = "Date", class = "Date"),
`M-o-M Change in Median Rent - AMH GA` = c(0, NA, 0, NA,
0, 0, 0, NA, 0), `Median Advertised Rent - AMH GA` = c(1695,
NA, 1695, NA, 1695, 1695, 1695, NA, 1695), `Median of Daily Rental Listings Available - AMH GA` = c(438,
NA, 430.5, NA, 450, 458, 385, NA, 331), `M-o-M Change in Median Rent - AMH Charlotte` = c(0,
NA, 0, NA, 0, 0.0272727272727273, 0, NA, 0.0324483775811208
), `Median Advertised Rent - AMH Charlotte` = c(1650, NA,
1650, NA, 1650, 1695, 1695, NA, 1750), `Median of Daily Rental Listings Available - AMH Charlotte` = c(244,
NA, 257, NA, 256, 270, 227, NA, 220), `M-o-M Change in Median Rent - AMH Dallas` = c(0,
NA, 0, NA, 0, 0, 0.0306406685236769, NA, 0), `Median Advertised Rent - AMH Dallas` = c(1795,
NA, 1795, NA, 1795, 1795, 1850, NA, 1850), `Median of Daily Rental Listings Available - AMH Dallas` = c(148,
NA, 150, NA, 166, 152.5, 131, NA, 135), `M-o-M Change in Median Rent - AMH Houston` = c(0,
NA, 0.0272727272727273, NA, 0, 0, 0, NA, 0), `Median Advertised Rent - AMH Houston` = c(1650,
NA, 1695, NA, 1695, 1695, 1695, NA, 1695), `Median of Daily Rental Listings Available - AMH Houston` = c(222,
NA, 223, NA, 228, 237.5, 203, NA, 189), `M-o-M Change in Median Rent - AMH Jacksonville` = c(0,
0.00681818181818183, NA, NA, -0.00677200902934538, 0, 0.0272727272727273,
NA, 0), `Median Advertised Rent - AMH Jacksonville` = c(1650,
1661.25, NA, NA, 1650, 1650, 1695, NA, 1695), `Median of Daily Rental Listings Available - AMH Jacksonville` = c(164,
179.5, NA, NA, 188, 195.5, 185, NA, 174), `M-o-M Change in Median Rent - AMH NC` = c(0,
NA, 0.0344827586206897, NA, 0, 0, 0.0272727272727273, NA,
0), `Median Advertised Rent - AMH NC` = c(1595, NA, 1650,
NA, 1650, 1650, 1695, NA, 1695), `Median of Daily Rental Listings Available - AMH NC` = c(365,
NA, 387, NA, 405, 447, 344, NA, 323), `M-o-M Change in Median Rent - AMH NV` = c(0,
NA, 0, 0, NA, -0.0265486725663717, 0, NA, 0.0272727272727273
), `Median Advertised Rent - AMH NV` = c(1695, NA, 1695,
1695, NA, 1650, 1650, NA, 1695), `Median of Daily Rental Listings Available - AMH NV` = c(62,
NA, 59, 70, NA, 71, 63, NA, 58), `M-o-M Change in Median Rent - AMH Orlando` = c(0,
0.0112676056338028, NA, NA, 0, 0, 0, NA, 0.0306406685236769
), `Median Advertised Rent - AMH Orlando` = c(1775, 1795,
NA, NA, 1795, 1795, 1795, NA, 1850), `Median of Daily Rental Listings Available - AMH Orlando` = c(82,
91.5, NA, NA, 106, 119, 117, NA, 105), `M-o-M Change in Median Rent - AMH Phoenix` = c(0,
NA, 0.0216172938350681, NA, 0.0258620689655173, -0.0252100840336135,
0.0344827586206897, 0, NA), `Median Advertised Rent - AMH Phoenix` = c(1561.25,
NA, 1595, NA, 1636.25, 1595, 1650, 1650, NA), `Median of Daily Rental Listings Available - AMH Phoenix` = c(130,
NA, 127, NA, 129, 131, 97, 85, NA), `M-o-M Change in Median Rent - AMH Raleigh` = c(0,
NA, 0.0290322580645161, NA, 0, 0, 0, NA, 0.0344827586206897
), `Median Advertised Rent - AMH Raleigh` = c(1550, NA, 1595,
NA, 1595, 1595, 1595, NA, 1650), `Median of Daily Rental Listings Available - AMH Raleigh` = c(91,
NA, 104, NA, 114, 142, 90, NA, 81), `M-o-M Change in Median Rent - AMH SoFla` = c(0,
-0.00869565217391299, NA, NA, 0.0233918128654971, -0.0314285714285715,
0.0162241887905605, NA, 0.0159651669085632), `Median Advertised Rent - AMH SoFla` = c(1725,
1710, NA, NA, 1750, 1695, 1722.5, NA, 1750), `Median of Daily Rental Listings Available - AMH SoFla` = c(11,
14, NA, NA, 11, 10, 7, NA, 7), `M-o-M Change in Median Rent - AMH Winston-Salem/Greensboro` = c(0,
NA, 0.0290322580645161, NA, 0, 0, 0, NA, 0), `Median Advertised Rent - AMH Winston-Salem/Greensboro` = c(1550,
NA, 1595, NA, 1595, 1595, 1595, NA, 1595), `Median of Daily Rental Listings Available - AMH Winston-Salem/Greensboro` = c(66,
NA, 68, NA, 72, 73, 71, NA, 63)), class = "data.frame", row.names = c("2020-01-31",
"2020-02-28", "2020-02-29", "2020-03-30", "2020-03-31", "2020-04-30",
"2020-05-31", "2020-06-14", "2020-06-15"))
Here are examples of two approaches I've tried:
test <- AMH_final_Monthly3 %>% mutate(month= month(Date), year=year(Date))
test2 <- AMH_final_Monthly3 %>%
collapse_by("monthly") %>%
dplyr::group_by(Date, add = TRUE)
test3 <- as.yearmon(AMH_final_Monthly3)
Help is appreciated!