Aggregate, dcast and create new columns in R

Question

I have a data frame for every second. From the data frame with 1 second interval, I managed to aggregate the data into 1-minute interval using the following code:

agg_cont <- df %>% group_by(Date, Hour, Minute, Status, Mean) %>% count(name = 'Occurrence')

Now I have a data frame as shown below,

Date	Minute	Status	Mean	Occurrence
12/01/2022	00	a	20	60
12/02/2022	01	b	32	60
12/01/2022	02	a	21	60
12/02/2022	03	a	12	60
12/01/2022	04	a	23	20
12/01/2022	04	b	43	40
12/01/2022	05	a	33	60

Please note that the column 'Occurrence' denotes the number of seconds the status occurred in the specific minute. For the minute '04', if the occurrence for status 'a' and 'b' is 20 and 40, respectively, then status 'a' occurred for 20 sec in the particular minute.

With the above dataframe, I would like to have only one row for each minute and create new columns for each 'status' and the mean value that occurred in that particular minute.

Desired output:

Date	Minute	a	b	Mean 'a'	Mean'b'
12/01/2022	00	60	0	20	NA
12/02/2022	01	0	60	32	NA
12/01/2022	02	60	0	21	NA
12/01/2022	03	0	60	12	NA
12/01/2022	04	20	40	23	43
12/02/2022	05	60	0	33	NA

I am trying to use the dcast function to get the desired output.

Thanks

Why are the values in two different days 12/01/2022 and 12/02/2022 but same minute 04 in the same output row? — Rui Barradas, Mar 27 '23 at 20:04

Rui Barradas · Accepted Answer · 2023-03-27T20:09:30.430

Here is a tidyverse way.

agg_count <- read.table(text = "
Date    Hour    Minute  Status  Mean    Occurrence
12/01/2022  00  00  a   20  60
12/02/2022  00  01  b   32  60
12/01/2022  00  02  a   21  60
12/02/2022  00  03  a   12  60
12/01/2022  00  04  a   23  20
12/02/2022  00  04  b   43  40
12/02/2022  00  05  a   33  60
", header = TRUE)

suppressPackageStartupMessages({
  library(dplyr)
  library(tidyr)
})

agg_count %>%
  pivot_wider(
    id_cols = c(Date, Hour, Minute),
    names_from = Status,
    values_from = c(Occurrence, Mean),
    values_fill = 0L
  )
#> # A tibble: 7 × 7
#>   Date        Hour Minute Occurrence_a Occurrence_b Mean_a Mean_b
#>   <chr>      <int>  <int>        <int>        <int>  <int>  <int>
#> 1 12/01/2022     0      0           60            0     20      0
#> 2 12/02/2022     0      1            0           60      0     32
#> 3 12/01/2022     0      2           60            0     21      0
#> 4 12/02/2022     0      3           60            0     12      0
#> 5 12/01/2022     0      4           20            0     23      0
#> 6 12/02/2022     0      4            0           40      0     43
#> 7 12/02/2022     0      5           60            0     33      0

^{Created on 2023-03-27 with reprex v2.0.2}

As I asked in comment to the question, if the date is not to be considered, then the code below will put input rows with equal minutes in the same output rows.

agg_count %>%
  pivot_wider(
    id_cols = c(Hour, Minute),
    names_from = Status,
    values_from = c(Occurrence, Mean),
    values_fill = 0L
  )
#> # A tibble: 6 × 6
#>    Hour Minute Occurrence_a Occurrence_b Mean_a Mean_b
#>   <int>  <int>        <int>        <int>  <int>  <int>
#> 1     0      0           60            0     20      0
#> 2     0      1            0           60      0     32
#> 3     0      2           60            0     21      0
#> 4     0      3           60            0     12      0
#> 5     0      4           20           40     23     43
#> 6     0      5           60            0     33      0

^{Created on 2023-03-27 with reprex v2.0.2}

Many thanks for your quick help. I am getting the following error "Error in `pivot_wider_spec()`: ! Can't convert `fill` to . " When I tried your example, it works perfectly. But upon applying the code to my dataset, I get the above error — Karthik, Mar 27 '23 at 20:20
I added "values_fn=mean" to the code you suggested and got the results. Many thanks for your help. — Karthik, Mar 27 '23 at 20:54

Karthik · Answer 2 · 2023-03-28T14:41:28.670

0

Updated answer:

 agg_count %>%
   pivot_wider(
     id_cols = c(Date, Hour, Minute),
     names_from = Status,
     values_from = c(Occurrence, Mean),
     values_fn = sum,
     values_fill = 0L  
   )

edited Mar 28 '23 at 14:41

answered Mar 27 '23 at 21:01

Karthik

117
7

Aggregate, dcast and create new columns in R

2 Answers2