How to add certain number of rows based on values on the rows of another variable

Question

date        time    td  number
20150102    80000   -1  0
20150102    80001   -1  2
20150102    80002   1   0
20150102    80003   1   3
20150102    80004   -1  0

I need to append a number of rows based on the variable "number". The date and time of each new row should be the same as in the row it was generated from, with td = 0. I want the data to look like this:

date        time    td  number
20150102    80000   -1  0
20150102    80001   -1  2
20150102    80002   1   0
20150102    80003   1   3
20150102    80004   -1  0
20150102    80001   0   NA 
20150102    80001   0   NA
20150102    80003   0   NA
20150102    80003   0   NA
20150102    80003   0   NA

I need to run a loop through this, because I have over 20,000 observations – vera_kkk, Feb 22 '18 at 21:32
I don't understand your expected outcome. Can you clarify the rules? Why does `time` change for the first 5 rows? Why the values in the new rows? – Maurits Evers, Feb 22 '18 at 21:36
Sorry! The original time should increase by 1; please see my edited version – vera_kkk, Feb 22 '18 at 21:40

score 2 · Accepted Answer · answered Feb 22 '18 at 21:48

I'd generate each column, bind them into a data frame, then bind that to the original data frame. No looping required.

Assuming your data frame is called df

#Create the date and time using the number column directly.
date <- rep(df$date, times = df$number)
time <- rep(df$time, times = df$number)

#Combine these fields into a data frame and set td to all 0s and number to all NAs
appenddf <- data.frame(date = date, time = time, td = 0, number = NA)

#Bind the data for appending to the original data frame
df <- rbind(df, appenddf)
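
As a quick check, the steps above can be run against the sample from the question. This is only a sketch: the construction of df below is an assumption based on the posted table (in particular the column types), and result is a name introduced here for illustration.

```r
# Sample data copied from the question (column types are assumed)
df <- data.frame(
  date   = rep(20150102, 5),
  time   = 80000:80004,
  td     = c(-1, -1, 1, 1, -1),
  number = c(0, 2, 0, 3, 0)
)

# With a vector `times`, rep() repeats each element that many times,
# so rows with number == 0 contribute nothing
date <- rep(df$date, times = df$number)
time <- rep(df$time, times = df$number)

# td = 0 and number = NA are recycled to the length of date/time
appenddf <- data.frame(date = date, time = time, td = 0, number = NA)

# Kept in a separate object here so the original df stays untouched
result <- rbind(df, appenddf)
print(result)
```

Because the whole operation is vectorised, it should scale to the 20,000+ rows mentioned in the question without an explicit loop.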

LachlanO


Good answer! I was thinking on same line but you were fast. +1 – MKR Feb 22 '18 at 22:19
Thanks! I suspect there might be a stronger answer which doesn't rely on the creation of a separate data frame, but I don't know it off the top of my head :) – LachlanO Feb 22 '18 at 22:20

score 0 · Answer 2 · David Foster · 2018-02-22T22:14:50.903

This loop will do it using rbind.fill (from plyr), for dataframe df:

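library(plyr)  # provides rbind.fill

# Loop over every original row (not just the last one)
for (i in seq_len(nrow(df))) {
  x <- df$number[i]
  while (x > 0) {
    # Append date and time only; rbind.fill pads td and number with NA
    df <- rbind.fill(df, df[i, 1:2])
    x <- x - 1
  }
}
# Switch NAs in the df$td column to 0
df$td[is.na(df$td)] <- 0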

edited Feb 22 '18 at 22:14

answered Feb 22 '18 at 21:47


score 0 · Answer 3 · answered Feb 22 '18 at 22:07

Another option uses `expandRows` (from splitstackshape) and `separate` (from tidyr). `expandRows` replicates each row according to `number`; the replicated rows are collapsed into a single combined string per row, which is then separated back out into columns and appended to the original df.

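library(splitstackshape)  # expandRows
library(data.table)       # setDT and :=
library(dplyr)            # %>%
library(tidyr)            # separate

# Replicate each row `number` times, build a "date-time-0-NA" string per row,
# then split it back into four columns (convert = TRUE restores the types)
df1 <- setDT(expandRows(df, "number"))[, newsamp := 
sprintf("%d-%d-%d-%d", date, time, 0, NA)][,newsamp] %>% as.data.frame() %>% 
  separate(1, c("date", "time", "td", "number"), convert = TRUE)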

rbind(df, df1)

#Result
#       date  time td number
#1  20150102 80000 -1      0
#2  20150102 80001 -1      2
#3  20150102 80002  1      0
#4  20150102 80003  1      3
#5  20150102 80004 -1      0
#6  20150102 80001  0     NA
#7  20150102 80001  0     NA
#8  20150102 80003  0     NA
#9  20150102 80003  0     NA
#10 20150102 80003  0     NA

score 0 · Answer 4 · answered Feb 23 '18 at 08:47

> a=rep(1:nrow(dat),dat$number+1)
> transform(dat[c(a[!duplicated(a)],a[duplicated(a)]),-4],num=`length<-`(dat$number,length(a)))
        date  time td num
1   20150102 80000 -1   0
2   20150102 80001 -1   2
3   20150102 80002  1   0
4   20150102 80003  1   3
5   20150102 80004 -1   0
2.1 20150102 80001 -1  NA
2.2 20150102 80001 -1  NA
4.1 20150102 80003  1  NA
4.2 20150102 80003  1  NA
4.3 20150102 80003  1  NA
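
The one-liner above is dense, and the output it prints keeps the original td (-1 or 1) in the appended rows, whereas the question asks for td = 0. Below is a more spelled-out sketch of the same row-index idea that also sets td to 0. It is not the answerer's code: the names idx, extra and out are introduced here for illustration, and the construction of dat is an assumption based on the posted table.

```r
# Sample data copied from the question (named dat, as in the answer above)
dat <- data.frame(
  date   = rep(20150102, 5),
  time   = 80000:80004,
  td     = c(-1, -1, 1, 1, -1),
  number = c(0, 2, 0, 3, 0)
)

# Row i appears number[i] times in the index; rows with number == 0 drop out
idx <- rep(seq_len(nrow(dat)), times = dat$number)

# Copy those rows, then overwrite the two columns the question wants changed
extra <- dat[idx, ]
extra$td <- 0
extra$number <- NA

out <- rbind(dat, extra)
rownames(out) <- NULL  # replace row names such as 2.1 and 4.1 with 1..10
print(out)
```

Indexing by row number copies every column at once, so this variant would carry over any additional columns without having to list them individually.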
