5

I have searched the internet, but I haven't been able to find a solution to my problem. I have a data frame of numbers and characters:

mydf <- data.frame(col1=c(1, 2, 3, 4), 
                   col2 = c(5, 6, 7, 8), 
                   col3 = c("a", "b", "c", "d"), stringsAsFactors  = FALSE)

mydf:

col1 col2 col3
  1    5   a
  2    6   b
  3    7   c
  4    8   d

I would like to repeat each row three times, giving:

col1 col2 col3
  1   5    a
  1   5    a
  1   5    a
  2   6    b
  2   6    b
  2   6    b
  3   7    c
  3   7    c
  3   7    c
  4   8    d
  4   8    d
  4   8    d

Using apply(mydf, 2, function(x) rep(x, each = 3)) gives the right repetition, but it does not preserve the classes of col1, col2, and col3 (numeric, numeric, and character, respectively), which I would like to keep. This is a constructed example, and setting the class of each column by hand in my real data frame is a bit tedious.
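To make the class loss concrete, here is a minimal check, assuming the mydf defined above. The variable name out is only illustrative. The sketch relies on the documented behaviour that apply() converts a data frame to a matrix before looping, and a matrix can hold only one type.

```r
mydf <- data.frame(col1 = c(1, 2, 3, 4),
                   col2 = c(5, 6, 7, 8),
                   col3 = c("a", "b", "c", "d"),
                   stringsAsFactors = FALSE)

# apply() coerces the data frame with as.matrix() first; because col3 is
# character, every cell of the resulting matrix becomes a character string.
out <- apply(mydf, 2, function(x) rep(x, each = 3))

is.matrix(out)       # the result is a matrix, not a data frame
typeof(out)          # "character"
sapply(mydf, class)  # the original per-column classes, for comparison
```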

Is there a way to repeat the rows while preserving the classes?

Sisse

6 Answers

11

It's even easier than you think.

index <- rep(seq_len(nrow(mydf)), each = 3)
mydf[index, ]

This also avoids the implicit looping from apply.
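As a quick sanity check of the row-index approach, the sketch below rebuilds mydf from the question and confirms that the column classes survive. The name res and the row-name reset are illustrative additions, not part of the original answer.

```r
mydf <- data.frame(col1 = c(1, 2, 3, 4),
                   col2 = c(5, 6, 7, 8),
                   col3 = c("a", "b", "c", "d"),
                   stringsAsFactors = FALSE)

# Repeat each row position three times: 1 1 1 2 2 2 3 3 3 4 4 4
index <- rep(seq_len(nrow(mydf)), each = 3)

# Row indexing keeps each column as its own vector, so classes are unchanged
res <- mydf[index, ]
sapply(res, class)

# Repeated rows get row names such as "1", "1.1", "1.2";
# resetting them is optional
rownames(res) <- NULL
```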

Richie Cotton
4

This is an unfortunate and unexpected class conversion (to me, anyway). Here's an easy workaround that uses the fact that a data.frame is just a special list.

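# stringsAsFactors = FALSE keeps col3 as character in R versions before 4.0
data.frame(lapply(mydf, function(x) rep(x, each = 3)), stringsAsFactors = FALSE)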

(Does anyone know why the behaviour the questioner observed shouldn't be reported as a bug?)

John
2

Just another solution:

mydf3 <- do.call(rbind, rep(list(mydf), 3))
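Note that stacking whole copies keeps the classes but leaves the rows in 1, 2, 3, 4, 1, 2, 3, 4, ... order rather than grouped as in the question. One possible way to regroup them, sketched here with the illustrative names stacked, pos, and grouped, is to sort by the original row position:

```r
mydf <- data.frame(col1 = c(1, 2, 3, 4),
                   col2 = c(5, 6, 7, 8),
                   col3 = c("a", "b", "c", "d"),
                   stringsAsFactors = FALSE)

# Three full copies stacked on top of each other
stacked <- do.call(rbind, rep(list(mydf), 3))

# Original row position of every stacked row: 1 2 3 4 1 2 3 4 1 2 3 4
pos <- rep(seq_len(nrow(mydf)), times = 3)

# Sorting by position groups the copies of each row together
grouped <- stacked[order(pos), ]
rownames(grouped) <- NULL
```

Sorting by position rather than by column values also works when the original rows are not already in sorted order.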
Wojciech Sobala
1

Take a look at aggregate and disaggregate in the raster package. Or, use my modified version zexpand below:

# zexpand: analogous to disaggregate

zexpand <- function(inarray, fact = 2, interp = FALSE, ...) {
  # do same analysis of fact to allow one or two values, fact >= 1 required, etc.
  fact <- as.integer(round(fact))
  switch(as.character(length(fact)),
         '1' = xfact <- yfact <- fact,
         '2' = {xfact <- fact[1]; yfact <- fact[2]},
         {xfact <- fact[1]; yfact <- fact[2]; warning(' fact is too long. First two values used.')})
  if (xfact < 1) { stop('fact[1] must be > 0') }
  if (yfact < 1) { stop('fact[2] must be > 0') }

  # does column expansion
  bigtmp <- matrix(rep(t(inarray), each = xfact), nrow(inarray), ncol(inarray) * xfact, byrow = TRUE)
  # does row expansion
  bigx <- t(matrix(rep(bigtmp, each = yfact), ncol(bigtmp), nrow(bigtmp) * yfact, byrow = TRUE))
  # the interpolation would go here. Or use interp.loess on output (won't
  # handle complex data). Also, look at fields::Tps which probably does
  # a much better job anyway.  Just do separately on Re and Im data
  return(invisible(bigx))
}
Carl Witthoft
1

I really like Richie Cotton's answer.

But you could also simply use rbind and reorder it.

res <- rbind(mydf, mydf, mydf)
res[order(res[, 1], res[, 2], res[, 3]), ]
Pierre Lapointe
0

The package mefa comes with a nice wrapper for rep applied to data.frame. This will match your example in one line:

mefa:::rep.data.frame(mydf, each=3)
dardisco