3

Say I have a data frame where one column is some repeating value (dates, IDs, etc). Is there a way to convert a data frame into a now data frame with columns instead of replicating rows? Basically I want to transpose something like this:

col1 col2 col3
1    aa   30
2    aa   40
3    aa   10
1    bb   20
2    bb   12
3    bb   15
1    cc   40
2    cc   31
3    cc   12

Into this:

    aa   bb   cc
1   30   20   40
2   40   12   31
3   10   15   12

Here is some code that makes a sample of the first data frame:

a <- c(rep(1:10, 3))
b <- c(rep("aa", 10), rep("bb", 10), rep("cc", 10))
set.seed(123)
c <- sample(seq(from = 20, to = 50, by = 5), size = 30, replace = TRUE)
d <- data.frame(a,b, c)

I am unsure how to transpose it.

A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485
Jeffrey Kramer
  • 1,345
  • 6
  • 25
  • 43

5 Answers5

6
a <- c(rep(1:10, 3))
b <- c(rep("aa", 10), rep("bb", 10), rep("cc", 10))
set.seed(123)
c <- sample(seq(from = 20, to = 50, by = 5), size = 30, replace = TRUE)
d <- data.frame(a,b, c)
#how to transpose it#
e<-reshape(d,idvar='a',timevar='b',direction='wide')
e
Jason
  • 1,559
  • 1
  • 9
  • 14
6

This is also a case in which you can use unstack:

unstack(d, c ~ b)
#    aa bb cc
# 1  30 50 50
# 2  45 35 40
# 3  30 40 40
# 4  50 40 50
# 5  50 20 40
# 6  20 50 40
# 7  35 25 35
# 8  50 20 40
# 9  35 30 30
# 10 35 50 25
A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485
5

Using your data frame d,

library(tidyr)
> spread(d, key = b, value = c)
    a aa bb cc
1   1 30 50 50
2   2 45 35 40
3   3 30 40 40
4   4 50 40 50
5   5 50 20 40
6   6 20 50 40
7   7 35 25 35
8   8 50 20 40
9   9 35 30 30
10 10 35 50 25

Explanation, the argument key = b lets you specify a column in your data frame. spread will create a new column for each unique entry in the key column b. The argument value = c tells spread to retrieve the value in column c and write it in the corresponding new key column.

Alex
  • 15,186
  • 15
  • 73
  • 127
4

If there are always equal numbers of observations in each group, this would be very easy with split then as.data.frame

as.data.frame(split(d$c, d$b))
#    aa bb cc
# 1  30 50 50
# 2  45 35 40
# 3  30 40 40
# 4  50 40 50
# 5  50 20 40
# 6  20 50 40
# 7  35 25 35
# 8  50 20 40
# 9  35 30 30
# 10 35 50 25
Rich Scriven
  • 97,041
  • 11
  • 181
  • 245
0

With split and cbind:

> ll = lapply(split(d, d$b), function(x) x[3])
> dd = do.call(cbind, ll)
> names(dd) = names(ll)
> dd
   aa bb cc
1  30 50 50
2  45 35 40
3  30 40 40
4  50 40 50
5  50 20 40
6  20 50 40
7  35 25 35
8  50 20 40
9  35 30 30
10 35 50 25
rnso
  • 23,686
  • 25
  • 112
  • 234