-1

It is often said that 80% of data analysis is spent on the cleaning and preparing data.

In this aspect I sometimes spent quite some time to shape my data and finally get a data.frame as shown in reproducible example.

I want to organize rows as columns in my data and order them one of the unique value .

you can assume No is a key column, that will determine the number of columns in the transposed data frame

Here is a reproducible example;

set.seed(1)
No <- rep(seq(0,95,1),times=3)
AC <- rep(rep(c(78,110),each=1),times=length(No)/2)
AR <- rep(rep(c(256,320,384),each=2),times=length(No)/6)
AM <- rep(1,times=length(No))
DQ <- rep(rep(seq(0,15,1),each=6),times=3)
V <- rep(seq(100,300,100),each=96)
R <- rep(sample(5000:6000,96),times=3)

df <- data.frame(No,AC,AR,AM,DQ,V,R)
head(df)
  No  AC  AR AM DQ   V    R
1  0  78 256  1  0 100 5455
2  1 110 256  1  0 100 5410
3  2  78 320  1  0 100 5810
4  3 110 320  1  0 100 5603
5  4  78 384  1  0 100 5652
6  5 110 384  1  0 100 5351

The output which I`m looking for;

#  No        0     1        2      3     4      5     6     7    8     9      10     11      *    *   *  95  
#  AC       78     110     78     110    78    110   78    110  78    110     78    110      *    *   * 110
#  AR      256     256     320    320    384   384  256    256  320   320    384    384      *    *   * 384
#  AM        1      1       1      1      1     1    1      1    1     1       1      1      *    *   *   1
#  DQ        0      0       0      0      0     0    1      1    1     1       1      1      *    *   *  15 
#   V       
#  100     5265   5372    5572    5906   5201  5894  5939  5656  5624 5061  5204    5174     *    *   * 5722
#  200     5256   5720    5162    5481   5977  5711  5600  5752  5042 5885  5759    5467     *    *   * 5722
#  300     5265   5372    5572    5906   5201  5894  5939  5656  5624 5061  5204    5174     *    *   * 5722  
Alexander
  • 4,527
  • 5
  • 51
  • 98
  • What could we assume is a key column, that will determine the number of columns in the transposed data frame? The unique values of `No`? – Simon O'Hanlon Jul 16 '15 at 08:02
  • @aoron barlow Just noted your comment before Simon deleted his answer: "_In the end I needed to plot those R values vs V_". Please explain clearly what you wish to achieve. Right now your question may sound like an [XY problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). Also, make your example _minimal_ - much easier to play around with. – Henrik Jul 16 '15 at 08:03
  • @aoronbarlow it is unclear how you want to go to your desired output, and where R1-R95 should come from. For instance, in the head of the sampledata, R is 5455 with V=100 and No=0. However, this R-value is nowhere in your desired output under No = 0. What exactly are you trying to achieve? – Heroka Jul 16 '15 at 08:09
  • @Heroka you can think R1-R95 columns are empty. I just want to mention that is the output which I look for. I'm a sure there is function to do it but I don`t know it right now. – Alexander Jul 16 '15 at 08:15
  • The numbers in `5256 5720 5162 , ...`, appear nowhere in your input, but you would like to see them in the output. From what I can tell, everything except that row is just `df %>% group_by(No) %>% mutate(R100 = R[V == 100], R200 = R[V == 200], R300 = R[V == 300], R = NULL, V = NULL) %>% unique %>% t`. – shadow Jul 16 '15 at 08:49
  • @shadow your solution is elegant. thanks. But how can we generalize this solution for large number of `V` (for instance V=seq(100,2100,100)) and remove R values right side corner of V values. lastly, can you submit your solution as an answer. Thanks! – Alexander Jul 16 '15 at 11:36

1 Answers1

2

Are you just looking for spread from tidyr? Then you can of course transpose with t.

t(spread(df, V, R))
##        1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17 
## No     0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16
## AC    78  110   78  110   78  110   78  110   78  110   78  110   78  110   78  110   78
## AR   256  256  320  320  384  384  256  256  320  320  384  384  256  256  320  320  384
## AM     1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1    1
## DQ     0    0    0    0    0    0    1    1    1    1    1    1    2    2    2    2    2
## 100 5265 5372 5572 5906 5201 5894 5939 5656 5624 5061 5204 5174 5679 5379 5759 5490 5706
## 200 5265 5372 5572 5906 5201 5894 5939 5656 5624 5061 5204 5174 5679 5379 5759 5490 5706
## 300 5265 5372 5572 5906 5201 5894 5939 5656 5624 5061 5204 5174 5679 5379 5759 5490 5706
shadow
  • 21,823
  • 4
  • 63
  • 77