
I am new to R. I need to reformat the following data frame:

   `Sample Name` `Target Name` `CT values`
           <chr>         <chr>       <dbl>
 1      Sample 1         actin    19.69928
 2      Sample 1          Ho-1    27.71864
 3      Sample 1         Nrf-2    26.00012
 9      Sample 9          Ho-1    25.31180
10      Sample 9         Nrf-2    26.41421
11      Sample 9            C3    26.16980
...
15      Sample 1         actin    19.49202

I want the different 'Target Names' as column names and the individual 'Sample Names' as row names, with the table showing the respective CT values. Note that there are duplicates: for example, Sample 1 appears twice with the same Target Name, "actin". In the final table, each such duplicate should appear only once, with the mean of its two CT values.

I guess this is a very basic R data frame manipulation, but as I said, I am quite new to R and still working my way through different tutorials.

Thank you very much in advance!

ic23oluk

1 Answer

One way of doing that using the tidyverse ecosystem of packages:

library(tidyverse)

tab <- tribble(
  ~`Sample Name`, ~`Target Name`, ~ `CT values`,
  "Sample 1",       "actin",  19.69928,
  "Sample 1",       "Ho-1",  27.71864,
  "Sample 1",       "Nrf-2",  26.00012,
  "Sample 9",       "Ho-1",  25.31180,
  "Sample 9",       "Nrf-2",  26.41421,
  "Sample 9",       "C3",  26.16980,
  "Sample 1",       "actin",  19.49202
)

tab %>%
  # calculate the mean of the duplicates
  group_by(`Sample Name`, `Target Name`) %>%
  summarise(`CT values` = mean(`CT values`)) %>%
  # reshape the data
  spread(`Target Name`, `CT values`)
#> # A tibble: 2 x 5
#> # Groups: Sample Name [2]
#>   `Sample Name` actin    C3 `Ho-1` `Nrf-2`
#> * <chr>         <dbl> <dbl>  <dbl>   <dbl>
#> 1 Sample 1       19.6  NA     27.7    26.0
#> 2 Sample 9       NA    26.2   25.3    26.4

You can also use data.table for a more concise way of doing this, with its dcast reshaping function:

library(data.table)
#> 
#> Attaching package: 'data.table'
#> The following objects are masked from 'package:dplyr':
#> 
#>     between, first, last
#> The following object is masked from 'package:purrr':
#> 
#>     transpose
setDT(tab)
dcast(tab, `Sample Name` ~ `Target Name`, fun.aggregate = mean)
#> Using 'CT values' as value column. Use 'value.var' to override
#>    Sample Name      C3     Ho-1    Nrf-2    actin
#> 1:    Sample 1     NaN 27.71864 26.00012 19.59565
#> 2:    Sample 9 26.1698 25.31180 26.41421      NaN

Created on 2018-01-13 by the reprex package (v0.1.1.9000).
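
If the sample names are needed as actual row names, as the question asks, a base-R sketch along the following lines may also work. It relies on tapply(), which groups the values by two factors and applies mean() to each group. The object name ct_long and the shortened column names sample, target and ct are made up for this illustration and do not appear in the original post.

```r
# Hypothetical long-format data: same values as the example above,
# with simplified column names (sample, target, ct) chosen for illustration
ct_long <- data.frame(
  sample = c("Sample 1", "Sample 1", "Sample 1", "Sample 9",
             "Sample 9", "Sample 9", "Sample 1"),
  target = c("actin", "Ho-1", "Nrf-2", "Ho-1", "Nrf-2", "C3", "actin"),
  ct     = c(19.69928, 27.71864, 26.00012, 25.31180,
             26.41421, 26.16980, 19.49202),
  stringsAsFactors = FALSE
)

# tapply() splits ct by every (sample, target) combination and takes the mean
# of each group; combinations without any measurement come back as NA
ct_means <- tapply(ct_long$ct, list(ct_long$sample, ct_long$target), mean)

# The result is a numeric matrix: samples as row names, targets as column names
ct_means
```

The result is a matrix rather than a data frame; wrapping it in as.data.frame() should give a data frame that keeps the sample names as row names.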

cderv