
I have a data.frame with three variables (genes, samples, copy_number), and I want to compute summary statistics of copy_number grouped by gene name.

I tried using the summarise function in dplyr, but it keeps failing.

For each gene, I want the number of samples whose corrected_copy_number is greater than or less than a specific value.

The data looks like this (truncated):

> sub.melt.df.annotations.cna.genes
      Gene_Names  sample corrected_copy_number
3234       BRCA1 sample1                     6
7317       BRCA2 sample1                     1
10500      ERBB2 sample1                     4
11258      GATA3 sample1                     3
3234       GATA3 sample2                     2
7317       BRCA2 sample2                     1
10500      ERBB2 sample2                     3
.
.
11258      GeneX sampleN                     #



> sub.melt.df.annotations.cna.genes %>% group_by(Gene_Names) %>% dplyr::summarise(count=n(), min(corrected_copy_number),gain=n((corrected_copy_number>2)))
Error: Problem with `summarise()` input `gain`.
x unused argument ((corrected_copy_number > 2))
ℹ Input `gain` is `n((corrected_copy_number > 2))`.
ℹ The error occurred in group 1: Gene_Names = "BRCA1".
Run `rlang::last_error()` to see where the error occurred.

Thanks for your help.

sahuno

1 Answer


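Replace n((corrected_copy_number>2)) with sum(corrected_copy_number>2). n() takes no arguments and only returns the number of rows in the current group, which is why you get the "unused argument" error. A comparison such as corrected_copy_number>2 gives a logical vector, and summing it counts the TRUE values.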
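For illustration, here is a minimal sketch of the corrected call on a toy data frame built from the rows shown in the question. The object name `df`, the output column names `min_cn` and `loss`, and the `< 2` cutoff for a loss are my own assumptions; swap in whatever thresholds make sense for your data.

```r
library(dplyr)

# Toy data copied from the truncated rows in the question (row names omitted)
df <- data.frame(
  Gene_Names = c("BRCA1", "BRCA2", "ERBB2", "GATA3", "GATA3", "BRCA2", "ERBB2"),
  sample = c("sample1", "sample1", "sample1", "sample1",
             "sample2", "sample2", "sample2"),
  corrected_copy_number = c(6, 1, 4, 3, 2, 1, 3)
)

res <- df %>%
  group_by(Gene_Names) %>%
  summarise(
    count  = n(),                             # rows (samples) per gene
    min_cn = min(corrected_copy_number),      # smallest copy number per gene
    gain   = sum(corrected_copy_number > 2),  # samples above the cutoff
    loss   = sum(corrected_copy_number < 2)   # assumed cutoff for a loss
  )

print(res)
```

If corrected_copy_number can contain NA, sum() will return NA for that gene unless you pass na.rm = TRUE.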

BellmanEqn