I have a data.frame with three variables (Gene_Names, sample, corrected_copy_number), and I want to compute summary statistics of corrected_copy_number grouped by gene name.
I tried using the summarise function in dplyr, but it keeps failing.
For each gene, I want the number of samples whose corrected_copy_number is greater than or less than a specific value (for example, greater than 2 for a gain).
The data looks like this (truncated):
> sub.melt.df.annotations.cna.genes
Gene_Names sample corrected_copy_number
3234 BRCA1 sample1 6
7317 BRCA2 sample1 1
10500 ERBB2 sample1 4
11258 GATA3 sample1 3
3234 GATA3 sample2 2
7317 BRCA2 sample2 1
10500 ERBB2 sample2 3
.
.
11258 GeneX sampleN #
> sub.melt.df.annotations.cna.genes %>% group_by(Gene_Names) %>% dplyr::summarise(count=n(), min(corrected_copy_number),gain=n((corrected_copy_number>2)))
Error: Problem with `summarise()` input `gain`.
x unused argument ((corrected_copy_number > 2))
ℹ Input `gain` is `n((corrected_copy_number > 2))`.
ℹ The error occurred in group 1: Gene_Names = "BRCA1".
Run `rlang::last_error()` to see where the error occurred.
Thanks for your help.
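A possible way around the error, as a minimal sketch: `n()` takes no arguments (hence the "unused argument" message), so a condition cannot be passed to it. Summing the logical vector counts the rows where the condition is TRUE. The toy data frame `cna`, the column names `min_cn` and `loss`, and the loss cutoff of "< 2" are assumptions made for illustration; only the "> 2" gain cutoff comes from the call above.

```r
library(dplyr)

# Toy data mirroring the truncated rows shown above (hypothetical stand-in
# for sub.melt.df.annotations.cna.genes)
cna <- data.frame(
  Gene_Names = c("BRCA1", "BRCA2", "ERBB2", "GATA3", "GATA3", "BRCA2", "ERBB2"),
  sample = c("sample1", "sample1", "sample1", "sample1",
             "sample2", "sample2", "sample2"),
  corrected_copy_number = c(6, 1, 4, 3, 2, 1, 3)
)

summary_by_gene <- cna %>%
  group_by(Gene_Names) %>%
  summarise(
    count  = n(),                             # rows (samples) per gene
    min_cn = min(corrected_copy_number),      # smallest copy number per gene
    gain   = sum(corrected_copy_number > 2),  # TRUE counts as 1, so this counts samples above 2
    loss   = sum(corrected_copy_number < 2)   # assumed loss cutoff, adjust as needed
  )

print(summary_by_gene)
```

With the toy rows above, GATA3 (copy numbers 3 and 2) ends up with gain = 1 and loss = 0. If corrected_copy_number can contain NA, `sum(..., na.rm = TRUE)` may be needed to avoid NA counts.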