I have a dataframe, p4p5
, that contains the following columns:
p4p5 <- c("SampleID", "expr", "Gene", "Period", "Consequence", "isPTV")
I've used the aggregate
function here to find the median expression per Gene:
p4p5_med <- aggregate(expr ~ Gene, p4p5, median)
However, this results in a dataframe with the columns "expr" and "Gene" only. How can I still retain all the original columns when applying the aggregate function?
UPDATE:
Input (p4p5
):
SampleID expr Gene Period Consequence isPTV
HSB430 -1.23 ENSG000098 4 upstream_gene_variant 0
HSB321 -0.02 ENSG000098 5 stop_gained 1
HSB296 3.12 ENSG000027 4 upstream_gene_variant 0
HSB201 1.22 ENSG000027 4 intron_variant 0
HSB220 0.13 ENSG000013 6 intron_variant 0
Expected output:
SampleID expr Gene Period Consequence isPTV Median
HSB430 -1.23 ENSG000098 4 upstream_gene_variant 0 -0.625
HSB321 -0.02 ENSG000098 5 stop_gained 1 -0.625
HSB296 3.12 ENSG000027 4 upstream_gene_variant 0 2.17
HSB201 1.22 ENSG000027 4 intron_variant 0 2.17
HSB220 0.13 ENSG000013 6 intron_variant 0 0.13