I have a dataset and I want to plot its cumulative distribution. My data is:

 dput(gene_snp_distance[1:20, c(1:3)])
structure(list(distance = c(1000, 2000, 3000, 4000, 5000, 6000, 
7000, 8000, 9000, 10000, 11000, 12000, 13000, 14000, 15000, 16000, 
17000, 18000, 19000, 20000), all_snps = c(10, 11.8, 13.6, 15.4, 
17.2, 19, 20.8, 22.6, 24.4, 26.2, 28, 29.8, 31.6, 33.4, 35.2, 
37, 38.8, 40.6, 42.4, 44.2), gtex_snps = c(12, 14.3, 16.6, 18.9, 
21.2, 23.5, 25.8, 28.1, 30.4, 32.7, 35, 37.3, 39.6, 41.9, 44.2, 
46.5, 48.8, 51.1, 53.4, 56.2)), row.names = c(NA, 20L), class = "data.frame")

And this is what I tried:

plot(gene_snp_distance$distance, gene_snp_distance$all_snps)

But I want to plot a graph like the one below, where blue represents all_snps and red represents gtex_snps. Does anyone know how to plot a similar graph?

[image]

My plotted graph looks like this:

[image]

But I want a bar for each distance, with both all_snps and gtex_snps in one graph, like bar graphs on top of each other.

Does anyone know how to plot a graph like this one? Thank you.

rheabedi1

1 Answer

Not sure if this is exactly what you want. With all_snps now included in your data, the code below reshapes the data to long format and plots all_snps and gtex_snps together, first as a stacked bar chart and then as a dodged (side-by-side) bar chart.

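library(tidyverse)

# Reshape to long format: one row per distance and SNP type,
# then set up the plot with distance on x and one fill colour per type
p <- gene_snp_distance |>
  pivot_longer(cols = matches("snps"),
               names_to = "type",
               values_to = "value") |>
  ggplot(aes(distance, value, fill = type))

# Option 1: stacked bars
p + geom_col()

# Option 2: bars side by side
p + geom_col(position = "dodge")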

Created on 2023-04-02 with reprex v2.0.2
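
To make the reshaping step and the role of `p` more concrete, here is a minimal sketch on only the first three rows of the data. The names `small`, `long`, `p_stacked` and `p_dodged` are introduced purely for illustration and are not part of the code above.

```r
library(tidyr)
library(ggplot2)

# First three rows of the data from the question (a subset, for brevity)
small <- data.frame(
  distance  = c(1000, 2000, 3000),
  all_snps  = c(10, 11.8, 13.6),
  gtex_snps = c(12, 14.3, 16.6)
)

# pivot_longer() stacks the two *_snps columns into a single "value" column;
# "type" records which original column each value came from
long <- pivot_longer(small,
                     cols = matches("snps"),
                     names_to = "type",
                     values_to = "value")
print(long)  # one row per distance/type combination

# p stores the data and the aesthetic mapping, but has no layer yet
p <- ggplot(long, aes(distance, value, fill = type))

# Adding geom_col() draws the bars; the position argument decides
# whether bars of different types are stacked or placed side by side
p_stacked <- p + geom_col()
p_dodged  <- p + geom_col(position = "dodge")
```

Storing the unfinished plot in `p` is only a convenience: it lets the same data and mapping be reused for both bar chart variants without repeating the reshaping code.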

where

gene_snp_distance <- structure(list(
  distance = c(1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000,
               11000, 12000, 13000, 14000, 15000, 16000, 17000, 18000, 19000, 20000),
  all_snps = c(10, 11.8, 13.6, 15.4, 17.2, 19, 20.8, 22.6, 24.4, 26.2,
               28, 29.8, 31.6, 33.4, 35.2, 37, 38.8, 40.6, 42.4, 44.2),
  gtex_snps = c(12, 14.3, 16.6, 18.9, 21.2, 23.5, 25.8, 28.1, 30.4, 32.7,
                35, 37.3, 39.6, 41.9, 44.2, 46.5, 48.8, 51.1, 53.4, 56.2)
), row.names = c(NA, 20L), class = "data.frame")
dufei
  • I tried this, but it gives an error: Error in UseMethod("mutate") : no applicable method for 'mutate' applied to an object of class "function" – rheabedi1 Apr 02 '23 at 18:15
  • I am sorry. I have edited my post to include all_snps in the dput output. I want to plot a bar graph of all_snps first, with gtex_snps on top of or beside it. – rheabedi1 Apr 02 '23 at 18:23
  • I am referring to your data as `df` in my code. So you will have to either rename your dataset accordingly or replace `df` from my code with the name of the object that holds your data. – dufei Apr 02 '23 at 19:36
  • I updated my answer to reflect your naming conventions (the dataset is now called `gene_snp_distance`) and the new column in your data. I also include two options, either a stacked or a dodged bar chart. Does this do what you want? – dufei Apr 02 '23 at 19:40
  • Hello, yes, this one works. If you don't mind, could you explain the part of the code involving p? I don't understand it. I am not very familiar with the tidyverse and am trying to learn R. – rheabedi1 Apr 03 '23 at 00:34
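
The error quoted in the first comment is consistent with `df` never having been created as a data frame. In a fresh R session, the name `df` resolves to the F-distribution density function from the stats package, so data-frame verbs such as `mutate()` have no method for it. A minimal check, assuming no object named `df` has been defined yet in the session (the two-row data frame is a made-up stand-in for the real data):

```r
# In a fresh session, `df` is stats::df (the F density), not a data frame
class(df)          # "function"
is.data.frame(df)  # FALSE

# Binding a data frame to the name the code expects resolves the mismatch
df <- data.frame(distance = c(1000, 2000), gtex_snps = c(12, 14.3))
is.data.frame(df)  # TRUE
```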