2

Can someone help me ? How can I add my circos function in my data? when i try i get an error message and it doesn't produce any results I want to do a donut figure like the representation number 1 but i have one other resultat that we can see in the other picture. Trying to follow this guide

Desired Image

My data is lile this :

        Name;Integrons_1;Integrons_2;Integrons_3;TEM;SHV;CTX_M;NDM;OXA_48
        HZ1410;1;0;0;1;1;0;1;1
        HZ0411;1;0;0;1;0;0;0;1
        WI1410;1;0;0;1;1;0;0;1
        WI0411;1;0;0;1;0;0;1;0
        FR1410;1;0;0;1;0;0;0;0
        FR0411;0;0;0;1;1;0;0;0
        CN1410;1;0;0;1;1;0;1;0
        CN0411;1;0;0;1;0;0;0;1
        SA0912;0;0;0;0;0;0;0;0
        SA0302;1;0;0;1;0;0;0;0
        BL0912;1;0;0;1;1;0;0;0
        BL0302;1;0;0;1;1;1;0;1
        CSI0912;1;1;0;1;0;0;0;1
        CSI0302;1;0;0;1;1;1;1;1
        CSII0912;1;1;0;1;0;1;1;0
        CSII0302;1;0;0;1;0;0;0;0
        BGI0503;1;1;1;1;1;0;0;0
        BGI0610;1;0;0;1;1;0;0;0
        BGII0503;1;1;1;1;1;0;0;0
        BGII0610;1;0;1;1;1;0;0;0
        BCI0503;1;0;1;1;1;0;1;1
        BCI0610;1;0;1;1;0;0;0;0
        BCII0503;1;0;1;1;1;0;1;0
        BCII0610;1;0;1;1;0;0;0;0
        BDI0503;1;0;0;1;1;0;0;0
        BDI0610;1;1;1;1;1;1;1;0
        BDII0503;1;0;1;1;0;1;0;0
        BDII0610;1;1;1;1;1;1;0;1
        YPI0503;1;0;1;1;0;0;0;0
        YPI0710;1;1;1;1;1;1;0;1
        YPII0503;1;0;1;1;0;0;0;1
        YPII0710;1;1;1;1;1;1;0;0
        YMI0503;1;1;1;1;1;0;1;0
        YMI0710;1;1;1;1;1;1;1;1
        YMII0503;1;1;1;1;1;0;0;1
        YMII0710;1;1;1;1;1;1;1;1
        YSI0503;1;1;0;0;1;0;0;0
        YSI0710;1;1;1;1;1;1;1;1
        YSII0503;1;1;0;1;1;0;0;0
        YSII0710;0;1;1;1;1;0;0;0

And my code is represented here:

    #### IMPORT DES DONNEES ####
    nba = read.csv("essai_r.csv", sep=";", header=T)
    #attention : toujours utilise le format csv avec le sep = ";" pour avoir de 
    #beaux tableaux 


    #### INSTALLATION DES LIBRARY ####
    library(reshape)
    library(ggplot2)
    library(plyr)
    library(circlize)


    #### CREATION DES TABLEAUX : FCT MELT ####
    nba$Name <- with(nba, reorder(Name, GES))
    nba.m <- melt(nba)
    #pas utilise car ce n'est pas ce que je veux 
      #nba.m <- ddply(nba.m, .(variable), transform, value = scale(value))

    #### MODIFICATION DES DONNEES : FACTOR -> NUMERIQUES #### 
    # Convert the factor levels to numeric + quanity to determine size of hole.
    nba.m$var2 = as.numeric(nba.m$variable) + 15

    # Labels and breaks need to be added with scale_y_discrete.
    y_labels = levels(nba.m$variable)
    y_breaks = seq_along(y_labels) + 15


    #### CREATION DE LA HEATMAP (DEGRADE DE COULEUR UNIQUEMENT) ####
    p2 = ggplot(nba.m, aes(x=Name, y=var2, fill=value)) +
      geom_tile(colour="black") +
      scale_fill_gradient(low = "beige", high = "red") +
      ylim(c(0, max(nba.m$var2) + 0.5)) 
      scale_y_discrete(breaks=y_breaks, labels=y_labels) +
      coord_polar(theta="x") +
      theme(panel.background=element_blank(),
            axis.title=element_blank(),
            panel.grid=element_blank(),
            axis.text.x=element_blank(),
            axis.ticks=element_blank(),
            axis.text.y=element_text(size=5))

    ggsave(filename="plot_2.png", plot=p2, height=7, width=7)
    #sauvegarde le fichier dans le dossier 
    circos.par(start.degree = 90, gap.degree = 10)

    #### CREATION DE LA HEATMAP BIS (AVEC NOMS DE SITES )
    nba.labs <- subset(nba.m, variable==levels(nba.m$variable)[nlevels(nba.m$variable)])
    nba.labs <- nba.labs[order(nba.labs$Name),]
    nba.labs$ang <- seq(from=(360/nrow(nba.labs))/1.5, to=(1.5*(360/nrow(nba.labs)))-360, length.out=nrow(nba.labs))+80


    nba.labs$hjust <- 0
    nba.labs$hjust[which(nba.labs$ang < -90)] <- 1

    p3 = ggplot(nba.m, aes(x=Name, y=var2, fill=value)) +
      geom_tile(colour="black") +
      geom_text(data=nba.labs, aes(x=Name, y=var2+1.5,
                                   label=Name, angle=ang, hjust=hjust), size=3) +
      scale_fill_gradient(low = "beige", high = "steelblue") +
      ylim(c(0, max(nba.m$var2) + 1.5)) +
      scale_y_discrete(breaks=y_breaks, labels=y_labels) +
      coord_polar(theta="x") +
      theme(panel.background=element_blank(),
            axis.title=element_blank(),
            panel.grid=element_blank(),
            axis.text.x=element_blank(),
            axis.ticks=element_blank(),
            axis.text.y=element_text(size=5))
    nba.labs$ang[which(nba.labs$ang < -90)] <- (180+nba.labs$ang)[which(nba.labs$ang < -90)]

    #### Representation graphique ####
    p3
MrFlick
  • 195,160
  • 17
  • 277
  • 295
Stage
  • 35
  • 4

1 Answers1

2

Your example data is significantly different to the "nba" examples you have used in this and previous questions. This is why you are having so much trouble. Here are three ways to visualize your actual data:

1.) Tidyverse geom_tile() method:

library(tidyverse)
#install.packages("vroom")
library(vroom)
#install.packages("ggrepel")
library(ggrepel)

df <- vroom(file = "example.txt")

df %>% 
  pivot_longer(cols = -c(Name), names_to = "category") %>% 
  ggplot(aes(x = Name, y = category, fill = value)) +
  geom_tile(colour = "white") +
  geom_text(aes(label = ifelse(Name == "BCI0503", category, "")),
            nudge_x = -0.5) +
  geom_text_repel(aes(label = ifelse(category == "TEM", Name, "")),
                  nudge_y = 2) +
  scale_fill_gradient(low = "beige", high = "red", breaks = c(0, 1)) +
  coord_polar(theta="x") +
  theme_void()

example_1.png

This doesn't look great - you can make a "hole" in the middle, but I don't think that this type of plot is going to suit your data.

2.) Circlize circos plot

#install.packages("circlize")
library(circlize)

mat = data.matrix(df)[,2:9]
rownames(mat) <- df$Name
col_fun1 = colorRamp2(c(0, 1), c("beige", "red"))
circos.par(gap.after = c(20))
circos.heatmap(mat, col = col_fun1,
               rownames.side = "outside",
               track.height = 0.4)
circos.track(track.index = get.current.track.index(),
             panel.fun = function(x, y) {
  if(CELL_META$sector.numeric.index == 1) { # the last sector
    cn = colnames(mat)
    n = length(cn)
    circos.text(rep(CELL_META$cell.xlim[2], n) + convert_x(1, "mm"), 
                1:n - 0.5, cn, 
                cex = 0.5, adj = c(0, 0.5), facing = "inside")
  }
}, bg.border = NA)

example_2.png

This is better, but still not great. I believe the issue is that your "counts" are either 0 or 1 - I think another method is better suited to this type of data: UpSetR plots

3.) UpSetR plot

#install.packages("UpSetR")
library(UpSetR)

df_int <- df %>% 
  mutate(across(c(2:9), as.integer)) %>% 
  as.data.frame()

upset(df_int, order.by = "freq", sets.bar.color = "#56B4E9")

example_3.png

This is how I would visualize this type of data, but obviously it's up to you how you ultimately want to portray it.

If you had included a sample of your actual data in your original question this answer would have taken ~10 mins instead of an hour. Please, in future questions, ensure your example dataset accurately matches your actual data.

jared_mamrot
  • 22,354
  • 4
  • 21
  • 46
  • 1
    Good evening Jared, I am so impressed! I don't even know how to thank you! I'm a little beginner on r suddenly I had a lot of difficulty finding the best representation and especially looking for information. Thank you so much ! I was not sure how to represent this data at first and the circle seemed to be the best solution for this. Also, I had never seen the last graph you just made before and I don't know if I would be able to explain it afterwards. Thank you very much for your help and I will make sure that I present things better in the future! – Stage May 03 '21 at 02:43
  • Excuse me, I would like to ask you one last question. How can I use the last method with other data? – Stage May 03 '21 at 04:26
  • install.packages("UpSetR") install.packages("dplyr") library(UpSetR) library(dplyr) library(mutSignatures) library(foreach) df_int <- df mutate(across(c("Integrons_1", "Integrons_2", "Integrons_3", "TEM", "SHV","CTX_M", "NDM","OXA_48"), as.integer)) as.data.frame() ?as.integer upset(mutations) upset(df_int, order.by = "freq", sets.bar.color = "#56B4E9") – Stage May 03 '21 at 04:28
  • 1
    Sorry - `upset(mutations)` was a typo - I've edited my answer to remove it. To use this on your actual data you need to import the data in the correct format. Each column of the dataframe should be an integer. Depending on how you imported your dataset, that might already be the case - you can check with `str(dataframe)` to see what class your columns are. If they aren't integers, you need to change them into integers (that's what the `mutate()` function is doing). Then you can use the function like I've used it above – jared_mamrot May 03 '21 at 05:21
  • It's ok that work !! but we only compare 4 variables ? it's normal ? I dont have oxa 48, CTX-M, and NDM – Stage May 03 '21 at 06:24
  • I have to change le function ? – Stage May 03 '21 at 06:30
  • 1
    You can specify the number of variables to plot with the `nsets` parameter, e.g. `upset(df_int, order.by = "freq", sets.bar.color = "#56B4E9", nsets = 8)` – jared_mamrot May 03 '21 at 23:16
  • Good evening Jared, Thank you very much for the help you gave me to solve my problem and also for all the suggestions you made to me! – Stage May 03 '21 at 23:51