
To make a scatterplot of a flow cytometry experiment, I have been using the following code for the last few weeks. It runs with only a handful of warnings (only 20 out of 10,000 entries were removed from the dataset).

visual <- ggplot(data=dots, aes(GRNHLin, REDHLin)) +
    geom_point(colour=rgb(0.17, 0.44, 0.71), size=0.500, alpha=0.250) +
    #stat_density_2d(aes(alpha = ..density..), geom = 'tile', contour = FALSE) +
    geom_tile(aes(width = 0.004)) +
    scale_x_log10(breaks = trans_breaks("log10", function(x) 10^x),
                  labels = trans_format("log10", math_format(10^.x)), limits = c(1,1e4)) +
    scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x),
                  labels = trans_format("log10", math_format(10^.x)), limits = c(1,1e3)) +
    geom_vline(xintercept=threshold_green) +
    labs(x="Green Fluorescence Intensity", y="Red Fluorescence Intensity", size=18)
visual

Now I want to use model-based clustering from the mclust package, and I use fviz_cluster (from factoextra) to create the scatterplot. This time, however, I receive the following warnings after running the code below.

Warning messages:
1: In self$trans$transform(x) : NaNs produced
2: Transformation introduced infinite values in continuous x-axis
3: In self$trans$transform(x) : NaNs produced
4: Transformation introduced infinite values in continuous y-axis
5: Removed 9110 rows containing missing values (geom_point).

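# Packages used below
library(readr)       # read_csv
library(stringr)     # str_replace_all
library(dplyr)       # select, filter, %>%
library(mclust)      # Mclust
library(factoextra)  # fviz_cluster
library(ggplot2)     # labs, scale_x_log10, scale_y_log10
library(scales)      # trans_breaks, trans_format, math_format

dots <- read_csv(file_of_sample)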
names(dots) <- str_replace_all(names(dots), c("-" = ""))
  
dots <- dots %>%
  select("GRNHLin", "RED2HLin")
    
dots <- dots %>%
  filter(RED2HLin >= 1.0)

dots.Mclust <- Mclust(dots, modelNames="VVV", G=8)
#BIC <- mclustBIC(logdots)
#ICL <- mclustICL(logdots)
#summary(BIC)
#summary(ICL)

visual <- fviz_cluster(dots.Mclust, 
             ellipse=FALSE, 
             shape=20, 
             ellipse.alpha = 0.1,
             alpha=0.450, 
             geom = c("point"),
             show.clust.cent = FALSE,
             main = FALSE,
             legend = c("right"),
             palette = "npg",
             legend.title = "Clusters"
             ) +
  labs(x="Green Fluorescence Intensity", y="Red Fluorescence Intensity") +
  scale_x_log10(breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x)), limits = c(1,1e4)) +
  scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x)), limits = c(1,1e3))

visual

Can anyone help me out with this issue?

LiWa
  • You obviously have zero (or perhaps negative) values and are doing a `log` transformation, and log(0) is not defined. However, it is just a warning, not an error. `ggplot` will usually either just remove them or plot them along the axis (depending on the geom). – Andrew Gustar Dec 20 '21 at 11:58
  • Thanks for your comment. I double checked this, but there are <20 zero/negative values in my dataset of 10,000 entries. In my first script, there were at max 20 warnings, whereas in my second script, there were more than 9,000 warnings. How can the second script produce so many warnings, if I am still using the same dataset? – LiWa Dec 20 '21 at 14:07
  • I'm not familiar with fviz_cluster, but perhaps it is plotting more than just the data points. Maybe it is the points forming the boundary of a shape that are falling into the negative region??? – Andrew Gustar Dec 20 '21 at 15:05
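One way to test the guesses in the comments above is to count how many values a log10 axis would actually drop. The sketch below uses simulated data as a hypothetical stand-in for `dots` (the names `x`, `z`, and `bad` are illustrative only). It also checks one possible explanation that nobody in this thread has confirmed: `fviz_cluster` has a `stand` argument that, as far as I know, defaults to `TRUE` and standardizes the data before plotting, which would push a large share of strictly positive values below zero.

```r
# Hypothetical stand-in for the real data: strictly positive, skewed intensities.
set.seed(1)
x <- rlnorm(10000, meanlog = 3, sdlog = 1)

# Raw values: none are zero or negative, so a log10 axis drops nothing.
n_raw <- sum(x <= 0)

# Standardized values (mean 0, sd 1), as scale() produces them.
# Assumption: this mimics what fviz_cluster(stand = TRUE) does internally.
z <- as.numeric(scale(x))
n_scaled <- sum(z <= 0)

# log10 of a negative number is NaN, log10(0) is -Inf; both get removed by ggplot.
bad <- suppressWarnings(log10(c(-1, 0, 10)))

cat("non-positive raw values:   ", n_raw, "\n")
cat("non-positive scaled values:", n_scaled, "\n")
print(bad)
```

If the same count on the real data shows only about 20 non-positive raw values but thousands after `scale()`, it may be worth trying `stand = FALSE` in the `fviz_cluster` call before adding the log10 scales. This is a suggestion to check, not a confirmed fix.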
