I am using ggplot to create a boxplot. The code is the following:
ggplot(my_data, aes(x = as.factor(viotiko), y = pd_1year, fill = as.factor(viotiko))) + geom_boxplot() +
labs(title="Does the PD differ significantly by 'Viotiko' group?",x="Viotiko Group", y = "PD (pd_1year)")
This outputs the following graph:
Next, I wanted to focus on a range of the y values, [0, 0.05], so I ran the code again with the y-axis parameters changed. I did not mean to exclude data or alter the mean and the distribution, but simply to zoom in on a particular range of y values. The modified code was:
ggplot(my_data, aes(x = as.factor(viotiko), y = pd_1year, fill = as.factor(viotiko))) + geom_boxplot() +
labs(title="Does the PD differ significantly by 'Viotiko' group?",x="Viotiko Group", y = "PD (pd_1year)") +
scale_y_continuous(breaks = seq(0, 0.05, 0.01), limits = c(0, 0.05))
This returned the warning "Removed 173664 rows containing non-finite values (stat_boxplot)." and produced the following graph:
Apparently, ggplot somehow alters the input data on which the boxplot is based. However, my intention is simply to zoom in on a segment of the box plot so that I can examine the differences between the groups more closely. How can I do this using ggplot?
Your advice will be appreciated.
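For reference, here is a minimal sketch of the behaviour described above, using made-up data. The `toy` data frame and its exponentially distributed values are assumptions for illustration, not the real `my_data`. As far as I understand, limits set through `scale_y_continuous()` turn out-of-range values into `NA` before the boxplot statistics are computed, whereas `coord_cartesian(ylim = ...)` only changes the visible window, so it may be the kind of zoom intended here:

```r
library(ggplot2)

# Hypothetical stand-in for my_data: three groups with skewed values
set.seed(1)
toy <- data.frame(
  viotiko  = rep(1:3, each = 200),
  pd_1year = rexp(600, rate = 20)
)

p <- ggplot(toy, aes(x = as.factor(viotiko), y = pd_1year,
                     fill = as.factor(viotiko))) +
  geom_boxplot()

# Zoom only: statistics use all rows, then the view is clipped to [0, 0.05]
p_zoom <- p +
  scale_y_continuous(breaks = seq(0, 0.05, 0.01)) +
  coord_cartesian(ylim = c(0, 0.05))

# Scale limits: values outside [0, 0.05] become NA before the statistics
p_cut <- p +
  scale_y_continuous(breaks = seq(0, 0.05, 0.01), limits = c(0, 0.05))

# Compare the medians that each plot would actually draw
med_full <- ggplot_build(p)$data[[1]]$middle
med_zoom <- ggplot_build(p_zoom)$data[[1]]$middle
med_cut  <- suppressWarnings(ggplot_build(p_cut)$data[[1]]$middle)

print(rbind(full = med_full, zoom = med_zoom, cut = med_cut))
```

In this sketch the `zoom` medians should match the `full` ones, while the `cut` medians are computed on the reduced data and therefore differ, which would be consistent with the "Removed ... rows" warning.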