I have a plot where I am plotting both the linear regressions for each level of a variable as well as the linear regression for the total sample.

library(ggplot2);library(curl)
df<-read.csv(curl("https://raw.githubusercontent.com/megaraptor1/mydata/main/example.csv"))
df$group<-as.factor(df$group)
ggplot(df,aes(x,y))+
  geom_point(size=2.5,shape=21,aes(fill=group),col="black")+
  geom_smooth(formula=y~x,aes(col=group,group=group),method="lm",size=1,se=F)+
  geom_smooth(formula=y~x,method="lm",col="black",size=1,fullrange=T,se=F)+
  theme_classic()+
  theme(legend.position = "none")  

I am trying to extend the black line (which represents all specimens) to span the full range of the axes using the argument fullrange=T. However, fullrange=T has no visible effect on this graph regardless of what I try. This is especially strange as I have not set any limits for the graph or any additional global options.

This question was the closest I was able to find to my current problem, but it does not appear to be describing the same issue because that issue had to do with how the limits of the graph were called.

user2352714
  • I _think_ you want to add `+ scale_(x|y)_continuous(expand = expansion(0, 0))`, see https://ggplot2.tidyverse.org/reference/scale_continuous.html – markus Apr 05 '21 at 20:12
  • @markus I think OP wants the fitted line to extend beyond the data, not to shrink the axes. (Though they may be related...) – Gregor Thomas Apr 05 '21 at 20:29
  • @GregorThomas That was my first thought too. Wrote my comment because OP mentioned the other post in which "... the limits of the graph were called." – markus Apr 05 '21 at 20:32

2 Answers

This seems a bit heavy-handed, but it lets you extend your regression line to whatever limits you choose for the x axis.

The fullrange argument is not documented very helpfully. Judging by http://www.mosaic-web.org/ggformula/reference/gf_smooth.html, "fullrange" refers to the range of the points in the data frame used to generate the regression line. So in your case the regression line already extends to the "full range"; it's just that your definition of "full range" is not quite the same as the one geom_smooth uses.
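One way to check what geom_smooth treats as the full range is to inspect the data it computes. The sketch below is only an illustration: it uses the built-in mtcars data rather than the OP's file, and the x limits of 0 and 8 are arbitrary values chosen for the example. layer_data() returns the data frame computed for a layer, so the range of its x column shows how far the fitted line was actually drawn.

```r
library(ggplot2)

# mtcars ships with R; wt spans roughly 1.5 to 5.4.
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_smooth(formula = y ~ x, method = "lm", se = FALSE, fullrange = TRUE) +
  scale_x_continuous(limits = c(0, 8))  # illustrative limits, not from the question

# Layer 2 is the smooth; its x values show how far the fit was evaluated.
fit <- layer_data(p, 2)
range(fit$x)       # extent of the fitted line
range(mtcars$wt)   # extent of the raw data, for comparison
```

If the two ranges differ, the fit is following the scale limits rather than the data. When no limits are set, the scale limits are simply the data range, which would explain why fullrange appears to do nothing in the original plot.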

library(ggplot2)
library(dplyr)
library(curl)

lm_formula <- lm(formula = y~x, data = df)

f_lm <- function(x){lm_formula$coefficients[1] + lm_formula$coefficients[2] * x}

df_lim <- 
  data.frame(x = c(0, 5)) %>% 
  mutate(y = f_lm(x))

ggplot(df,aes(x,y))+
  geom_point(size=2.5,shape=21,aes(fill=group),col="black")+
  geom_smooth(formula=y~x,aes(col=group,group=group),method="lm",size=1,se=F)+
  geom_line(data = df_lim)+
  coord_cartesian(xlim = df_lim$x, ylim = df_lim$y, expand = FALSE)+  # expand takes TRUE/FALSE here
  theme_classic()+
  theme(legend.position = "none") 

data

df<-read.csv(curl("https://raw.githubusercontent.com/megaraptor1/mydata/main/example.csv"))
df$group<-as.factor(df$group)

Created on 2021-04-05 by the reprex package (v1.0.0)
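As a variation on the same idea, the endpoints of the line can be computed with predict() instead of multiplying the coefficients by hand. This is a minimal sketch, not the code from the answer above: it uses the built-in mtcars data, and the names fit and line_df as well as the endpoints 0 and 8 are hypothetical choices for illustration.

```r
library(ggplot2)

# Fit the overall regression once, outside ggplot.
fit <- lm(mpg ~ wt, data = mtcars)

# Endpoints of the line; 0 and 8 are arbitrary illustrative x values.
line_df <- data.frame(wt = c(0, 8))
line_df$mpg <- predict(fit, newdata = line_df)

# Draw the points, then the manually computed line across the chosen x range.
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_line(data = line_df) +
  theme_classic()
```

Using predict() keeps the approach working if the model formula later gains extra terms, since the coefficients never have to be indexed by position.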

Peter

I had the same issue. Despite setting fullrange = TRUE, the line of best fit was only being drawn in the data range.

    ggplot(data = df, aes(x = diameter, y = height)) +
      geom_point(size = 2) +
      geom_smooth(method = lm, se = FALSE, fullrange = TRUE) +
      labs(x = "Diameter", y = "Height", title = "Tree Height vs. Diameter") +
      theme(plot.title = element_text(hjust = 0.5, size = 15, face = 'bold')) 

Bad plot: 1

Using scale_x_continuous() and scale_y_continuous() worked for me (thank you @markus). I added two lines of code, below geom_smooth(), to fix the issue.

    ggplot(data = df, aes(x = diameter, y = height)) +
      geom_point(size = 2) +
      geom_smooth(method = lm, se = FALSE, fullrange = TRUE) +
      scale_x_continuous(expand = c(0,0), limits=c(5, 32)) +   #expand = c(mult, add) => x axis ends at 32 + (32 - 5)*mult + add = 32 + (32 - 5)*0 + 0 = 32, right where the line of best fit stops
      scale_y_continuous(expand = c(0,0), limits=c(7, 25)) +   #expand = c(mult, add) => y axis ends at 25 + (25 - 7)*mult + add = 25 + (25 - 7)*0 + 0 = 25
      labs(x = "Diameter", y = "Height", title = "Tree Height vs. Diameter") +
      theme(plot.title = element_text(hjust = 0.5, size = 15, face = 'bold')) 

Good plot: 2

Source: How does ggplot scale_continuous expand argument work?
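The expand arithmetic from the code comments can be written out as a small helper. This is a sketch of the arithmetic only, not a ggplot2 function: expanded_limits is a hypothetical name, and it assumes the two-element form expand = c(mult, add) applied symmetrically to both ends of the axis. As far as I can tell, the fitted line itself stops at the limits, so the helper describes where the axis ends; with zero expansion the two coincide.

```r
# Hypothetical helper: axis extent implied by limits and expand = c(mult, add).
expanded_limits <- function(limits, mult = 0, add = 0) {
  span <- diff(limits)
  c(limits[1] - span * mult - add,
    limits[2] + span * mult + add)
}

# expand = c(0, 0): the axis ends exactly at the limits.
expanded_limits(c(5, 32))

# mult = 0.05 is ggplot2's default for continuous scales,
# which leaves a small gap between the limits and the axis ends.
expanded_limits(c(5, 32), mult = 0.05)
```

That default 5% gap is what makes a full-range fit look as if it stops short of the axes.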