9

I have two linear fits that I've gotten from lm calls in my R script. For instance...

fit1 <- lm(y1 ~ x1)
fit2 <- lm(y2 ~ x2)

I'd like to find the (x,y) point at which these two lines (fit1 and fit2) intersect, if they intersect at all.

Hack-R
  • 22,422
  • 14
  • 75
  • 131
CodeGuy
  • 28,427
  • 76
  • 200
  • 317

3 Answers3

13

Here's some high school geometry then ;-)

# First two models
df1 <- data.frame(x=1:50, y=1:50/2+rnorm(50)+10)
m1 <- lm(y~x, df1)

df2 <- data.frame(x=1:25, y=25:1*2+rnorm(25)-10)
m2 <- lm(y~x, df2)

# Plot them to show the intersection visually    
plot(df1)
points(df2)

# Now calculate it!    
a <- coef(m1)-coef(m2)
c(x=-a[[1]]/a[[2]], y=coef(m1)[[2]]*x + coef(m1)[[1]])

Or, to simplify the solve-based solution by @Dwin:

cm <- rbind(coef(m1),coef(m2)) # Coefficient matrix
c(-solve(cbind(cm[,2],-1)) %*% cm[,1])
# [1] 12.68034 16.57181 
Tommy
  • 39,997
  • 12
  • 90
  • 85
9

One way to avoid the geometry is to re-parameterize the equations as:

y1 = m1 * (x1 - x0) + y0
y2 = m2 * (x2 - x0) + y0

in terms of their intersection point (x0, y0) and then perform the fit of both at once using nls so that the returned values of x0 and y0 give the result:

# test data
set.seed(123)
x1 <- 1:10
y1 <- -5 + x1 + rnorm(10)
x2 <- 1:10
y2 <- 5 - x1 + rnorm(10)
g <- rep(1:2, each = 10) # first 10 are from x1,y1 and second 10 are from x2,y2

xx <- c(x1, x2)
yy <- c(y1, y2)
nls(yy ~ ifelse(g == 1, m1 * (xx - x0) + y0, m2 * (xx - x0) + y0),
    start = c(m1 = -1, m2 = 1, y0 = 0, x0 = 0))

EDIT: Note that the lines xx<-... and yy<-... are new and the nls line has been specified in terms of those and corrected.

G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • this is perfect. is it possible for you to show me, using this code you already have, how I can specify that the following requirement: the solution (the two lines) must have an intersection point between z1 and z2 (two values I specify) – CodeGuy Aug 19 '11 at 14:59
  • @CodeGuy, Specify the arguments: `algorithm = "port", lower = ...whatever..., upper = ...whatever...` as per `?nls`. – G. Grothendieck Aug 19 '11 at 15:59
  • I'm not familiar with what you mean. Can you add it to the code and show me? – CodeGuy Aug 19 '11 at 16:27
  • 1
    @CodeGuy, Assuming you want to restrict `x0` to lie between `2` and `4`: `nls(c(y1, y2) ~ ifelse(g == 1, b1 * (x1 - x0) + y0, b2 * (x2 - x0) + y0), start = c(b1 = -1, b2 = 1, y0 = 0, x0 = 3), algorithm = "port", lower = c(b1 = -Inf, b2 = -Inf, y0 = -Inf, x0 = 2), upper = c(b1 = Inf, b2 = Inf, y0 = Inf, x0 = 4))` – G. Grothendieck Aug 19 '11 at 16:55
  • wait a sec. the input is just x and y values. I don't have two sets of (x,y) values. Just one set. – CodeGuy Aug 19 '11 at 16:58
  • @CodeGuy, Better read your own post. You have two sets according to it. – G. Grothendieck Aug 19 '11 at 18:13
  • sorry about the confusion. I keep changing my code. you're right. your code is EXTREMELEY helpful (and works great). thank you so much. the thing I'm stuck on in the understanding of it is how you re-parameterized the two equations. how did you derive that? Can you show the derivation in your answer? – CodeGuy Aug 19 '11 at 18:39
  • @CodeGuy, If we regard `(x0, y0)` as the origin then its a line through the origin whose slope is `b1 = (y1-y0)/(x1-x0)` hence we have it upon rearrangement. To verify it once we have the formula just substitute `x0` into `x`. In that case the first term on the RHS of `y1 = b1*(x-x0) + y0` is zero so `y1 = y0` as required. – G. Grothendieck Aug 19 '11 at 19:14
  • hm. not understanding that so well. could you elaborate? sorry! – CodeGuy Aug 19 '11 at 19:33
  • @CodeGuy. The slope is defined as rise over run. If we are at a point (x1, y1) then the rise to that point from the intersection `(x0, y0)` is `y1-y0` and the run is `x1-x0` so the slope, `b1` equals their ratio so `b1 = (y1-y0)/(x1-x0)` and rearranging gives `y1` in terms of the other variables. – G. Grothendieck Aug 19 '11 at 20:37
  • okay, that makes sense. one last thing. there is a slight problem. so you know how I want to specify the intersection point? this will always be between two whole numbers (such as xLow=3 and xHigh=4). but the two lines that are made take into consideration all points. how do I make it so that the first line only takes into consideration points whose x value are less than xLow and the second line only takes into consideration points whose x are greater than xHigh – CodeGuy Aug 19 '11 at 20:45
  • @CodeGuy. The first line is only fit to the ten (x1, y1) points and the second line is only fit to the ten (x2, y2) points. If you want to exclude some of these points just exclude them from x1, y1, x2 and y2. – G. Grothendieck Aug 19 '11 at 21:06
  • both slopes should always be positive. so I changed the starting value for b1 to positive 1 (instead of negative 1, as in your example), and now I'm getting an error – CodeGuy Aug 19 '11 at 21:21
  • do you have an email I can send you a picture? it doesn't seem to be using the right numbers. I think there is a mistake in that nls call. – CodeGuy Aug 19 '11 at 21:27
  • @CodeGuy. Pasting the code I posted into R gives a number near `x0 = 5` so its seems correct to me. You can edit the question and add the image to it but more importantly you need to think through what your question really is and how to express it since the question you have stated in the post has really been answered and elaborated. – G. Grothendieck Aug 19 '11 at 21:58
  • try these values: x1 <- c(1,2) y1 <- c(10,10) x2 <- c(3,4,5) y2 <- c(20,30,40) and let g = c(TRUE,TRUE,FALSE,FALSE,FALSE) and require the intersection to be between x=2 and x=3. In this case, the solution is incorrect. The coefficients m1=12.86, m2=-2.86, x0=3.0, and y0=31.43, do not come out as "looking" like the best solution at all. – CodeGuy Aug 19 '11 at 22:11
  • clearly, the first slope should be VERY flat in this case, as the first two points are the same. but your nls() is spitting out a slope of almost 13 – CodeGuy Aug 19 '11 at 22:21
  • @Gabor, +1 for patience. My code posted on the other question is very similar to yours ... – Ben Bolker Aug 19 '11 at 22:36
  • one more thing...how can i require that the slope of the 2nd line must be larger than the slope of the first line (the V that results from two lines must be concave) – CodeGuy Aug 20 '11 at 19:08
  • @CodeGuy, Replace `m2` with `(m1+delta)` so that the parameters become `m1`, `delta`, `x0` and `y0`. Then give `delta` a lower bound of zero in the nls call. – G. Grothendieck Aug 22 '11 at 04:20
  • +1!! This option also provides a straightforward way to obtain the standard error of the intercept coordinates by summarizing the NLS model! – Cris Nov 22 '21 at 08:49
3

I am a little surprised there isn't a built in function for this.

Here is a rudimentary function (for lm results), using the same general method as Tommy above. This uses the simple substitution method for two lines in the form "y=mx+b" to find the common intersection at y (y1=y2 ; m1*x + b1 = m2*x + b2) and solves for x:

Function definition

# Linear model Intercept function
lmIntx <- function(fit1, fit2, rnd=2) {
  b1<- fit1$coefficient[1]  #y-int for fit1
  m1<- fit1$coefficient[2]  #slope for fit1
  b2<- fit2$coefficient[1]  #y-int for fit2
  m2<- fit2$coefficient[2]  #slope for fit2
  if(m1==m2 & b1==b2) {print("Lines are identical")
  } else if(m1==m2 & b1 != b2) {print("Lines are parallel")
  } else {
    x <- (b2-b1)/(m1-m2)      #solved general equation for x
    y <- m1*x + b1            #plug in the result
    data.frame(x=round(x, rnd), y=round(y, rnd))
  }
}

Test:

line1 <- data.frame(x=c(0,1), y=c(0,2))
line2 <- data.frame(x=c(0,1), y=c(1,3))
line3 <- data.frame(x=c(0,1), y=c(1,5))

lmIntx(lm(line1$y~line1$x), lm(line2$y~line2$x))
[1] "Lines are parallel"


lmIntx(lm(line1$y~line1$x), lm(line1$y~line1$x))
[1] "Lines are identical"

lmIntx(lm(line1$y~line1$x), lm(line3$y~line3$x))
               x  y
(Intercept) -0.5 -1
Matt L.
  • 2,753
  • 13
  • 22