
I've been trying to solve a problem using Lagrange interpolation, which is implemented by the poly.calc function (polynom package) in R.

Basically, my problem is to predict the population of a certain country using Lagrange interpolation. I have the population for past years (1961-2014). The csv file is here

library(polynom)

w1 = read.csv(file="country.csv", sep=",", header=TRUE)
array_x = w1$x
array_y = w1$y

#compute the Lagrange interpolating polynomial
p = poly.calc(array_x, array_y)

#create a function to evaluate the polynomial
prf <- as.function(p)
#create some points to plot
myx = seq(1961, 2020, 0.5)
#y's to plot
myy = prf(myx)
#plot
plot(myx, myy, col='blue')

After that, the plotted curve declines and the y-axis values are hugely negative (on the order of 10^134), which does not make sense. However, if I use only about five points, the result looks correct.

horseoftheyear
Emanuel

1 Answer


This is not really an SO question but rather a numerical analysis question.

R is doing exactly what you asked it to; this is not a programming error. The problem is that the method itself behaves badly here. High-degree Lagrange interpolation is notoriously unstable, especially when a large number of points are fit.
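To make the instability concrete, here is a minimal sketch in base R. It does not use the asker's CSV: the years, the linear trend, and the alternating ±0.5 perturbation are made-up assumptions, and lagrange_eval and max_dev are hypothetical helper names. The alternating sign is deliberately close to a worst case for equally spaced nodes, so the effect shows up deterministically.

```r
# Evaluate the Lagrange interpolant through (x_nodes, y_nodes) at x_new
# using the textbook product formula (base R only, no polynom needed).
lagrange_eval <- function(x_nodes, y_nodes, x_new) {
  sapply(x_new, function(x) {
    basis <- sapply(seq_along(x_nodes), function(i) {
      prod((x - x_nodes[-i]) / (x_nodes[i] - x_nodes[-i]))
    })
    sum(y_nodes * basis)
  })
}

# Synthetic data (an assumption, not the real population series):
# a linear trend plus a perturbation of +/-0.5 with alternating sign.
years <- as.numeric(1961:2014)
trend <- function(x) 100 + 2 * (x - 1961)
y <- trend(years) + 0.5 * (-1)^seq_along(years)

# Largest deviation from the trend, measured halfway between
# consecutive nodes of the chosen subset.
max_dev <- function(idx) {
  xs <- years[idx]
  mid <- (xs[-1] + xs[-length(xs)]) / 2
  max(abs(lagrange_eval(xs, y[idx], mid) - trend(mid)))
}

dev_5  <- max_dev(seq(1, 53, by = 13))  # 5 nodes: 1961, 1974, ..., 2013
dev_54 <- max_dev(seq_along(years))     # all 54 yearly nodes
print(c(five_nodes = dev_5, all_nodes = dev_54))
```

With five nodes the interpolant stays close to the trend, while with all 54 nodes the same ±0.5 perturbation is amplified by many orders of magnitude between the nodes. Expanding the polynomial into powers of x, as poly.calc does, can add further rounding error on top of this when x is around 2000.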

A much more stable alternative is to use splines, such as B-splines. With the splines package that ships with R, they can be included in any regression model very easily; e.g., you could fit a least squares model with

library(splines)
x <- sort(runif(500, -3,3) ) #sorting makes for easier plotting ahead
y <- sin(x)
splineFit <- lm(y ~ bs(x, df = 5) )
est_y <- predict(splineFit)
plot(x, y, type = 'l')
lines(x, est_y, col = 'blue')

You can see from the above model that the splines can do a good job of fitting non-linear relations.
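The same idea can be sketched for yearly data like the asker's. This is only an illustration under stated assumptions: the data frame country and its exponential growth values are synthetic stand-ins for the CSV, and ns() (a natural cubic spline basis from the same splines package) is swapped in for bs() because it continues linearly beyond the range of the data, which is gentler when predicting a few years ahead. Any extrapolation remains a rough guess.

```r
library(splines)

# Synthetic yearly series (hypothetical; stands in for the asker's CSV)
country <- data.frame(x = 1961:2014)
country$y <- 1e6 * exp(0.015 * (country$x - 1961))

# Least squares fit on a natural cubic spline basis with 4 degrees of freedom
fit <- lm(y ~ ns(x, df = 4), data = country)

# Predict a few years past the data; beyond the last year the
# natural spline is a straight line
future <- data.frame(x = 2015:2020)
future$y_hat <- predict(fit, newdata = future)
print(future)
```

The choice of df = 4 is arbitrary here; a larger value follows the data more closely but makes the extrapolated slope more sensitive to the last few observations.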

Cliff AB