0

I have trade off curves generated from a simulation where the points are not entirely smooth. I'd like to fit a spline and calculate the max. Due to the jagged nature of the points, especially near the peak, it doesn't make sense to fit a spline to the entire curve. Instead, I would like to discard most (not all) uninformative parts of the curve (where points are not uniformly increasing/decreasing) so I can fit a better spline.

Here are 3 example curves.

structure(list(x = c(0.01, 0.0615789473684211, 0.113157894736842, 
0.164736842105263, 0.216315789473684, 0.267894736842105, 0.319473684210526, 
0.371052631578947, 0.422631578947368, 0.474210526315789, 0.52578947368421, 
0.577368421052632, 0.628947368421053, 0.680526315789474, 0.732105263157895, 
0.783684210526316, 0.835263157894737, 0.886842105263158, 0.938421052631579, 
0.99), y = c(0.67455086336912, 0.798345993110331, 0.842040269562323, 
0.876826597424532, 0.903925678739277, 0.922394814941881, 0.937688904117379, 
0.952464844120848, 0.959010185199922, 0.960880828342136, 0.968131047741576, 
0.967998560989975, 0.963479216138144, 0.947228114551568, 0.930237609460854, 
0.900655040504384, 0.849692079994379, 0.793895388319592, 0.715064370596625, 
0.586048018267692)), .Names = c("x", "y"), row.names = c(NA, 
-20L), class = "data.frame")


structure(list(x = c(0.01, 0.0615789473684211, 0.113157894736842, 
0.164736842105263, 0.216315789473684, 0.267894736842105, 0.319473684210526, 
0.371052631578947, 0.422631578947368, 0.474210526315789, 0.52578947368421, 
0.577368421052632, 0.628947368421053, 0.680526315789474, 0.732105263157895, 
0.783684210526316, 0.835263157894737, 0.886842105263158, 0.938421052631579, 
0.99), y = c(0.471958752798607, 0.703622592901373, 0.75810009343474, 
0.804310552037636, 0.833980459996575, 0.858824563652134, 0.880427517647614, 
0.898865451764142, 0.91097315832838, 0.926239088224158, 0.933622984262967, 
0.940626455467701, 0.944787886406163, 0.940076642124325, 0.938416888413051, 
0.918335717622263, 0.886320360986925, 0.84476577117581, 0.782884400269357, 
0.680906832291339)), .Names = c("x", "y"), row.names = c(NA, 
-20L), class = "data.frame")


structure(list(x = c(0.01, 0.0615789473684211, 0.113157894736842, 
0.164736842105263, 0.216315789473684, 0.267894736842105, 0.319473684210526, 
0.371052631578947, 0.422631578947368, 0.474210526315789, 0.52578947368421, 
0.577368421052632, 0.628947368421053, 0.680526315789474, 0.732105263157895, 
0.783684210526316, 0.835263157894737, 0.886842105263158, 0.938421052631579, 
0.99), y = c(0.92758700570137, 1.01008397567002, 1.06088676091004, 
1.10000024909512, 1.13048775170492, 1.15176599158524, 1.17517328999588, 
1.17489937140761, 1.22188964933233, 1.2073639239554, 1.1686005354797, 
1.15073969246578, 1.22855065420885, 1.30756438989448, 1.27583479809645, 
1.22221301838554, 1.15463640322463, 1.08629005387455, 1.00778694924309, 
0.917029835140733)), .Names = c("x", "y"), row.names = c(NA, 
-20L), class = "data.frame")

Here is my spline function.

library(mgcv)
cspline <- function(data) {
m <- data$x
lambda <- data$y
m1 <- mgcv::gam(lambda ~ s(m, fx = FALSE, k=-1, bs = "cr"))
predm <- predict(m1, data.frame(m = seq(0.01, 0.99, length = 100000)), se=TRUE)$fit
pm <- seq(0.01, 0.99, length = 100000)
maxy <- max(predm)
maxx <- pm[which(predm == max(predm))]
return(list(maxx, maxy))
}

Can anyone suggest an algorithm or ideas that would allow me to focus on the part of the curve where there is more variation in points and just fit a spline to those (and discarding the other points on either side of the peak that don't contribute more information)?

Zheyuan Li
  • 71,365
  • 17
  • 180
  • 248
Maiasaura
  • 32,226
  • 27
  • 104
  • 108
  • One idea I have is to start at the peak and go on either side of it till I reach 3-4 points that are decreasing. Then just subset that part of the curve and pass it to my spline function. Doesn't seem to be the most efficient way. – Maiasaura Oct 01 '12 at 17:28
  • The spline will fit locally - surely you just want to identify local minima and maxima of the fitted splines? – mnel Oct 02 '12 at 00:44

0 Answers0