I am experimenting with an open-source codebase based on learned indexes for DBMS indexing. Each index is treated as a model that predicts the position of a key in the dataset. I am trying to fit a quadratic regression model in place of the linear regression. For a linear regression function of the form y = ax + b, the spline connects the points (x_max, y_max) and (x_min, y_min), where x is the key value and y is the predicted position. The CDF of the key positions lies in the range [0, 1].
// Fitting the linear spline
`if (mdl_->a_ <= 0) {`
`mdl_->a_ = (y_max_ - y_min_) / (x_max_ - x_min_);`
`mdl_->b_ = -static_cast<double>(x_min_) * mdl_->a_ + y_min_;`
`}`
// Temporary model in the CDF range [0, 1]
`T min_key = values[0].first;`
`T max_key = values[num_keys - 1].first;`
`root_->mdl_.a_ = 1.0 / (max_key - min_key);`
`root_->mdl_.b_ = -1.0 * min_key * root_->mdl_.a_;`
I suppose that in the temporary model, y_min is fitted to 0 and y_max to 1. I would like to find the coefficients of a quadratic function by fitting a spline. Could someone please tell me how I could fit a quadratic spline using two points (the min and max points)?
I tried using the quadratic spline interpolation formula. The predicted positions fit the parabola if the key values (x) are given sequentially. If the key values are given in random order, the coefficient c in ax² + bx + c becomes very large, which results in a segmentation fault, and the predicted positions don't fit the quadratic curve.