
Imagine a data set with given x-values (a column vector) and several sets of y-values combined in a matrix (one column per set). Some of the values in the matrix are not available:

%% Create the test data
N = 1e2; % Number of x-values

x = 2*sort(rand(N, 1))-1;
Y = [x.^2, x.^3, x.^4, x.^5, x.^6]; % Example values
Y(50:80, 4) = NaN(31, 1); % Some values are not available

Now I have a column vector of new x-values for interpolation.

K = 1e2; % Number of interpolation values
x_i = rand(K, 1);

My goal is to find a fast way to interpolate all y-values at the given x_i values. Where the y-values contain NaN, I want to use the last valid y-value before the missing data instead. In the example case this would be the data in Y(49, :).

If I use interp1, I get NaN-values and the execution is slow for large x and x_i:

starttime = cputime;
Y_i1 = interp1(x, Y, x_i);
executiontime1 = cputime - starttime

An alternative is interp1q, which is about twice as fast.

What is a very fast way to do this that also allows the NaN handling described above?

Possible ideas:

  1. Do postprocessing of Y_i1 to eliminate NaN-values.
  2. Use a combination of a loop and the find-command to always use the neighbour without interpolation.
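A minimal sketch of idea 1, not a tuned solution: interpolate linearly as before, then replace every NaN in the result with the last valid y-value to the left of that x_i. The names Y_ff, k, inside, Y_prev and gap are made up for this illustration, and the fixed seed via rng(0) is only there to make the run repeatable. The sketch assumes x is strictly increasing and uses only interp1, cumsum and logical indexing. The forward-fill loop depends only on x and Y, so it can be done once if the same data is interpolated many times.

```matlab
% Test data as in the question (fixed seed is an assumption for repeatability)
rng(0);
N = 1e2;
x = 2*sort(rand(N, 1)) - 1;
Y = [x.^2, x.^3, x.^4, x.^5, x.^6];
Y(50:80, 4) = NaN;
x_i = rand(1e2, 1);

% Step 1 (once per data set): forward-fill NaNs column by column, so that
% Y_ff(r, c) holds the last valid value at or before row r.
Y_ff = Y;
for c = 1:size(Y, 2)
    valid = ~isnan(Y(:, c));
    last = cumsum(valid);          % running count of valid rows so far
    vals = Y(valid, c);            % valid values in order
    has_prev = last > 0;           % rows with at least one valid value before
    Y_ff(has_prev, c) = vals(last(has_prev));
end

% Step 2: ordinary linear interpolation; yields NaN next to missing samples
Y_i = interp1(x, Y, x_i);

% Step 3: index of the left neighbour of each x_i (NaN outside the x range)
k = floor(interp1(x, (1:N)', x_i));
inside = ~isnan(k);

% Step 4: overwrite NaNs with the forward-filled value at the left neighbour
Y_prev = NaN(size(Y_i));
Y_prev(inside, :) = Y_ff(k(inside), :);
gap = isnan(Y_i) & ~isnan(Y_prev);
Y_i(gap) = Y_prev(gap);
```

Any x_i outside the range of x still comes back as NaN, exactly as with plain interp1. Whether this is faster than a loop with find would have to be measured on the real data sizes.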
Lukas
  • Have you thought about using k-nearest-neighbours imputation to fill in the missing fields? There is a MATLAB function for it, http://www.mathworks.com/help/toolbox/bioinfo/ref/knnimpute.html, but it's in the Bioinformatics Toolbox :/ It isn't such a difficult algorithm to implement, though. – Dan Aug 22 '12 at 09:10
  • From what I understand, the input of interp1 should not contain NaNs. Try something like `Y_i1 = interp1(x(~isnan(Y)), Y(~isnan(Y)), x_i);` Probably better done column by column. – bdecaf Aug 22 '12 at 09:28
  • @Dan: Without a special distance measure, your general idea is to complete the data before doing further linear interpolation. Right? I think this is quite a nice idea, because I do the interpolation very often with the same underlying data. @bdecaf: Just eliminating the `NaN`s leads to strange interpolated values between the last valid value before the `NaN`s and the first valid value after them. Thus, this does not yield the last valid value before the `NaN`s but a mixture of both. – Lukas Aug 22 '12 at 10:31

1 Answer


Using interp1 with the spline method ('spline') ignores NaNs.

AGS
  • Using `spline` is a good idea, although it feels slower than linear interpolation. After the interpolation I want to apply different calculations depending on the `x_i` value. Is there an alternative to looping through all elements with an if-statement for each one? – Lukas Aug 22 '12 at 11:46