
I am looking for a method to fill data gaps in a time series that contains periodic data (here the periodicity is tidal, so both semi-diurnal and spring/neap frequencies) using MATLAB. The series also contains noise, which I would like to superimpose on the artificial data that fills the gap. The data has a trend that I want to preserve. Ideally, the method would use the recorded data on either side of the gap.

Is there any way of doing this in MATLAB?

Thank you.

Donald John

  • Are you talking about gaps on the order of minutes (or hours), or are several periods missing, i.e. on the order of days? Can you provide some data, or at least a plot of the data, so one can see the gaps? As it stands, your question is very broad and vague, which makes it unlikely that you will get a helpful answer to your problem. – Nras Oct 23 '14 at 12:48
  • Unfortunately I can't post a plot due to restrictions on my account (reputation related, as this is my first time using this forum). The data files are too large to post here; the sampling interval is 5 minutes. Approximately a week's worth of data is missing. If there is anything else I can add that would be helpful, please let me know. – user2653752 Oct 23 '14 at 13:05
  • As it stands I think this is too broad, but you're basically looking at defining some sort of model from your existing data - as something like a summation of a couple of different types of periodic variation + some long-term trend + short term noise. – nkjt Oct 23 '14 at 13:25
  • Thanks. Do you have any suggestions about how I could go about defining such a model? I am not sure how I can be more specific, other than by posting a plot of the data, however that isn't available to me at the moment. – user2653752 Oct 23 '14 at 13:36
  • You can upload the image to any image host you like and put the link here. The next of us to come by will embed it. – bdecaf Oct 23 '14 at 13:53
  • Thanks for the suggestion, the images can be found here: http://postimg.org/image/nrumxv6ez/ and here: http://postimg.org/image/5nzz6s59f/ – user2653752 Oct 23 '14 at 14:07

1 Answer


So what one can do is "guess" a model function and fit it to the data using an optimization routine. Then take a close look at the residuals and derive from them the statistics that characterize the noise. Finally, evaluate the model inside the gap and add noise with those statistics. In MATLAB code, the approach could look like this:

t_full = linspace(0,4*pi,500);
t = t_full([1:200, 400:end]);
f = 2;
A = 3;
D = 5;
periodic_signal = A*sin(t*f) + D;
trend = 0.2*t;
noise = randn(size(t));
y = periodic_signal + trend + noise;

% a model for the data (in this toy example the exact model is known)
model = @(par, t) par(1)*sin(t*par(2)) + par(3) + par(4)*t;
par0 = [2, 2, 2, 2]; % initial guess for [amplitude, frequency, offset, slope]
par_opt = nlinfit(t, y, model, par0); % nonlinear least-squares fit (Statistics Toolbox)

% estimate the noise characteristics from the residuals (data minus model)
residual = y - model(par_opt, t);

% compare the residuals with the "real" noise (they should coincide if the
% optimisation doesn't fail)
[mean(noise), mean(residual)] % about [0, 0]
[std(noise), std(residual)] % about [1, 1]

% draw new noise with the same mean and standard deviation for the gap
missing_data = 201:399;
new_noise = mean(residual) + std(residual)*randn(size(missing_data));

% show what is going on
figure
plot(t,y,'k.')
hold on
plot(t_full, model(par_opt, t_full), 'r-', 'linewidth', 2);
plot(t_full(missing_data), model(par_opt, t_full(missing_data)) + new_noise, 'r.')

legend('data', sprintf('y(t) = %.2f*sin(%.2f*t) + %.2f + %.2f*t + e(t)', par_opt), 'reconstructed data')

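It results in the following graphic:

[Figure: the data as black dots, the fitted model as a red line, and the reconstructed points inside the gap as red dots.]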
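For tidal data, the same idea can be sketched without a nonlinear optimizer, because the main tidal periods are known in advance and the model is then linear in its coefficients. The following is only a sketch under stated assumptions: it uses the M2 (12.4206 h) and S2 (12.00 h) constituents, whose beat produces the spring/neap cycle of roughly 14.8 days, a straight-line trend, and white Gaussian noise. The synthetic record, the gap position, and all variable names are hypothetical stand-ins for the real measurements.

```matlab
% Synthetic stand-in for the record: 5-minute samples over 60 days.
% All amplitudes, phases, and the trend are made-up illustration values.
rng(0);                                   % reproducible noise
dt     = 5/60;                            % sampling interval in hours
t_full = (0:dt:60*24)';                   % time axis in hours (column vector)
T      = [12.4206, 12.0];                 % assumed M2 and S2 periods in hours
w      = 2*pi ./ T;                       % angular frequencies in rad/hour

y_full = 1.0*cos(w(1)*t_full - 0.3) + 0.4*cos(w(2)*t_full + 1.0) ...
       + 2 - 0.001*t_full + 0.1*randn(size(t_full));

% Remove one week of samples to mimic the gap.
gap = t_full >= 25*24 & t_full < 32*24;
t   = t_full(~gap);
y   = y_full(~gap);

% Design matrix: offset, linear trend, and one cos/sin pair per constituent.
% Column order: [1, t, cos(w1 t), cos(w2 t), sin(w1 t), sin(w2 t)].
design = @(tt) [ones(size(tt)), tt, cos(tt*w), sin(tt*w)];

% Ordinary least squares using the data on both sides of the gap.
coef = design(t) \ y;

% Noise statistics from the residuals (data minus model).
residual = y - design(t)*coef;

% Model prediction inside the gap plus freshly drawn noise.
t_gap = t_full(gap);
y_gap = design(t_gap)*coef + mean(residual) + std(residual)*randn(size(t_gap));

% Reassemble the record: measured values outside the gap, synthetic inside.
y_filled       = nan(size(t_full));
y_filled(~gap) = y;
y_filled(gap)  = y_gap;

figure
plot(t, y, 'k.')
hold on
plot(t_gap, y_gap, 'r.')
legend('data', 'reconstructed data')
```

Fitting a cosine and a sine term per constituent avoids having to estimate phases or frequencies, so no starting guess is needed and the backslash operator is enough. If the residuals turn out to be correlated in time or the trend is not linear, which may well be the case for real data, the white-noise and straight-line assumptions here would need to be replaced.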

Nras
  • Thanks, this is exactly the type of thing I am looking for. I will try and apply something similar to my data. Thanks again for your help. – user2653752 Oct 23 '14 at 15:41
  • Just one other question: is there any way to do something similar using interpolation or extrapolation as the starting point? – user2653752 Oct 23 '14 at 15:55
  • @user2653752 I'm afraid I don't understand that question. But looking at your data (btw: you should edit the links in the question) I feel like this method will fail as behaviour from Oct to Nov is quite different than from Mar to Oct. You do not seem to even have 1 full period, right? Not sure if a model will make it. Maybe cut off some data from the beginning and only use data from Sep (or Oct) to Nov could help. – Nras Oct 23 '14 at 16:01
  • Sorry, I am feeling around in the dark at the moment with this. I think the best place to start is with your method. You have a point regarding the change in behaviour; however, it is fair to assume that the trend is downwards after October, with the tidal periods and noise superimposed on this downward trend. Finally, having done a little reading, I have come across the Kalman filter. Is there a possibility that a Kalman filter might do the job? If so, how can I go about implementing a Kalman filter in MATLAB? – user2653752 Oct 23 '14 at 16:27