
I have a dataset of dates, company identifiers and a value of interest in a CSV file. Both the company identifier and the value are numeric. The data is in a flat file format, so the rows look like the following:

companyid date value

1111 09/14/1986 1234

1111 10/14/1986 5678

1111 11/14/1986 9012

In other words, I have time series in a flat file format. I would like to condense this data by constructing a time series object for each company. Then I would like to produce time series plots of certain quantiles of the values at each point in time, aggregated across all of the companies. Two other points: companyid/date pairs are unique, so there are no duplicates in the dataset, and the data is already sorted by companyid and then by date.

Here's what I've tried thus far:

% col 1 = companyid, col 2 = date, col 3 = value
[rows, cols] = size(data);
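distinct_comp = 1; % the first row always starts a company (assumes data is non-empty)
for ii = 2:rows    % start at 2 so that data(ii-1,1) is a valid index
    if data(ii, 1) ~= data(ii-1, 1)
        distinct_comp = distinct_comp + 1;
    end
end
disp(distinct_comp) % function syntax prints the value, not the text 'distinct_comp'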

%Create initial time series object and place data(1,3) and data(1,2) inside
for jj = 2:rows
    if data(jj,1)==data(jj-1,1)
        % Add data (jj,2) and data(jj, 3) to existing time series object
    else
        % Create new time series object and add data(jj,2) and data(jj,3)
    end
end
% disp number of time-series objects to check if same as distinct_comp
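
One possible loop-free version of the skeleton above is sketched here. It is a sketch under assumptions, not a tested solution for the full dataset: the toy `data` matrix is made up, column 2 is assumed to hold serial date numbers (as produced by `datenum`), and `quantile` is assumed to be available (in MATLAB it ships with the Statistics and Machine Learning Toolbox). The names `series`, `M`, `probs` and `qByDate` are illustrative. Plain cell arrays stand in for time series objects here.

```matlab
% Toy stand-in for the real matrix (made-up numbers):
% col 1 = companyid, col 2 = date as a serial date number, col 3 = value
data = [1111 datenum(1986,  9, 14) 1234
        1111 datenum(1986, 10, 14) 5678
        2222 datenum(1986,  9, 14) 1000
        2222 datenum(1986, 10, 14) 2000
        3333 datenum(1986,  9, 14) 3000];

% Map every row to a company index and a date index
[companies, ~, compIdx] = unique(data(:, 1));
[dates,     ~, dateIdx] = unique(data(:, 2));
numCompanies = numel(companies);   % same count as distinct_comp

% One cell per company holding its [date value] rows in the original order.
% Each cell could be handed to timeseries(values, times) if real objects are needed.
rowIds = (1:size(data, 1)).';
series = accumarray(compIdx, rowIds, [], @(r) {data(sort(r), 2:3)});

% Dates-by-companies matrix of values; NaN marks a missing company/date pair.
% Because companyid/date pairs are unique, the default sum keeps each value as is.
M = accumarray([dateIdx compIdx], data(:, 3), ...
               [numel(dates) numCompanies], [], NaN);

% Quantiles across companies for every date (NaN entries are treated as missing)
probs   = [0.25 0.50 0.75];
qByDate = quantile(M, probs, 2);   % one row per date, one column per probability

% One line per quantile over time
plot(dates, qByDate);
datetick('x');
legend('25%', '50%', '75%');
```

`M` is dense, so its size is the number of distinct dates times the number of companies. If that is too large for memory, the same quantiles could be computed one date at a time from `dateIdx` instead.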
user6893
  • How far did you get? Show us your code. – Daniel Feb 02 '14 at 14:16
  • @Daniel Added starting code. One other thing that I didn't mention before is that there are 2 million total rows so if anybody has a more efficient approach to get the quantile time series plots I would appreciate hearing about it. – user6893 Feb 02 '14 at 16:06

0 Answers