I'll start by saying that I'm something of a MATLAB newbie, so I apologise if this question has been asked before; I have been unable to find the answer I need.
I'm working on code to process some large rainfall radar files (5-20 GB). I have arranged the code to read the data in 24-hour blocks and then remove some unnecessary points, as below:
clear all; close all; clc;
fileID=fopen('Selkirk_15min_2008.csv');
% Open relevant Hyrad Rainfall Radar Output File.
C_Header=textscan(fileID,'%s %s %s %s %s %s %s %s',12,'delimiter',',');
% Read the file's header section - first 12 rows in standard output file.
C_GridDataCheck=textscan(fileID,'%s %s %d',1230,'delimiter',',');
% Read the data coverage section of the file - number of rows eg 1230 is
% equal to the number of grid points used.
C_DataHeader=textscan(fileID,'%s %s %s %s %s %s %s %s',1,'delimiter',',');
% Read the column headers of the main data array.
formatSpec='%s %s %s %s %s %s %s %s';
N=118080;
C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% Load the first 118,080 rows of data. 1230 rows = one 15-minute timestep
% across the whole catchment; 1230 x 4 = 4,920 rows per hour and
% 4,920 x 24 = 118,080 rows per day. Hence this sample size is equal to
% one 24 hour period.
C_Day1=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});
% Combine data arrays created by textscan into one matrix
load('SelkirkNRP.mat')
% % Load list of non-relevant grid numbers for use.
C_Day1(any(ismember(C_Day1(:,1),SelkirkNRP),2),:)=[];
% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows
disp('24 Hour period loaded and non-relevant grid points removed')
% ------------------SECOND 24 HOUR PERIOD--------------------------------
C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% NB - 2000 Days between March 1st, 2008 and August 22nd, 2013
% No of Obs Per day = 118,080
% No of Obs in Dataset = 236,160,000 (rows - each row is 8 cells)
% Load the next 118,080 lines of data
C_Day2=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});
%Combine data arrays created by textscan into one matrix
C_Day2(any(ismember(C_Day2(:,1),SelkirkNRP),2),:)=[];
% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows
disp('24 Hour period loaded and non-relevant grid points removed')
%------------------THIRD 24 HOUR PERIOD-----------------------------------
C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% Load the next 118,080 lines of data
C_Day3=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});
% Combine data arrays created by textscan into one matrix
C_Day3(any(ismember(C_Day3(:,1),SelkirkNRP),2),:)=[];
% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows
disp('24 Hour period loaded and non-relevant grid points removed')
I hope this code makes clear what I'm trying to achieve. If not, feel free to ask.
Essentially, what I need is to make this section of the code:
% ------------------SECOND 24 HOUR PERIOD--------------------------------
C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% NB - 2000 Days between March 1st, 2008 and August 22nd, 2013
% No of Obs Per day = 118,080
% No of Obs in Dataset = 236,160,000 (rows - each row is 8 cells)
% Load the next 118,080 lines of data
C_Day2=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});
%Combine data arrays created by textscan into one matrix
C_Day2(any(ismember(C_Day2(:,1),SelkirkNRP),2),:)=[];
% % Check first column against list of non-relevant grid points and remove
% % all non-relevant rows
disp('24 Hour period loaded and non-relevant grid points removed')
repeat until the entire file has been processed. I think this should be about 2000 iterations. I also know this code is rough and ready at the moment and not all that elegant, so any helpful comments for a newbie will be gratefully received.
Hoping you can help.
Best
Sam
Update:
After some digging, this issue was solved and simplified with a for loop; example code below.
clear all; close all; clc;
fileID=fopen('Selkirk_15min_2008.csv');
% Open relevant Hyrad Rainfall Radar Output File.
C_Header=textscan(fileID,'%s %s %s %s %s %s %s %s',12,'delimiter',',');
% Read the file's header section - first 12 rows in standard output file.
C_GridDataCheck=textscan(fileID,'%s %s %d',1230,'delimiter',',');
% Read the data coverage section of the file - number of rows eg 1230 is
% equal to the number of grid points used.
C_DataHeader=textscan(fileID,'%s %s %s %s %s %s %s %s',1,'delimiter',',');
% Read the column headers of the main data array.
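formatSpec='%s %s %s %s %s %s %s %s';
% Read every column as text: horzcat below needs all eight columns to be
% cell arrays, and str2double then converts the numeric fields.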
N=118080;
load('SelkirkNRP.mat')
% Load list of non-relevant grid numbers for use.
for i=1:1 % One iteration for testing; raise the upper bound (about 2000 here) to process the whole file.
C_Read=textscan(fileID,formatSpec,N,'delimiter',',');
% Load the next 118,080 rows of data. 1230 rows = one 15-minute timestep
% across the whole catchment; 1230 x 4 = 4,920 rows per hour and
% 4,920 x 24 = 118,080 rows per day. Hence this sample size is equal to
% one 24 hour period.
C_Data=horzcat(C_Read{1,1},C_Read{1,2},C_Read{1,3},C_Read{1,4},C_Read{1,5},C_Read{1,6},C_Read{1,7},C_Read{1,8});
% Combine data arrays created by textscan into one matrix
C_Data(any(ismember(C_Data(:,1),SelkirkNRP),2),:)=[];
% Check first column against list of non-relevant grid points and remove
% all non-relevant rows
C_DataMatrix=str2double(C_Data);
% Convert Cell Array to Matrix for writing to CSV
csvwrite(['Day ' num2str(i) ' Data.csv'], C_DataMatrix)
%Write to CSV
end
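For anyone who does not know the number of 24-hour blocks in advance, a minimal sketch of the same idea that keeps reading until the file runs out might look like the following. It builds a tiny stand-in file so that it can be run on its own; the file names, the three-column layout, the `nonRelevant` list and the block size of 6 are all made up for illustration and are not taken from the Hyrad output.

```matlab
% Sketch: process a CSV in fixed-size blocks until the end of the file.
% All file names, sizes and grid IDs here are hypothetical.

% --- Build a small stand-in data file ---
inFile = fullfile(tempdir, 'demo_radar.csv');
fid = fopen(inFile, 'w');
gridIDs = {'101', '102', '103'};          % three grid points per timestep
for t = 1:4                               % four timesteps in total
    for g = 1:numel(gridIDs)
        fprintf(fid, '%s,%d,%.1f\n', gridIDs{g}, t, t + g/10);
    end
end
fclose(fid);

nonRelevant = {'102'};    % stand-in for the SelkirkNRP list
N = 6;                    % rows per block (two timesteps in this demo)
formatSpec = '%s %s %s';  % read every column as text

% --- Read, filter and write one block at a time ---
fid = fopen(inFile, 'r');
block = 0;
rowsKept = [];
while ~feof(fid)
    C_Read = textscan(fid, formatSpec, N, 'Delimiter', ',');
    if isempty(C_Read{1})
        break;            % nothing left to read
    end
    block = block + 1;
    C_Data = [C_Read{:}];                                  % cell array, one row per record
    C_Data(ismember(C_Data(:,1), nonRelevant), :) = [];    % drop unwanted grid points
    M = str2double(C_Data);                                % numeric matrix for writing
    outFile = fullfile(tempdir, sprintf('Day_%d_Data.csv', block));
    csvwrite(outFile, M);
    rowsKept(end+1) = size(M, 1);                          % rows written for this block
end
fclose(fid);
```

The `isempty` check is there because `feof` may only report the end of the file after a read has actually hit it, so the loop can be entered one extra time with nothing left to read. `csvwrite` is kept to match the code above; newer MATLAB releases recommend `writematrix` instead.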