I'm trying to read in a bunch of text files, each of which has a date column. In some files the date is formatted as DD-MMM-YYYY, while in others it's DD-MM-YYYY. My code is set up to read the first style, so when it runs into a file of the second type, `textscan` throws an error and the code stops. How can I do something like: if the `textscan` doesn't work, try this second way?

for n = 1:length(data1{id})
    fname1 = char(data1{id}(n));
    delimiter = '\t';
    startRow = 2;
    formatSpec = '%s%f%f%f%s%s%s%s%{dd-MMM-yyyy}D%s%s%f%f%f%f%f%f%s%s%s%s%s%s%s%s%f%f%[^\n\r]';
    fileID = fopen(fname1,'r');
    dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'EmptyValue' ,NaN,'HeaderLines' ,startRow-1, 'ReturnOnError', false, 'EndOfLine', '\r\n');
    fclose(fileID); % Close the text file.
    PM25_1{id}{n} = table(dataArray{1:end-1}, 'VariableNames', {'MonitorID','POC','Latitude','Longitude','Datum','ParameterName','SampleDuration','PollutantStandard','DateLocal','UnitsofMeasure','EventType','ObservationCount','ObservationPercent','ArithmeticMean','FirstMaxValue','FirstMaxHour','AQI','MethodName','LocalSiteName','Address','StateName','CountyName','CityName','CBSAName','DateofLastChange','DateNum','NumberOfPOCs'});
    clearvars filename delimiter startRow formatSpec fileID dataArray ans;
end
Adriaan
SugaKookie

1 Answer

try

for n = 1:length(data1{id})
    fname1 = char(data1{id}(n));
    delimiter = '\t';
    startRow = 2;
    formatSpec = '%s%f%f%f%s%s%s%s%{dd-MMM-yyyy}D%s%s%f%f%f%f%f%f%s%s%s%s%s%s%s%s%f%f%[^\n\r]';
    fileID = fopen(fname1,'r');
    dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'EmptyValue' ,NaN,'HeaderLines' ,startRow-1, 'ReturnOnError', false, 'EndOfLine', '\r\n');
    fclose(fileID); % Close the text file.
    PM25_1{id}{n} = table(dataArray{1:end-1}, 'VariableNames', {'MonitorID','POC','Latitude','Longitude','Datum','ParameterName','SampleDuration','PollutantStandard','DateLocal','UnitsofMeasure','EventType','ObservationCount','ObservationPercent','ArithmeticMean','FirstMaxValue','FirstMaxHour','AQI','MethodName','LocalSiteName','Address','StateName','CountyName','CityName','CBSAName','DateofLastChange','DateNum','NumberOfPOCs'});
    clearvars filename delimiter startRow formatSpec fileID dataArray ans;
end

catch

for n = 1:length(data1{id})
    fname1 = char(data1{id}(n));
    delimiter = '\t';
    startRow = 2;
    formatSpec = '%s%f%f%f%s%s%s%s%{dd-MM-yyyy}D%s%s%f%f%f%f%f%f%s%s%s%s%s%s%s%s%f%f%[^\n\r]';
    fileID = fopen(fname1,'r');
    dataArray = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'EmptyValue' ,NaN,'HeaderLines' ,startRow-1, 'ReturnOnError', false, 'EndOfLine', '\r\n');
    fclose(fileID); % Close the text file.
    PM25_1{id}{n} = table(dataArray{1:end-1}, 'VariableNames', {'MonitorID','POC','Latitude','Longitude','Datum','ParameterName','SampleDuration','PollutantStandard','DateLocal','UnitsofMeasure','EventType','ObservationCount','ObservationPercent','ArithmeticMean','FirstMaxValue','FirstMaxHour','AQI','MethodName','LocalSiteName','Address','StateName','CountyName','CityName','CBSAName','DateofLastChange','DateNum','NumberOfPOCs'});
    clearvars filename delimiter startRow formatSpec fileID dataArray ans;
end

end

Wrap everything in a try/catch block. If reading with the first format fails, the catch part retries with the second one (note that I changed the date format in the catch part). If you have more than two possible formats, you'll want to check each one with something like an if/else clause.
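
If the two formats can be mixed within the same list of files, or there are more than two candidates, the same try/catch idea can be applied per file instead of around the whole loop. The following is only a sketch under that assumption: `readWithDateFormats` is a hypothetical helper, and `DATEFMT` is a made-up placeholder inside the format string, not a `textscan` feature. It reopens nothing between attempts; it rewinds the file and lets `onCleanup` close it, so the file is not left open when an attempt fails.

```matlab
function dataArray = readWithDateFormats(fname, specTemplate, dateFormats)
% Hypothetical helper: read one file, trying each candidate date format in turn.
% specTemplate is a textscan format string in which the literal text DATEFMT
% marks where the date format goes, e.g. '%s%{DATEFMT}D%f'.
fileID = fopen(fname, 'r');
if fileID == -1
    error('readWithDateFormats:cannotOpen', 'Could not open %s.', fname);
end
closeFile = onCleanup(@() fclose(fileID));  % closes the file on return or error

for k = 1:numel(dateFormats)
    frewind(fileID);                         % start again from the top of the file
    formatSpec = strrep(specTemplate, 'DATEFMT', dateFormats{k});
    try
        dataArray = textscan(fileID, formatSpec, 'Delimiter', '\t', ...
            'EmptyValue', NaN, 'HeaderLines', 1, 'ReturnOnError', false);
        return                               % this format worked
    catch
        % This format did not fit the file; try the next candidate.
    end
end
error('readWithDateFormats:noMatch', ...
    'None of the candidate date formats matched %s.', fname);
end
```

Inside the loop from the question, the call could look like `dataArray = readWithDateFormats(fname1, specTemplate, {'dd-MMM-yyyy','dd-MM-yyyy'});`, where `specTemplate` is the original `formatSpec` with `%{DATEFMT}D` in place of `%{dd-MMM-yyyy}D`. The sketch leaves out the `'EndOfLine'` option from the original call; add it back if your files need it.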

Adriaan
  • Thanks. This works. How can I check the format of the date column with an `if/else` clause before I read it in? – SugaKookie Feb 27 '17 at 15:38
  • @shizishan you can't beforehand. I suggest reading the second line only, extract the date string, then determine its form. Save that form as a string, then read the whole file with the specified format. – Adriaan Feb 27 '17 at 15:41
  • How do I determine the form of the date string without having to look at it myself? – SugaKookie Feb 27 '17 at 16:06
  • @shizishan check for patterns: 3 letters -> must be `MMM`. 2 digits, 2 digits, 4 digits -> either `dd-MM-yyyy` or `MM-dd-yyyy`, so check whether either of the first two numbers is larger than 12; the one that is must be the day. It's too broad a question to answer here, though, so try it yourself first, and if you run into problems, ask a new question. – Adriaan Feb 27 '17 at 16:12
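
The approach described in the comments above (read only the second line, pull out the date string, then decide its form) could be sketched roughly as follows. Both function names are hypothetical, and the sketch assumes a single header line, tab-separated fields, and `-` as the date separator. When both numbers are 12 or less the string is ambiguous; the sketch then assumes day-first, because that is the format named in the question.

```matlab
function fmt = guessDateFormat(dateStr)
% Hypothetical helper: guess a datetime format from one date string,
% e.g. '05-Jan-2017' or '27-02-2017'.
parts = strsplit(strtrim(dateStr), '-');
if numel(parts) ~= 3
    error('guessDateFormat:unexpected', 'Unexpected date string: %s', dateStr);
end
if ~isempty(regexp(parts{2}, '^[A-Za-z]{3}$', 'once'))
    fmt = 'dd-MMM-yyyy';      % three letters in the middle: month abbreviation
elseif str2double(parts{1}) > 12
    fmt = 'dd-MM-yyyy';       % first number cannot be a month
elseif str2double(parts{2}) > 12
    fmt = 'MM-dd-yyyy';       % second number cannot be a month
else
    fmt = 'dd-MM-yyyy';       % ambiguous: ASSUMED day-first, as in the question
end
end

function dateStr = firstDateString(fname, col)
% Hypothetical helper: return field number col of the first data row.
fileID = fopen(fname, 'r');
if fileID == -1
    error('firstDateString:cannotOpen', 'Could not open %s.', fname);
end
closeFile = onCleanup(@() fclose(fileID));  % closes the file on return or error
fgetl(fileID);                              % skip the header line
secondLine = fgetl(fileID);                 % first data row
if ~ischar(secondLine)
    error('firstDateString:noData', 'No data rows in %s.', fname);
end
fields = regexp(secondLine, '\t', 'split'); % keeps empty fields in place
dateStr = fields{col};
end
```

In the question's file layout the date is the ninth column, so `fmt = guessDateFormat(firstDateString(fname1, 9));` would give the format to put between `%{` and `}D` when building `formatSpec` for the full read.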