
I am sorry, but I am really stuck with textscan and its format string. I have tried various formats to extract the time, lat, and lon from the following text:

    filename = 'A20020817_0610.20130725153026.L2.11479-3186.084800.0000.nc'

I want to extract the time from the first part of the filename above: '20020817_0610' as (YYYYMMDD HH:MM).

Then I want the lon and lat: lon '11479' as (114.79) and lat '-3186' as (-31.86).

So far I have the following format, but it does not work:

   format_filename = '%*1n%8f%*1n%4f%*18n%5f%4f%*\n';
   read_filename = textscan(filename,format_filename);

It gives me empty doubles. What am I doing wrong?

Thank you so much for your help!!

sophie

2 Answers


There were some mistakes in your format string. Please refer to this link to understand how to use the format string correctly. http://www.mathworks.com/help/matlab/ref/textscan.html

filename = 'A20020817_1610.20130725153026.L2.11479-3186.084800.0000.nc';

format_filename = '%*1s %8d %*1s %4d %*19s %5d %*1s %4d %*[^\n]';
data = textscan(filename,format_filename);

data = 

[20020817]    [1610]    [11479]    [3186]

You will have to convert the integers 11479 and 3186 into the desired decimal values yourself (not too hard, just divide by 100). I don't think MATLAB will take a string of digits such as 11479 and insert a decimal point just because you've put %5.2f in the corresponding location of your format string. After all, the format string describes what MATLAB should expect to read, not how the result should be displayed.
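One detail worth watching when dividing by 100: textscan returns `%d` fields as `int32`, and integer division in MATLAB rounds to the nearest integer. A minimal sketch of the conversion, using hard-coded stand-ins for the values read from the filename (the variable names are illustrative, not from the original code):

```matlab
% Stand-ins for the int32 values that a %d conversion would return
raw_lon = int32(11479);
raw_lat = int32(-3186);

% Dividing an int32 directly rounds to an integer, losing the fraction
lon_wrong = raw_lon / 100;

% Cast to double first so the fractional part is kept
lon = double(raw_lon) / 100;
lat = double(raw_lat) / 100;
```

Alternatively, reading the fields with `%f` instead of `%d` returns doubles in the first place, so the cast is not needed.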

EDIT: The following will read the time as a string and include the sign of latitude value as Eitan suggested.

format_filename = '%*1s %8d %*1s %4s %*19s %5d %5d %*[^\n]';
Falimond
  • It would be a better idea to extract the date and time stamp as strings. For example, if the time is a minute past midnight, you'd want to see `'0001'` instead of just `1`, wouldn't you? Also, you didn't extract the sign of the latitude value... – Eitan T Nov 24 '13 at 21:31
  • Thanks for the suggestion, it is included in the edit. Although these are all easily changed. The OP's format string was not valid and was causing an error to begin with. – Falimond Nov 24 '13 at 21:41
  • What was previously %*1s %4d is now %5d. Matlab will automatically consider the sign and produce -3186. – Falimond Nov 24 '13 at 22:01

Using textscan

While textscan can definitely work here, it can get a little confusing when trying to parse long strings:

res = textscan(filename, 'A%8s_%4s.%*[^.].L2.%5f%5f%*[^\n]');
res = [res{:}]; %// Flatten array

The longitude and latitude values require one more manipulation:

res(3:4) = num2cell([res{3:4}] / 100);

Using regular expressions

An alternative way of tokenizing strings is using regular expressions:

res = regexp(filename, '^A(\d+)_(\d+)\..*\.L2\.([^.]{5})([^.]{5})', 'tokens');
res = res{1};   %// Flatten array

Once you've done that, you can convert the longitude and latitude strings to numerical values:

res(3:4) = cellfun(@(x){str2double(x) / 100}, res(3:4));

Sometimes the latter method is easier than using textscan.

In any case, the result for your input string should be the following cell array:

res = 
    '20020817'    '0610'    [114.7900]    [-31.8600]
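
The question also asks for the time as (YYYYMMDD HH:MM). A minimal sketch of combining the two extracted strings into a single timestamp, assuming MATLAB R2014b or later (where `datetime` is available); the variable names and the display format are illustrative choices:

```matlab
% Date and time strings as extracted from the filename (see the result above)
date_str = '20020817';
time_str = '0610';

% Concatenate the strings and parse them as one timestamp
t = datetime([date_str time_str], 'InputFormat', 'yyyyMMddHHmm');

% Pick a display format and convert back to text
t.Format = 'yyyy-MM-dd HH:mm';
time_label = char(t);
```

Keeping the time as a string until this point preserves leading zeros such as the one in `'0610'`.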
Eitan T