I am trying to read a file whose format repeats the following block (I have cut the data short, even for the first repetition, because it is too long):

1.00 'day' 2011-01-02
'Total Velocity Magnitude RC - Matrix' 'm/day'
    0.190189     0.279141     0.452853      0.61355     0.757833     0.884577 
    0.994502      1.08952      1.17203      1.24442      1.30872      1.36653 
     1.41897      1.46675      1.51035      1.55003      1.58595      1.61824

Download the actual file with the complete data here

This is my code which I am using to read the data from the above file:

fid = fopen(file_name); % open the file

dotTXT_fileContents = textscan(fid,'%s','Delimiter','\n'); % read it as string ('%s') into one big array, row by row
dotTXT_fileContents = dotTXT_fileContents{1};
fclose(fid); %# don't forget to close the file again

%# find rows containing 'Total Velocity Magnitude RC - Matrix' 'm/day'
data_starts = strmatch('''Total Velocity Magnitude RC - Matrix'' ''m/day''',...
    dotTXT_fileContents); % data_starts contains the line numbers wherever 'Total Velocity Magnitude RC - Matrix' 'm/day' is found

ndata = length(data_starts); % number of data blocks, i.e. how many times the 'Total Velocity Magnitude RC - Matrix' 'm/day' line was found in the .txt file

%# loop through the file and read the numeric data
for w = 1:ndata-1
    %# read lines containing numbers
    tmp_str = dotTXT_fileContents(data_starts(w)+1:data_starts(w+1)-3); % stores the content from file dotTXT_fileContents of the rows following the row containing 'Total Velocity Magnitude RC - Matrix' 'm/day' in form of string
    %# convert strings to numbers
    tmp_str = tmp_str{:}; % store the content of the string which contains data in form of a character
    %# assign output
    data_matrix_grid_wise(w,:) = str2num(tmp_str); % convert the part of the character containing data into number
end

To give you an idea of pattern of data in my text file, these are some results from the code:

data_starts =

           2
        1672
        3342
        5012
        6682
        8352
       10022

ndata =

     7

Therefore, my data_matrix_grid_wise should contain 1672-2-2-1 (for a new line) = 1667 rows. However, this is the result I get:

data_matrix_grid_wise =

  Columns 1 through 2

   0.190189000000000   0.279141000000000
   0.423029000000000   0.616590000000000
   0.406297000000000   0.604505000000000
   0.259073000000000   0.381895000000000
   0.231265000000000   0.338288000000000
   0.237899000000000   0.348274000000000

  Columns 3 through 4

   0.452853000000000   0.613550000000000
   0.981086000000000   1.289920000000000
   0.996090000000000   1.373680000000000
   0.625792000000000   0.859638000000000
   0.547906000000000   0.743446000000000
   0.562903000000000   0.759652000000000

  Columns 5 through 6

   0.757833000000000   0.884577000000000
   1.534560000000000   1.714330000000000
   1.733690000000000   2.074690000000000
   1.078000000000000   1.277930000000000
   0.921371000000000   1.080570000000000
   0.934820000000000   1.087410000000000

Where am I wrong? In my final result, I should get data_matrix_grid_wise composed of 10000 elements instead of 36 elements. Thanks.

Update: How can I include the number before 'day' i.e. 1,2,3 etc. on a line just before the data_starts(w)? I am using this within the loop but it doesn't seem to work:

days_str = dotTXT_fileContents(data_starts(w)-1);
    days_str = days_str{1};
    days(w,:) = sscanf(days_str(w-1,:), '%d %*s %*s', [1, inf]);
  • Is something wrong with your `dotTXT_fileContents()`? Where is its code? –  Mar 17 '12 at 13:44
  • I am not sure what you mean. `dotTXT_fileContents` is just a variable which is using `textscan()` to read a file. –  Mar 17 '12 at 23:20

2 Answers

The problem is with the last two statements. When you write tmp_str{:}, you expand the cell array into a comma-separated list of strings. If you assign this list to a single variable, only the first string is assigned, so tmp_str ends up holding only the first row of data.
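
A minimal illustration of that behaviour, assuming MATLAB itself rather than a compatible interpreter. The cell array and variable names are hypothetical, made up for this example:

```matlab
% Hypothetical stand-in for the cell array of text lines
sample_lines = {'0.19 0.27'; '0.99 1.08'; '1.41 1.46'};

% sample_lines{:} expands to three separate strings; a single
% left-hand-side variable receives only the first of them
first_only = sample_lines{:};

% Indexing with parentheses keeps every row, still as a cell array
all_rows = sample_lines(:);
```

Here first_only is a plain char row holding only the first line, which matches the symptom in the question: one line of numbers per block.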

Here is what you can do instead of the last two lines:

tmp_mat = cellfun(@str2num, tmp_str, 'uniformoutput',0);
data_matrix_grid_wise(w,:) = cell2mat(tmp_mat);

However, you will have a problem with the concatenation (cell2mat), since not all of your rows have the same number of columns. How you solve that is up to you.
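
One possible way around the uneven rows, sketched on the assumption that each block should end up as a single row of numbers: parse every line into a column vector with sscanf, stack the columns, and transpose. The sample strings and variable names below are hypothetical.

```matlab
% Hypothetical block of text lines with differing numbers of values
sample_block = {'0.190189 0.279141 0.452853'; ...
                '0.994502 1.08952'; ...
                '1.41897'};

% sscanf with '%f' returns a column vector for each line, so the
% pieces can be stacked vertically whatever their individual lengths
parsed = cellfun(@(s) sscanf(s, '%f'), sample_block, 'UniformOutput', false);

% Stack all columns and transpose into one 1-by-N row, in file order
block_row = vertcat(parsed{:}).';
```

With this approach, an assignment such as data_matrix_grid_wise(w,:) = block_row only requires every block to contain the same total number of values, not every line.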

yuk

The problem is in the line tmp_str = tmp_str{:}; MATLAB behaves in a surprising way when handling chars. A short solution is to replace the last two statements with the following two lines:

y = cell2mat( cellfun(@(z) sscanf(z,'%f'),tmp_str,'UniformOutput',false));
data_matrix_grid_wise(w,:) = y;
Singlet
  • Singlet: How can I include the number before *'day'* i.e. 1,2,3 etc. on a line just before the data_starts(w)? I am using this within the loop but it doesn't seem to work: `days_str = dotTXT_fileContents(data_starts(w)-1); days_str = days_str{1} days(w,:) = sscanf(days_str(w-1,:), '%d %*s %*s', [1, inf]);` –  Apr 11 '12 at 03:18
  • If you only want to extract the integer before the word 'day', use the following code: `days_str = dotTXT_fileContents(data_starts(w)-1); days(w) = sscanf(days_str{1}, '%d');` I don't know what the `(w-1,:)` indexer in your code is for. – Singlet Apr 11 '12 at 09:26
  • Oh, that `(w-1,:)` was a typo :). Thanks, btw!! –  Apr 11 '12 at 17:13
  • Singlet: Some of the data files (like the one I linked to) have days with decimal numbers. In that case I only get the part before the decimal point. Do you know how to include the decimal part as well? For example, I want the day stored as `2.2` instead of just `2`. Thanks. –  Apr 11 '12 at 19:14
  • Just change the `%d` specifier in `sscanf` to `%f` to read floating-point values. – Singlet Apr 12 '12 at 07:21
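
A short sketch of the `%f` suggestion from the last comment, applied to header lines in the format shown in the question. The second header and all variable names are made up for illustration:

```matlab
% First header is copied from the question; the second is a
% hypothetical example with a fractional day
header_a = '1.00 ''day'' 2011-01-02';
header_b = '2.2 ''day'' 2011-01-03';

% '%f' reads a floating-point number; the size argument 1 stops
% after the first value, so the rest of the line is ignored
day_a = sscanf(header_a, '%f', 1);
day_b = sscanf(header_b, '%f', 1);
```

Inside the loop, the same call would presumably take the form days(w) = sscanf(days_str{1}, '%f', 1), reusing the names from the comments above.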