conversion of cell to matrix that contains text?

Question

I have a text file that contains measurement data. The header is not important, so I used this to skip the first 25 lines:

% Skip the first 25 lines
for i=1:25
    fgetl(inputfile);
end

Then I called textscan with a delimiter to read the data:

   delimiter = '';
   values = textscan(inputfile, '%s', 'delimiter', delimiter);

The result is a cell array that holds 1000 strings like the following:

'2014_11_03_17-19-49 000 430114 516672 960.91 26.2'
'2014_11_03_17-19-49 001 430112 516656 960.91 26.2'
'2014_11_03_17-19-49 002 430112 516656 960.91 26.2'
'2014_11_03_17-19-49 003 430112 516656 960.91 26.2'

From each line I only care about the last two values (960.91 and 26.2).

I tried to convert the cell array to a matrix, but I got this error:

Cannot support cell arrays containing cell arrays or objects.

Any idea how to get just those values into a matrix so that I can plot them?

chappjc · Answer 1 · 2014-11-04T18:36:58.500

1

Use a different format specifier with textscan, together with its 'HeaderLines' option to skip the header directly. Note that 'HeaderLines' is an option of textscan, not of fopen:

>> fid = fopen('testtext.txt','r');
>> C = textscan(fid,'%s %d %d %d %f %f','HeaderLines',25)
C = 
    {4x1 cell}   [4x1 int32]   [4x1 int32]   [4x1 int32]   [4x1 double]   [4x1 double]
>> fclose(fid);
>> C{5}

  960.9100
  960.9100
  960.9100
  960.9100

Make a matrix by concatenation, if you want it like that:

>> M =[C{5} C{6}]
M =

  960.9100   26.2000
  960.9100   26.2000
  960.9100   26.2000
  960.9100   26.2000

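You can even tell textscan to skip every field except the ones you want by putting a * in their specifiers. After reopening the file, this keeps only the fifth column:

C = textscan(fid,'%*s %*d %*d %*d %f %*f','HeaderLines',25)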
C = 
    [4x1 double]

Don't forget to call fclose(fid), or fclose('all') if you lost the handle.
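Putting the steps above together, here is a minimal self-contained sketch. It is only an illustration under assumptions: the temporary file, its name and the "header line" text are made up so that the example can run without the real log file, and both numeric columns are kept (%f %f) since the question asks for both.

```matlab
% Build a small stand-in file: 25 header lines followed by four data rows.
% (Hypothetical file; replace fname with the path of the real log file.)
fname = [tempname '.log'];
fid = fopen(fname, 'w');
for k = 1:25
    fprintf(fid, 'header line %d\n', k);
end
for k = 0:3
    fprintf(fid, '2014_11_03_17-19-49 %03d 430112 516656 960.91 26.2\n', k);
end
fclose(fid);

% Read it back: skip the header, ignore the first four fields,
% and keep only the last two floating-point columns.
fid = fopen(fname, 'r');
C = textscan(fid, '%*s %*d %*d %*d %f %f', 'HeaderLines', 25);
fclose(fid);
delete(fname);

% Concatenate the two column vectors into one numeric matrix.
M = [C{1} C{2}];
```

M is then an ordinary double matrix with one row per data line, so something like plot(M(:,1)) or plot(M(:,1), M(:,2), 'o') should work on it directly.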

edited Nov 04 '14 at 18:36

answered Nov 04 '14 at 18:00


I am using .log file and I got that error Invalid machine format. – mecaeng Nov 05 '14 at 09:57
@mecaeng What command gave that, and what was the exact wording? – chappjc Nov 05 '14 at 17:26

Divakar · Accepted Answer · 2014-11-04T18:49:09.447

1

Approach #1

You can use the nifty importdata here. Note that inputfile must be the file name, not the handle returned by fopen -

lines_skip = 25;
values = importdata(inputfile,' ',lines_skip) %// using the delimiter ' ' here

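values would be a struct holding the data from inputfile.

The numeric columns end up in values.data. The date string at the start of each line is not numeric, so it is not one of those columns. Thus the fourth column, values.data(:,4), holds the 960.91 values, while values.data(:,5) holds the 26.2 values, as shown here -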

>> values.data(:,4)
ans =
  960.9100
  960.9100
  960.9100
  960.9100
>> values.data(:,5)
ans =
   26.2000
   26.2000
   26.2000
   26.2000

Approach #2

If you already have the cell array as listed in the question, you don't need to worry about reading the input file again. So, you have something like this -

incell = {
    '2014_11_03_17-19-49 000 430114 516672 960.91 26.2'
    '2014_11_03_17-19-49 001 430112 516656 960.91 26.2'
    '2014_11_03_17-19-49 002 430112 516656 960.91 26.2'
    '2014_11_03_17-19-49 003 430112 516656 960.91 26.2'}

Next, you can use cellfun with regexp to split each cell into columns using the delimiter ' ' -

cellarr = cellfun(@(x) regexp(x,' ','split'),incell,'UniformOutput',false)
values = vertcat(cellarr{:})

which will get you -

values = 
    '2014_11_03_17-19-49'    '000'    '430114'    '516672'    '960.91'    '26.2'
    '2014_11_03_17-19-49'    '001'    '430112'    '516656'    '960.91'    '26.2'
    '2014_11_03_17-19-49'    '002'    '430112'    '516656'    '960.91'    '26.2'
    '2014_11_03_17-19-49'    '003'    '430112'    '516656'    '960.91'    '26.2'

That is, the fifth and sixth columns of values hold the data you are looking for. Wrap str2double around them to get numbers: str2double(values(:,5)) and str2double(values(:,6)).
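As a compact sketch of Approach #2 from start to finish, the two columns can also be converted in one call, since str2double accepts a cell array of strings and returns a double array of the same size. The variable name M is just an illustrative choice.

```matlab
% Cell array of strings, as listed in the question.
incell = {
    '2014_11_03_17-19-49 000 430114 516672 960.91 26.2'
    '2014_11_03_17-19-49 001 430112 516656 960.91 26.2'
    '2014_11_03_17-19-49 002 430112 516656 960.91 26.2'
    '2014_11_03_17-19-49 003 430112 516656 960.91 26.2'};

% Split every line on single spaces; each cell becomes a 1-by-6 cell row.
cellarr = cellfun(@(x) regexp(x, ' ', 'split'), incell, 'UniformOutput', false);

% Stack the rows into a 4-by-6 cell array of strings.
values = vertcat(cellarr{:});

% Convert columns 5 and 6 to a 4-by-2 numeric matrix.
M = str2double(values(:, 5:6));
```

This relies on every line splitting into the same number of fields; a line with a different field count would make vertcat fail.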

edited Nov 04 '14 at 18:49

answered Nov 04 '14 at 18:33


in case of these data in this format `'2014_11_03_17-19-49' '000' '430114' '516672' '960.91' '26.2' '2014_11_03_17-19-49' '001' '430112' '516656' '960.91' '26.2' '2014_11_03_17-19-49' '002' '430112' '516656' '960.91' '26.2'` and text for 2 line then `'2014_11_03_17-19-49' '000' '430114' '516672' '960.91' '26.2' '2014_11_03_17-19-49' '001' '430112' '516656' '960.91' '26.2' '2014_11_03_17-19-49' '002' '430112' '516656' '960.91' '26.2'` – mecaeng Nov 05 '14 at 14:12
how can i bypass these 2 line and get the data ? – mecaeng Nov 05 '14 at 14:12
@mecaeng Let's suppose `a1` is your `line 1`, so you have - `a1 = {'2014_11_03_17-19-49' '000' '430114' '516672' '960.91' '26.2' '2014_11_03_17-19-49' '001' '430112' '516656' '960.91' '26.2' '2014_11_03_17-19-49' '002' '430112' '516656' '960.91' '26.2'}`. Then do `data_a1 = reshape(a1,6,[])'`. Thus, in `data_a1` you would have many columns, out of which you need the fifth and sixth columns. So, then do `col5 = data_a1(:,5)` and finally get the numeric values with `col5_num = str2double(col5)`. See if this works. – Divakar Nov 05 '14 at 14:24
Let me be more clear : suppose that is my file : first 25 lines (just headers-not important ) then number of line as this format **'2014_11_03_17-19-49' '000' '430114' '516672' '960.91' '26.2'** then another 2 lines (headers-not important) and again the measured data **2014_11_03_17-19-49' '000' '430114' '516672' '960.91' '26.2** – mecaeng Nov 05 '14 at 14:31
Your approach successfully obtained the data for the first part. The point here is how to skip the new headers and get the data. I have about 1000 headers in one file – mecaeng Nov 05 '14 at 14:33
@mecaeng How do we differentiate between headers and the numbered data? Like do the header text start with some letters like A or B or just anything, but not a digit? – Divakar Nov 05 '14 at 14:35
Nope. they start with the same (date) but headers are equally repeated .like 100 lines of data then 2 (line)headers then again the 100 lines of data and so on – mecaeng Nov 05 '14 at 14:38
Comments are not for extended discussion; this conversation has been [moved to chat](http://chat.stackoverflow.com/rooms/64326/discussion-on-answer-by-divakar-conversion-of-cell-to-matrix-that-contains-text). – Taryn Nov 05 '14 at 14:40
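For the repeated-header layout described in the comments (a fixed number of data lines, then 2 header lines, then data again), one possible sketch is to select data lines by their position in the repeating block and then reuse the splitting from Approach #2. This is not taken from the chat discussion. It assumes the block sizes are strictly regular and that the first 25 header lines have already been removed; the names allLines, nData and nHdr, the header text and the small block size of 3 are all hypothetical, chosen only to keep the example short.

```matlab
% Hypothetical input: blocks of nData data lines, each followed by nHdr
% header lines. The comments mention 100 data lines and 2 header lines.
nData = 3;
nHdr  = 2;
allLines = {
    '2014_11_03_17-19-49 000 430114 516672 960.91 26.2'
    '2014_11_03_17-19-49 001 430112 516656 960.91 26.2'
    '2014_11_03_17-19-49 002 430112 516656 960.91 26.2'
    '2014_11_03_17-19-49 made-up header text'
    '2014_11_03_17-19-49 made-up header text'
    '2014_11_03_17-19-49 000 430114 516672 960.91 26.2'
    '2014_11_03_17-19-49 001 430112 516656 960.91 26.2'
    '2014_11_03_17-19-49 002 430112 516656 960.91 26.2'
    '2014_11_03_17-19-49 made-up header text'
    '2014_11_03_17-19-49 made-up header text'};

% Zero-based position of every line inside its block of nData+nHdr lines.
pos = mod((0:numel(allLines)-1)', nData + nHdr);

% Keep only the lines that fall in the data part of each block.
dataLines = allLines(pos < nData);

% Split on spaces, stack into an N-by-6 cell array, convert columns 5 and 6.
tokens = cellfun(@(s) regexp(s, ' ', 'split'), dataLines, 'UniformOutput', false);
tokens = vertcat(tokens{:});
M = str2double(tokens(:, 5:6));
```

If the blocks are not perfectly regular, the positional mask would need to be replaced by some test on the line contents, which depends on what the header lines actually look like.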
