Read multiple lines off a csv file

Question

I have a csv file which looks like the following when I open it in notebook...

val1,val2,val3,val4,val5,val6,val7,val8,val9,val10,val11,val12,val13,result
63,1,1,145,233,1,2,150,0,2.3,3,0,6,F
67,1,4,160,286,0,2,108,1,1.5,2,3,3,T
67,1,4,120,229,0,2,129,1,2.6,2,2,7,T
37,1,3,130,250,0,0,187,0,3.5,3,0,3,F

I would like to read this data into MATLAB and have found this question that really looks promising. My code for this implementation is as follows...

fid = fopen(path);
out = textscan(fid,'%f%f%f%f%f%f%f%f%f%f%f%f%f','HeaderLines',1,'delimiter',',','CollectOutput',1);
fclose(fid);

However, this only seems to read the first line into MATLAB. How can I get it to read the whole file?

out{1}
ans =
      63.0000    1.0000    1.0000  145.0000  233.0000    1.0000    2.0000  150.0000         0    2.3000    3.0000         0    6.0000

score 1 · Answer 1 · answered Oct 15 '14 at 16:12

After banging my head on my desk for a while, it hit me that the problem might be that I hadn't specified the result string in the format specifier. I don't need this data in my code, so I left it out. Adding an additional %s at the end allowed all the data to be read.

Note for the future: specify every field in the format specifier, and ignore the ones you don't need afterwards.

The actual code should look like the following...

out = textscan(fid,'%f%f%f%f%f%f%f%f%f%f%f%f%f%s','HeaderLines',1,'delimiter',',','CollectOutput',1);
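For anyone wondering what the output looks like once the %s is added, here is a minimal sketch. It assumes the four sample rows from the question (the first row is written so that it matches the output shown in the question) and writes them to a temporary file so it can run on its own. The names `csvFile`, `values`, `labels` and `isTrue` are made up for illustration.

```matlab
% Write the sample data to a temporary file so the sketch is self-contained.
csvFile = [tempname, '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, '%s\n', ...
    'val1,val2,val3,val4,val5,val6,val7,val8,val9,val10,val11,val12,val13,result', ...
    '63,1,1,145,233,1,2,150,0,2.3,3,0,6,F', ...
    '67,1,4,160,286,0,2,108,1,1.5,2,3,3,T', ...
    '67,1,4,120,229,0,2,129,1,2.6,2,2,7,T', ...
    '37,1,3,130,250,0,0,187,0,3.5,3,0,3,F');
fclose(fid);

% Read 13 numeric fields plus the trailing string field on every line.
fid = fopen(csvFile, 'r');
out = textscan(fid, '%f%f%f%f%f%f%f%f%f%f%f%f%f%s', ...
    'HeaderLines', 1, 'Delimiter', ',', 'CollectOutput', 1);
fclose(fid);
delete(csvFile);

values = out{1};               % numeric matrix, one row per data line
labels = out{2};               % cell array holding the 'T'/'F' strings
isTrue = strcmp(labels, 'T');  % logical column, in case the labels are needed
```

With 'CollectOutput' set to 1 the thirteen numeric columns end up together in `out{1}`, and the string column sits on its own in `out{2}`, so it is easy to ignore.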

nkjt · Answer 2 · 2014-10-15T16:31:52.533

0

What often happens with textscan is that if you use the wrong specifier, or if there is something unexpected in the file, it reads as much of the file as it can and then stops when it reaches something it can't parse. Unfortunately, it stops silently, without an error. Failure to read the full file, or output that appears to stop midway through a line, are common symptoms of this. If you don't need the strings, you can tell textscan to skip over them with *:

out = textscan(fid,'%f%f%f%f%f%f%f%f%f%f%f%f%f%*s','HeaderLines',1,'delimiter',',','CollectOutput',1);
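Since the stop is silent, one way to notice it is to check whether the file pointer actually reached the end of the file after the call. The following is only a sketch under that assumption: it uses the sample rows from the question written to a temporary file, and `csvFile`, `partial`, `full` and `stoppedEarly` are names invented for the example.

```matlab
% Recreate the sample file so the sketch runs on its own.
csvFile = [tempname, '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, '%s\n', ...
    'val1,val2,val3,val4,val5,val6,val7,val8,val9,val10,val11,val12,val13,result', ...
    '63,1,1,145,233,1,2,150,0,2.3,3,0,6,F', ...
    '67,1,4,160,286,0,2,108,1,1.5,2,3,3,T', ...
    '67,1,4,120,229,0,2,129,1,2.6,2,2,7,T', ...
    '37,1,3,130,250,0,0,187,0,3.5,3,0,3,F');
fclose(fid);

% Format without the last field: textscan gives up at the first 'F'.
fid = fopen(csvFile, 'r');
partial = textscan(fid, '%f%f%f%f%f%f%f%f%f%f%f%f%f', ...
    'HeaderLines', 1, 'Delimiter', ',', 'CollectOutput', 1);
stoppedEarly = ~feof(fid);   % true if data was left unread in the file
fclose(fid);

if stoppedEarly
    warning('textscan stopped after %d row(s).', size(partial{1}, 1));
end

% Format that skips the last field with %*s: every row is read.
fid = fopen(csvFile, 'r');
full = textscan(fid, '%f%f%f%f%f%f%f%f%f%f%f%f%f%*s', ...
    'HeaderLines', 1, 'Delimiter', ',', 'CollectOutput', 1);
fclose(fid);
delete(csvFile);
```

Comparing `size(partial{1}, 1)` with `size(full{1}, 1)` shows the difference between the two format strings on the same file.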

It can be easier when constructing longer format specifiers to use repmat:

out = textscan(fid, [repmat('%f',[1,13]),'%*s'],'HeaderLines',1,'delimiter',',','CollectOutput',1);
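As a quick check that the repmat version builds the same format string as the hand-typed one, here is a small sketch. `nNumeric`, `fmt` and `literal` are illustrative names, not anything from the question.

```matlab
nNumeric = 13;                               % number of numeric columns in the file
fmt = [repmat('%f', 1, nNumeric), '%*s'];    % 13 numeric fields, then skip the string
literal = '%f%f%f%f%f%f%f%f%f%f%f%f%f%*s';   % the same specifier typed out by hand
sameFormat = strcmp(fmt, literal);           % compare the two character vectors
```

Keeping the column count in a variable also means only one number has to change if the file gains or loses a column.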

edited Oct 15 '14 at 16:31

answered Oct 15 '14 at 16:18


I don't think `csvread` works because my file contains non-numeric values. – Marmstrong Oct 15 '14 at 16:21
I had it in my head that it was okay if you only read the numeric part, but apparently not (maybe that's `xlsread`). – nkjt Oct 15 '14 at 16:25
