
I have a huge .txt file and want to parse parts of it using textscan. Say the file has 10,000 lines and the part I want starts at line 300, and that part has its own 10-line header. How can I skip the first 300 lines? I can't use textscan's HeaderLines option, because then I wouldn't be able to read my actual 10-line header. Alternatively, is there a way to jump to line 300 and start textscan from there, as if line 301 were the first line?

maz
  • Do the different parts have different formats? So you're saying there's a part you want to scan from line 300 to line 10,000, and lines 300 to 310 are the header? How is the data formatted? – houtanb May 31 '17 at 13:18
  • No. Say lines 300-340 are one part, with its header on lines 300-310; there may be several such parts in the 10,000 lines. – maz Jun 01 '17 at 08:55

1 Answer

So, assuming your data is generated by the following (since you have not mentioned how it's formatted in your question):

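% Create a sample file: 300 lines to ignore, a 10-line header, then numeric data
fid = fopen('datafile.txt', 'w');

% Lines 1-300: content to be skipped
for i=1:300
    fprintf(fid, 'ignore%d\n', i);
end

% Lines 301-310: the header of the part
for i=301:310
    fprintf(fid, 'header%d\n', i);
end

% Lines 311-10000: the numeric data of the part
for i=311:10000
    fprintf(fid, '%d\n', i);
end

fclose(fid);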

You can read the data by first calling fgetl 300 times to skip the first 300 lines, then calling textscan twice: once for the header and once for the data. The thing to remember is that textscan starts reading from the current position of the file identifier fid. So, if fid is positioned at the start of line 301, textscan starts scanning from there. Here's the code to read the file above, starting from line 301:

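fid = fopen('datafile.txt', 'r');

% Read and discard the first 300 lines; fid then points at the start of line 301
for i=1:300
    fgetl(fid);
end

% Read the 10 header entries, then the remaining integers up to the end of the file
scannedHeader = textscan(fid, '%s', 10);
scannedData = textscan(fid, '%d');

fclose(fid);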

NB: if the data always has the same format, you can use ftell to find the exact byte offset of the place you want to skip to, then use fseek to go straight to that offset.
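A minimal sketch of that ftell/fseek idea, assuming the same layout as the sample file above. The file name offset_demo.txt and the variables startLine and partOffset are made up for illustration. The offset is recorded during one pass and is only valid while the file stays unchanged.

```matlab
% Build a small demo file with the same layout as the sample above
% (file name is hypothetical; fprintf reuses the format for each vector element)
fid = fopen('offset_demo.txt', 'w');
fprintf(fid, 'ignore%d\n', 1:300);
fprintf(fid, 'header%d\n', 301:310);
fprintf(fid, '%d\n', 311:10000);
fclose(fid);

startLine = 301;                 % assumed first line of the part (its header)

% One pass: skip the preceding lines and remember where the part begins
fid = fopen('offset_demo.txt', 'r');
for i = 1:startLine-1
    fgetl(fid);
end
partOffset = ftell(fid);         % byte offset of the start of line 301
fclose(fid);

% Later: jump directly to the recorded offset, with no line counting
fid = fopen('offset_demo.txt', 'r');
fseek(fid, partOffset, 'bof');   % 'bof' = offset measured from the beginning of the file
scannedHeader = textscan(fid, '%s', 10);
scannedData = textscan(fid, '%d');
fclose(fid);
```

If the file contains several parts, the same single pass could store one offset per part start line in a vector, so each later read needs only one fseek call.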

houtanb
  • I already have the line numbers where each part starts, found using `textscan(fid, '%s', 'Delimiter', '\n')`. I now want to jump to these lines directly. How can I jump to, say, the 300th line without always counting from the start? – maz Jun 01 '17 at 08:59
  • fseek won't help, as my data is irregular and contains values and tables. I want something that can jump to the line numbers obtained from the code above. Your second code snippet seems fine (I'm currently using the same idea), but my file is large, millions of lines, so I can't use the for loop mentioned. – maz Jun 01 '17 at 09:17
  • You can't. `fgetl` in a loop is the only way, as outlined above. – houtanb Jun 01 '17 at 09:59