
I'm trying to import a fixed-width txt file using the textscan function. The file is 80 characters wide, with no delimiter, and the 12 desired columns have different widths. I have tried specifying the width of each string (i.e. 12 strings of different widths that add up to 80), but as soon as there is a space (because certain values are missing), MATLAB interprets it as a delimiter and messes up the format.

data= textscan(fileID, '%5s %7s %1s %1s %1s %17s %12s %12s %10s %5s %6s %3s');

I can work around this using Excel, but that seems like a bad solution. Is there a way to do this in MATLAB, either with a different function than textscan or by making textscan ignore delimiters and use only the width of each string?

craigim
CeeGee

2 Answers


You need to set both the delimiter and the whitespace characters to empty:

format_string = '%5s %7s %1s %1s %1s %17s %12s %12s %10s %5s %6s %3s';
C = textscan(fid, format_string, 'delimiter', '', 'whitespace', '');

That way MATLAB treats every character, including spaces, as valid field content.
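Because spaces are then kept as data, the parsed fields will usually carry their padding. A minimal sketch of the call followed by a trim step, assuming every record is exactly 80 characters wide; the record contents and the variable names rec and txt are made up for illustration, and textscan is fed a char vector instead of a file identifier so the snippet runs on its own:

```matlab
% Two made-up 80-character records; most fields are left blank on purpose.
rec = blanks(80);
rec(1:2)  = 'AB';        % field 1 (width 5), padded with spaces
rec(6:12) = '1234567';   % field 2 (width 7)
txt = sprintf('%s\n%s\n', rec, rec);

format_string = '%5s %7s %1s %1s %1s %17s %12s %12s %10s %5s %6s %3s';
C = textscan(txt, format_string, 'Delimiter', '', 'Whitespace', '');

% Spaces are now part of each field, so strip the padding column by column.
C = cellfun(@strtrim, C, 'UniformOutput', false);
```

With a real file, the first argument would be the identifier returned by fopen, as in the call above.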

craigim

Hmmm, I have experienced the same problem with textscan. Here is a long way around it (it is by no means the best solution, but it should work):

fid = fopen('txtfile.txt', 'rt');    %// open the file
a = fscanf(fid, '%c');               %// read every character, line breaks included
fclose(fid);
a(a == char(10) | a == char(13)) = [];   %// drop line breaks so each row is exactly 20 chars

NumberOfRowsInUrData = size(a,2)/20; %// 20 = characters per row in my test file
for r = 0:NumberOfRowsInUrData-1
    b(r+1,:) = a(1+20*r:20*(r+1));   %// row r+1 holds characters 20*r+1 to 20*(r+1)
end

The good thing is that everything is now in the char matrix b, so you can simply index your columns, e.g. string1 = b(:,1:5), and get each field back as a neat char matrix.
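To pull out all 12 fields from the question this way, the column ranges can be derived from the widths instead of being typed by hand. A rough sketch, assuming b is an N-by-80 char matrix; the b built here is placeholder data, and starts, stops and fields are names introduced only for this example:

```matlab
widths = [5 7 1 1 1 17 12 12 10 5 6 3];   % field widths from the question (sum to 80)
stops  = cumsum(widths);                   % last column of each field
starts = stops - widths + 1;               % first column of each field

% Placeholder data standing in for the real b: 3 identical 80-character rows.
b = repmat(['ABCDE', blanks(75)], 3, 1);

fields = cell(1, numel(widths));
for k = 1:numel(widths)
    % cellstr turns the char block into a cell column and drops trailing spaces
    fields{k} = cellstr(b(:, starts(k):stops(k)));
end
```

Here fields{k} holds column k for every row, with trailing padding removed by cellstr.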

The downside, of course, is the for loop, which you should be able to replace with something vectorized, such as cellfun.
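One loop-free possibility is reshape rather than cellfun. The sketch below is an alternative to the loop above, not the original method: it assumes every line is exactly 80 characters wide, strips the line-break characters that a whole-file read also returns, and writes a small made-up temporary file (fname is a hypothetical name) so that it runs on its own:

```matlab
rowWidth = 80;                             % characters per record, from the question

% Write a small made-up file so the sketch is self-contained.
fname = [tempname, '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', repmat('a', 1, rowWidth), repmat('b', 1, rowWidth));
fclose(fid);

raw = fileread(fname);                     % whole file as one char row vector
raw(raw == char(10) | raw == char(13)) = [];   % remove LF and CR line breaks
assert(mod(numel(raw), rowWidth) == 0, 'Not every line is %d characters wide.', rowWidth);

b = reshape(raw, rowWidth, []).';          % reshape fills column-wise, so transpose
delete(fname);
```

If the line breaks are left in, each record is one character longer than rowWidth, so every row after the first ends up offset.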

GameOfThrows
  • I'm quite new to MATLAB so I might be missing something obvious here, but it doesn't seem to work. b returns a 2x20 char, made up of only the first 40 characters from the first row of data. Care to explain how the loop is meant to work (the significance of 20, etc.)? Thanks in advance! – CeeGee Jul 02 '15 at 13:40
  • Well, the loop basically indexes the long char sequence in a logical format. I used fid as the upper limit, which was probably a mistake (I used it for a simple test); the upper limit is supposed to be the number of rows in your data. For each row of your data, index the 20 chars that correspond to that row. – GameOfThrows Jul 02 '15 at 13:50
  • I changed 20 to 80, as I assume that's what you meant. Once I do that, a strange result occurs: b has the right dimensions (32x80 char), but only the first row looks like it's meant to. http://www.tiikoni.com/tis/view/?id=45abcce shows what I mean. – CeeGee Jul 02 '15 at 15:00
  • Hmm, so it seems to be indenting by 1? Weird, it works fine for me. Have you tried indexing it, i.e. b(:,75:80), to see what it looks like? – GameOfThrows Jul 02 '15 at 15:26
  • I think this might be MATLAB displaying the value in a funny way; I do not seem to encounter the same problem in my test. If you change NumberOfRowsInUrData to 32 and the indexing to a(1+80*r:80*(r+1)), this should give the right answer of 32x80. – GameOfThrows Jul 02 '15 at 15:29