I have a 9-column tab-delimited .txt file containing a mix of data formats; some entries, however, are empty in the 'type' column.
id  id_2  s1      s2      st1     st2     type                 desig  num
1   1     51371   51434   52858   52939   5:3_4:4_6:2_4:4_2:6  CO     1
2   1     108814  108928  109735  110856  5:3_4:4_6:2_4:4_2:7  CO     2
3   1     130975  131303  131303  132066  5:3_4:4_6:2_4:4_2:8  NCO    3
4   1     191704  191755  194625  194803                       NCO    4
5   2     69355   69616   69901   70006                        CO     5
6   2     202580  202724  204536  205151  5:3_4:4_6:2_4:4      CO     6
Due to the mixed column formats, I've been using textscan to import this data:
data = textscan(fid1, '%*f %f %f %f %f %f %*s %s %*[^\r\n]','HeaderLines',1);
This takes columns 2-6, skips 'type', and takes the 8th column.
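For reference, the format string maps onto the columns like this (column names taken from the header above):

% %*f         skip column 1 (id)
% %f (x5)     read columns 2-6 (id_2, s1, s2, st1, st2)
% %*s         skip column 7 (type)
% %s          read column 8 (desig)
% %*[^\r\n]   skip the remainder of the line (num)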
This approach fails on rows where 'type' is empty: textscan skips over the gap as if the column were not there, so instead of reading 'NCO' or 'CO' it reads '4' or '5'.
Is there a way to prevent this? I know I could alter the original .txt files to include something like 'NA' for empty entries, but that is less desirable than a more robust way to read such files.
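For illustration, the misread can be reproduced directly from a string, without a file, using row 4 of the sample above (the one with the empty 'type' field, i.e. two consecutive tabs) -- a small sketch:

row = sprintf('4\t1\t191704\t191755\t194625\t194803\t\tNCO\t4');
bad = textscan(row, '%*f %f %f %f %f %f %*s %s %*[^\r\n]');
% bad{6} contains '4' rather than 'NCO': with the default whitespace
% delimiting, the consecutive tabs collapse into a single delimiter,
% so %*s consumes 'NCO' and %s then reads '4'.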
EDIT:
In addition to the answer below, explicitly specifying the delimiter appears to fix the issue:
data = textscan(fid1, '%*f %f %f %f %f %f %*s %s %*[^\r\n]','HeaderLines',1,'delimiter','\t');
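For completeness, a minimal end-to-end sketch of the corrected read; the file name 'results.txt' is just a placeholder for the actual data file:

fid1 = fopen('results.txt', 'r');    % placeholder file name
data = textscan(fid1, '%*f %f %f %f %f %f %*s %s %*[^\r\n]', ...
    'HeaderLines', 1, 'Delimiter', '\t');
fclose(fid1);
% With 'Delimiter','\t' an empty 'type' entry comes back as an empty field,
% so %*s skips it and %s correctly picks up 'NCO'/'CO'.
% data{1}..data{5} hold id_2, s1, s2, st1, st2; data{6} holds desig.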