I'm working on an assignment where I have to read a tab delimited text file and my output has to be a matlab structure.
The contents of the file look like this (It is a bit messy but you get the picture). The actual file contains 500 genes (the rows starting at Analyte 1) and 204 samples (the columns starting at A2)
#1.2
500 204
Name Desc A2 B2 C2 D2 E2 F2 G2 H2
Analyte 1 Analyte 1 978 903 1060 786 736 649 657 733.5
Analyte 2 Analyte 2 995 921 995.5 840 864.5 757 739 852
Analyte 3 Analyte 3 1445.5 1556.5 1579 1147.5 1249 1069.5 1048 1235
Analyte 4 Analyte 4 1550 1371 1449 1127 1196 1337 1167 1359
Analyte 5 Analyte 5 2074 1776 1960 1653 1544 1464 1338 1706
Analyte 6 Analyte 6 2667 2416.5 2601 2257 2258 2144 2173.5 2348
Analyte 7 Analyte 7 3381.5 3013.5 3353 3099.5 2763 2692 2774 2995
My code is as follows:
fid = fopen('gene_expr_500x204.gct', 'r');%Open the given file
% Skip the first line and determine the number or rows and number of samples
dims = textscan(fid, '%d', 2, 'HeaderLines', 1);
ncols = dims{1}(2);
% Now read the variable names
varnames = textscan(fid, '%s', 2 + ncols);
varnames = varnames{1};
% Now create the format spec for your data (2 strings and the rest floats)
spec = ['%s%s', repmat('%f', [1 ncols])];
% Read in all of the data using this custom format specifier. The delimiter will be a tab
data = textscan(fid, spec, 'Delimiter', '\t');
% Place the data into a struct where the variable names are the fieldnames
ge = data{3:ncols+2}
S = struct('gn', data{1}, 'gd', data{2}, 'sid', {varnames});
The part about ge is my current attempt but its not really working. Any help would be very appreciated, thank you in advance!!