The text file abc.txt
is an arbitrary article that has been scraped from the web. For example, it is as follows:
His name is "Donald" and he likes burger. On December 11, he married.
I want to extract only words in lower case and numbers except for all kinds of periods and quotes in the above article. In the case of the above example:
{his, name, is, Donald, and, he, likes, burger, on, December, 11, he, married}
My code is as follows:
filename = 'abc.txt';
fileID = fopen(filename,'r');
C = textscan(fileID,'%s','delimiter',{',','.',':',';','"','''});
fclose(fileID);
Cstr = C{:};
Cstr = Cstr(~cellfun('isempty',Cstr));
Is there any simple code to extract only alphabet words and numbers except all symbols?