Suppose I have a list of tuples like this:
train = [('pad thai', 'FOOD#QUALITY'),
         ('Ginger House', 'RESTAURANT#GENERAL'),
         ('fried dumplings', 'FOOD#QUALITY'),
         ('Chinese restaurant', 'RESTAURANT#GENERAL'),
         ('customer service', 'SERVICE#GENERAL'),
         ('management', 'SERVICE#GENERAL')]
I can use freq = nltk.ConditionalFreqDist((a, category) for a, category in train)
to get the frequency of each whole phrase in a category, but I want to store only the frequencies of unigrams (the individual words in each phrase). How would I do this, preferably in a list comprehension? I came across this solution, "Remove uni-grams from a list of bi-grams", which is helpful, but I would like something more concise if possible.
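For illustration, here is a rough sketch of the kind of thing I have in mind, assuming that splitting each phrase on whitespace is an acceptable way to get unigrams. The names `pairs` and `unigram_freq` are made up for this example, and it uses `collections.Counter` from the standard library instead of NLTK so that it runs on its own. The pairs put the category first because `nltk.ConditionalFreqDist` treats the first element of each pair as the condition, so the same list could probably be passed to it directly.

```python
from collections import Counter

train = [('pad thai', 'FOOD#QUALITY'),
         ('Ginger House', 'RESTAURANT#GENERAL'),
         ('fried dumplings', 'FOOD#QUALITY'),
         ('Chinese restaurant', 'RESTAURANT#GENERAL'),
         ('customer service', 'SERVICE#GENERAL'),
         ('management', 'SERVICE#GENERAL')]

# Assumption: a unigram is a whitespace-separated token of the phrase.
# Build one (category, word) pair per word in each phrase.
pairs = [(category, word)
         for phrase, category in train
         for word in phrase.split()]

# Count how often each (category, word) pair occurs.
unigram_freq = Counter(pairs)

print(unigram_freq[('FOOD#QUALITY', 'pad')])
```

Lowercasing the words first (`phrase.lower().split()`) might be worth adding if 'House' and 'house' should count as the same unigram.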