I am working on a text analysis in R and have a dataset (text corpus) with various sentences about different fruits, for example "apple", "banana", "orange", "pear", etc.
Since it is not relevant for the analysis whether someone writes about "apples" or "bananas", I want to replace all different fruits with one specific word, for example "allfruits".
I thought about using regex, but I am facing two issues:
1) I want to avoid separate code lines for each kind of fruit. Thus, is there a way to define a list or a vector that I can use so that the function replaces all words in that list (apple, bananas, pear, etc.) with one specific word "allfruits"?
2) I want to avoid replacing words that are NOT fruits but contain the same string as a fruit (e.g. the word "appletini").
Example: If I have a sentence that says "Apple is my favourite fruit, appletini is my favourite drink. I also like bananas!", I want the following result: "allfruits is my favourite fruit, appletini is my favourite drink. I also like allfruits!"
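To make the goal concrete, here is a rough sketch of the kind of approach I have in mind. It is only an illustration under some assumptions: the vector name `fruits` and its contents are hypothetical, plurals are assumed to be formed by simply adding "s", and matching is assumed to be case-insensitive. The idea is to collapse the vector into one alternation pattern and wrap it in word boundaries (`\b`) so that longer words such as "appletini" are left alone.

```r
# Hypothetical vector of fruit names in singular form; extend as needed.
fruits <- c("apple", "banana", "orange", "pear")

# Build a single pattern of the form \b(apple|banana|orange|pear)s?\b
# \b marks a word boundary, so "appletini" is not matched.
# s? allows an optional trailing "s" (simple plurals only; this is an assumption).
pattern <- paste0("\\b(", paste(fruits, collapse = "|"), ")s?\\b")

text <- "Apple is my favourite fruit, appletini is my favourite drink. I also like bananas!"

# ignore.case = TRUE so that "Apple" and "apple" are treated alike.
result <- gsub(pattern, "allfruits", text, ignore.case = TRUE, perl = TRUE)
print(result)
```

Irregular plurals (e.g. "cherries" for "cherry") would not be covered by the optional "s" and would need to be listed in the vector explicitly.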
I am not sure whether it is possible to write this with a gsub function. Thus, all help is much appreciated.
Thank you!