I have a column in a data frame whose values are inconsistent: each string contains a keyword surrounded by arbitrary extra characters. For example, I have:
> mydata$column1<-c("abcAppledef","gsApple123hilhj","stBananaewfs","sfesBanana123sfeft",
"stwefPearsfet","stwfePearabcseft","wefCarwefeEef","wefwaCarWFEe","wefaCarEFWefe")
I would like to redefine the column by replacing each string with the keyword it contains (a wildcard match on either side of the keyword), so that the result looks like:
> mydata$column1<-c("Apple","Apple","Banana","Banana","Pear","Pear","Car","Car","Car")
I am using
> mydata$column1<-gsub('.*Apple.*','Apple',mydata$column1)
> mydata$column1<-gsub('.*Banana.*','Banana',mydata$column1)
> mydata$column1<-gsub('.*Pear.*','Pear',mydata$column1)
> mydata$column1<-gsub('.*Car.*','Car',mydata$column1)
But I have many different patterns, and I need to apply this to multiple tables as well. Is there a more efficient way to do this? Maybe a lookup table?
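To make the lookup-table idea concrete, here is a rough sketch of what I have in mind. It is only an illustration under some assumptions: `keywords` and `recode_by_keyword` are names I made up, the keywords are treated as literal substrings (not regular expressions), and when a string contains more than one keyword the one listed first in `keywords` wins.

```r
# Hypothetical lookup vector: one entry per keyword to search for.
keywords <- c("Apple", "Banana", "Pear", "Car")

# Hypothetical helper: any element of x that contains a keyword
# is replaced by that keyword; elements with no match are left unchanged.
recode_by_keyword <- function(x, keywords) {
  for (k in keywords) {
    # fixed = TRUE matches k as a literal substring rather than a regex
    x[grepl(k, x, fixed = TRUE)] <- k
  }
  x
}

# Sample data copied from the example above
mydata <- data.frame(
  column1 = c("abcAppledef", "gsApple123hilhj", "stBananaewfs",
              "sfesBanana123sfeft", "stwefPearsfet", "stwfePearabcseft",
              "wefCarwefeEef", "wefwaCarWFEe", "wefaCarEFWefe"),
  stringsAsFactors = FALSE
)

mydata$column1 <- recode_by_keyword(mydata$column1, keywords)
```

With something like this, the same `keywords` vector could presumably be reused for other tables by calling `recode_by_keyword()` on each table's column, but I am not sure whether this is the idiomatic or most efficient approach.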
Thanks.