I have a dataset of species names in which some of the names originally used are now obsolete. Those cells are recorded as "old_species***retired***use new_species", whereas correct cells contain just "new_species". Here is a sample of the data:
df<- data.frame(species=c("Etheostoma spectabile","Ictalurus furcatus","Micropterus salmoides","Micropterus salmoides","Ictalurus punctatus","Ictalurus punctatus","Ictalurus punctatus","Micropterus salmoides","Etheostoma olmstedi","Noturus insignis","Lepomis auritus","Lepomis auritus","Nocomis leptocephalus","Scartomyzon rupiscartes***retired***use Moxostoma rupiscartes","Lepomis cyanellus","Notropis chlorocephalus","Scartomyzon cervinus***retired***use Moxostoma cervinum","Ictalurus punctatus","Lythrurus ardens","Moxostoma pappillosum","Micropterus salmoides","Micropterus salmoides","Ictalurus punctatus"))
I have tried:
sapply(strsplit(df$species, split = '***retired***use', fixed = TRUE), function(x) x[2])
but the cells where the data is already correct return NA, because they do not contain the split string and so have no second element. Is there a way to apply the split only to the cells that actually contain it?
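For illustration, here is a minimal sketch of the behaviour I am after, assuming the marker always appears as the literal text "***retired***use". It uses sub() rather than strsplit(), and the two-element vector and the name `cleaned` are made up for the example:

```r
# Hypothetical two-element sample: one correct name, one retired name
species <- c("Lepomis auritus",
             "Scartomyzon cervinus***retired***use Moxostoma cervinum")

# Drop everything up to and including the marker, plus surrounding spaces.
# Strings without the marker do not match and are returned unchanged.
cleaned <- sub("^.*\\*{3}retired\\*{3}[[:space:]]*use[[:space:]]*", "", species)

cleaned
```

Because sub() leaves non-matching strings as they are, no NA is introduced for the cells that are already correct.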