I am currently working on a project where I need to calculate various statistics; however, the CSV file I am working with has an anomaly in one column. Each entry contains a date in the "%d/%m/%Y" format followed immediately by a string.
This pattern repeats throughout the entire column (the column has no header, in case that matters). What I am trying to achieve is to remove the date from every entry, leaving only the remaining string.
My current approach uses the gsub function, as follows:
gsub(".[/]|[/]|[[:digit:]].", " ", dataset$V1)
This seems to work initially; however, when I run head() on the result, the replacement appears to have been applied only to the first 6-7 rows, and the rest show up as NA values.
Are there any limitations to the gsub function when working with a column of 3000+ entries, or is there something wrong with the logic of my pattern?
Here is the sample data used for the code:
structure(list(V1 = c("3/3/2005Mitsubishi", "3/4/2006Jaguar",
"13/2/2007Land Rover", "12/12/2009Ferrari", "4/4/2008Jeep", "3/3/2005Honda"
), V2 = c("Mitsubish", "Jaguar", "Land Rover", "Ferrari", "Jeep",
"Honda")), row.names = c(NA, 6L), class = "data.frame")