
I have a data table of 10,000 records with multiple columns. Below is my current code and part of the data set:

states <- str_trim(unlist(strsplit(as.vector(search_data_set$location_name), ";")))

Part of Dataset:

Maine Virginia; 
Oklahoma; 
Kansas Minnesota South Dakota; 
Delaware; 
West Virginia; 
Utah South Carolina; 
Utah South Dakota Utah; 
Indiana; Michigan Alaska Washington; 
Washington Connecticut Maine; 
Maine Oregon South Carolina Oregon; 
Alabama Alaska; 
Iowa Alabama New Mexico; 
Virgin Islands South Dakota; 
Maine Louisiana; Colorado; 
District of Columbia Virgin Islands; 
Pennsylvania Alabama;

I need to meet the requirements below and would appreciate help:

  1. Each record should count each location only once. (In "Utah South Dakota Utah;", Utah should be counted once.)
  2. When the user searches the dataset, a record should be returned if the location appears anywhere in it. (%Oregon%) The current code does not return the record "Maine Oregon South Carolina Oregon;" when the user searches for "Oregon".

Need help in achieving this. Thanks in advance!
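
For requirement 1, here is a minimal sketch of one possible approach, not a confirmed solution. It assumes every location comes from a known vocabulary: base R's built-in `state.name` plus the two extra names visible in the sample ("District of Columbia" and "Virgin Islands"). The `location_name` vector, `known`, and `unique_locations` are hypothetical names used for illustration. Names inside a record are separated only by spaces, so splitting on ";" alone cannot tell "South Dakota" from "South" and "Dakota". The sketch extracts known names by pattern instead.

```r
# Hypothetical sample; the real column is search_data_set$location_name.
location_name <- c(
  "Utah South Dakota Utah;",
  "Maine Oregon South Carolina Oregon;",
  "West Virginia;",
  "Indiana; Michigan Alaska Washington;",
  "District of Columbia Virgin Islands;"
)

# Assumed vocabulary: base R's state.name plus two extra names from the sample.
known <- c(state.name, "District of Columbia", "Virgin Islands")

# Put longer names first so multi-word names are tried before shorter ones.
known <- known[order(nchar(known), decreasing = TRUE)]
pattern <- paste0("\\b(", paste(known, collapse = "|"), ")\\b")

# Extract every known name from each record, then drop repeats within the record.
matches <- regmatches(location_name, gregexpr(pattern, location_name, perl = TRUE))
unique_locations <- lapply(matches, unique)

print(unique_locations[[1]])
```

Each element of `unique_locations` holds the distinct locations of one record, so "Utah South Dakota Utah;" contributes Utah only once. Any location missing from `known` would be silently skipped, so the vocabulary needs to cover the real data.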

Ian Campbell
  • Can you show `dput` of example – akrun Apr 19 '21 at 20:06
  • `%Oregon%` seems like VBA wildcard symbols. This is not valid in R, however. You could use `.*` or `(.*)?` if you want to match "any length of string with something". So `grepl('(.*)?Oregon(.*)?', data)` would match any value in `data` containing `Oregon` at any point in each string. You could make it simpler by dropping the wildcards altogether and using `grepl("Oregon", data)`. – Oliver Apr 19 '21 at 20:10
  • There is `%like%` option in `R` – akrun Apr 19 '21 at 20:17
  • Not in the documentation of `help("regex")`, which I based my comment on. :-) – Oliver Apr 20 '21 at 15:46
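
The `grepl` suggestion in the comments can be sketched as below, assuming the search should run against the original `location_name` column rather than the split `states` vector. Presumably an exact comparison against the split pieces fails because "Maine Oregon South Carolina Oregon" stays a single element after splitting on ";". The data frame contents and the helper `find_location` are hypothetical, introduced only for illustration.

```r
# Hypothetical stand-in for the question's search_data_set.
search_data_set <- data.frame(
  id = 1:4,
  location_name = c(
    "Maine Oregon South Carolina Oregon;",
    "Utah South Dakota Utah;",
    "West Virginia;",
    "Maine Virginia;"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows whose location_name contains the term anywhere.
# fixed = TRUE treats the term as a literal substring, not a regular expression.
find_location <- function(data, term) {
  data[grepl(term, data$location_name, fixed = TRUE), , drop = FALSE]
}

oregon_hits <- find_location(search_data_set, "Oregon")
virginia_hits <- find_location(search_data_set, "Virginia")

print(oregon_hits)
```

Note that a plain substring search for "Virginia" also returns the "West Virginia;" record. If that is not wanted, the search would need to compare whole location names instead of substrings.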
