I have the following data:

    Seat_WASHER <-
      structure(
        list(
          Description = c(
            "SEAT WASHER, MR2, 8\", TN 10.12, CR 150/600, 316 Stainless Steel",
            "SEAT WASHER, 1\", TN 1.42, CR 950/1200, MR1, 316 Stainless Steel",
            "SEAT WASHER, 3\", TN 1.52,  MR1, 316 Stainless Steel",
            "SEAT WASHER, MR1, 2\", TN 1.62, CR 800/1200, 316 Stainless Steel",
            "SEAT WASHER, MR1, TN 2.12, 1/2\", CR 150/600, 316 Stainless Steel",
            "SEAT WASHER, MR6, 2\", TN 6.48, CR 750/100, 316 Stainless Steel"
          )
        ),
        # six descriptions, so six rows
        row.names = c(NA, -6L),
        class = c("tbl_df", "tbl", "data.frame")
      )

It's a very large data set, and the strings are not consistent in their order or contents.

How do I find key indicators (", CR, MR), and pull all data between the delimiters into a column? If it can't find the key indicator in the string it'll need to output NULL.

Finding all CR will result in a column like:

Col 1 
--------
CR 150/600
CR 950/1200
NULL
CR 800/1200
CR 150/600
CR 750/100
DShad33

2 Answers


You can try

library(stringr)

Seat_WASHER$col1 <- str_extract(Seat_WASHER$Description , "CR \\d+/\\d+")
  • output
         col1
1  CR 150/600
2 CR 950/1200
3        <NA>
4 CR 800/1200
5  CR 150/600
6  CR 750/100
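
The same idea can be applied to the other indicators from the question. The following is only a minimal sketch: the vector `desc` holds three rows copied from the question, and the `MR` and size patterns are my assumptions about how those fields are formatted (`MR` followed by digits, and a number or fraction followed by a double quote), so adjust them if the real data differs.

```r
library(stringr)

# Three example rows copied from the question
desc <- c(
  "SEAT WASHER, MR2, 8\", TN 10.12, CR 150/600, 316 Stainless Steel",
  "SEAT WASHER, 3\", TN 1.52,  MR1, 316 Stainless Steel",
  "SEAT WASHER, MR1, TN 2.12, 1/2\", CR 150/600, 316 Stainless Steel"
)

# One regular expression per key indicator (assumed formats)
patterns <- c(
  CR   = "CR \\d+/\\d+",        # e.g. CR 150/600
  MR   = "MR\\d+",              # e.g. MR2
  Size = "\\d+(?:/\\d+)?\""     # e.g. 8" or 1/2"
)

# str_extract() gives NA wherever a pattern is not found,
# so each indicator becomes one column with NA for missing values
extracted <- as.data.frame(
  lapply(patterns, function(p) str_extract(desc, p))
)
extracted
```

Each name in `patterns` becomes a column name, so adding another indicator only requires adding another named pattern.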
Mohamed Desouky
  • If you want to add another column, just change the pattern passed to `str_extract`, e.g. `Seat_WASHER$col2 <- str_extract(Seat_WASHER$Description, "TN \\d+\\.\\d+")` to extract all TN results – Mohamed Desouky Aug 02 '22 at 20:05

If the fields are always separated by commas, you can use strsplit() to split the string and then find the piece containing CR with grep(), specifying value = TRUE to return the matching value rather than its index. I added trimws() to remove the leading space.

m1 <- "SEAT WASHER, MR6, 2\", TN 6.48, CR 750/100, 316 Stainless Steel"
m2 <- strsplit(m1,",") 
trimws(grep("CR",m2[[1]], value = TRUE))

Edit based on the data:

This still splits each string on commas and keeps the pieces containing CR in m3. Before appending to the data, turn all length-0 vectors into NA.

m2 <-   strsplit(Seat_WASHER$Description,",") 
m3 <- sapply(m2, function(x) trimws(grep("CR",x, value = TRUE)))

Seat_WASHER$newcol <- sapply(m3, function(x) if(identical(x, character(0))) NA_character_ else x)
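
The split-and-search steps can also be wrapped in one function so the same code works for any key. This is just a sketch under my own assumptions: `extract_key` is a hypothetical helper name, it matches only pieces that start with the key (so "CR" inside another word is ignored), and it keeps only the first match per row.

```r
# Hypothetical helper: split each string on commas, trim the pieces,
# and return the first piece starting with `key`, or NA if there is none
extract_key <- function(x, key) {
  parts <- strsplit(x, ",", fixed = TRUE)
  vapply(parts, function(p) {
    p <- trimws(p)
    hit <- p[startsWith(p, key)]
    if (length(hit) == 0) NA_character_ else hit[1]
  }, character(1))
}

# Three example rows copied from the question
desc <- c(
  "SEAT WASHER, MR2, 8\", TN 10.12, CR 150/600, 316 Stainless Steel",
  "SEAT WASHER, 3\", TN 1.52,  MR1, 316 Stainless Steel",
  "SEAT WASHER, MR1, TN 2.12, 1/2\", CR 150/600, 316 Stainless Steel"
)

cr <- extract_key(desc, "CR")
mr <- extract_key(desc, "MR")
```

Because vapply() always returns a character vector of the same length as the input, the result can be assigned straight to a new column without the extra length-0 check.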
Mike