1

I have a vector of strings which I need to check against certain criteria. For example, if a string, say "34|40|65", is made up entirely of these patterns: c("34", "35", "37", "48", "65"), then I want to return 1. If the string does not contain any of these patterns, I want to return -1. If the string contains some of the patterns but is not made up entirely of them, I want to return 0.

I have successfully achieved 1 and -1, but am having trouble with the logic that would yield 0. As it stands, my logic yields 1 for the strings that should yield 0. Here is my code to determine whether the string contains one of these patterns. This would give me the 1s.

acds <- c("34", "35", "37", "48", "65")
grepl(paste(acds, collapse = "|"), data$comp_cd)

Here data$comp_cd is the vector of strings.

Thanks!

Sotos
cgibbs_10

3 Answers

1

Try this (sorry, I overlooked the -1 part at first):

acds <- c("34", "35", "37", "48", "65")

# example-vector:
vec <- c("34|35|37", "34|23|99", "65|37|48", "11|22|33", "34a|35a|37a")

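# want: 1 if every piece is in acds, -1 if none is, 0 otherwise
res <- vector("numeric", length(vec))
for (i in seq_along(vec)) {
  # split the string on the literal "|" and count the pieces found in acds
  comp.vec <- unlist(strsplit(vec[i], "[|]"))
  nr.matches <- sum(comp.vec %in% acds)
  res[i] <- ifelse(nr.matches == length(comp.vec), 1,
                   ifelse(nr.matches == 0, -1, 0))
}
print(res)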
r.user.05apr
0

You can check the matches with:

sapply(strsplit(string,"\\|"), function(x) x %in% patterns)

You can easily wrap this in a function to give the numerical result as requested.

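# Classifies a single string; `string` is expected to have length 1
checkstring <- function(string, patterns)
{
  # TRUE/FALSE for each "|"-separated piece of the string
  matches <- sapply(strsplit(string, "\\|"), function(x) x %in% patterns)
  if (sum(matches) == length(matches))
    return(1)
  if (sum(matches) == 0)
    return(-1)
  else
    return(0)
}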

Example of usage:

patterns <- c("34", "35", "37", "48", "65")
checkstring("34a|65a",patterns=patterns)
[1] -1
checkstring("34|65",patterns=patterns)
[1] 1
checkstring("34|40|65",patterns=patterns)
[1] 0
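
The function above classifies one string at a time. To score a whole vector such as data$comp_cd in one call, one option is to split every element first and classify each set of pieces separately. A minimal sketch, assuming a hypothetical helper name checkstrings (not part of the original answer):

```r
# Hypothetical vectorised variant of checkstring(); the name is illustrative only.
checkstrings <- function(strings, patterns) {
  # strsplit() returns one character vector of pieces per input string
  tokens <- strsplit(strings, "|", fixed = TRUE)
  vapply(tokens, function(x) {
    hits <- sum(x %in% patterns)
    if (hits == length(x)) {
      1    # every piece is one of the patterns
    } else if (hits == 0) {
      -1   # no piece is one of the patterns
    } else {
      0    # some pieces match, some do not
    }
  }, numeric(1))
}

patterns <- c("34", "35", "37", "48", "65")
checkstrings(c("34|65", "34|40|65", "34a|65a"), patterns)
```

Using vapply with numeric(1) keeps the result a plain numeric vector, so it can be assigned straight to a new column.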

Hope this helps!

Florian
  • There was a small issue with the function when trying to apply it to a vector of strings, but editing the sum() conditions with a lapply to account for the list fixed the issue. It gave me the results I was looking for. Thank you so much. – cgibbs_10 Jul 24 '17 at 16:03
  • 1
    Great, glad we could help! Please consider accepting one of the answers if your question was answered. – Florian Jul 24 '17 at 16:15
0

You can use intersect to get this, i.e.

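f1 <- function(vec, pattern){
  # vec: the allowed codes; pattern: one "|"-separated string to classify
  # unique() keeps a repeated code (e.g. "34|34") from being undercounted,
  # since intersect() drops duplicates
  v1 <- unique(strsplit(pattern, '|', fixed = TRUE)[[1]])
  ind <- intersect(v1, vec)
  if(length(ind) == 0){
    return(-1)
  } else if(length(ind) == length(v1)) {
    return(1)
  } else {
    return(0)
  }
}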

acds <- c("34", "35", "37", "48", "65")
x <- '34|40|65'

f1(acds, x)
#[1] 0
Sotos