Find the location of a character in string

Question

I would like to find the location of a character in a string.

Say: string = "the2quickbrownfoxeswere2tired"

I would like the function to return 4 and 24 -- the character location of the 2s in string.

Why use a regex? Doesn't R have an `.indexOf()` or something? – fge, Jan 10 '13 at 01:44
I doubt it. The developers were Nixers and assumed everyone knew regex. R's string handling is kind of kludgy. – IRTFM, Jan 10 '13 at 02:48

mnel · Accepted Answer · 2016-02-29T23:35:15.943

126

You can use gregexpr

 gregexpr(pattern ='2',"the2quickbrownfoxeswere2tired")


[[1]]
[1]  4 24
attr(,"match.length")
[1] 1 1
attr(,"useBytes")
[1] TRUE

or perhaps str_locate_all from the stringr package, which, as of stringr version 1.0, is a wrapper for stringi::stri_locate_all (earlier versions wrapped gregexpr)

library(stringr)
str_locate_all(pattern ='2', "the2quickbrownfoxeswere2tired")

[[1]]
     start end
[1,]     4   4
[2,]    24  24

Note that you could simply use stringi directly

library(stringi)
# fixed = takes the literal pattern itself, not TRUE/FALSE
stri_locate_all("the2quickbrownfoxeswere2tired", fixed = '2')

Another option in base R, given a character vector x, would be something like

# split each string into single characters, then find the matching positions
lapply(strsplit(x, ''), function(chars) which(chars == '2'))
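
One caveat with the gregexpr() call above: the pattern is treated as a regular expression, so a character such as "." will not be matched literally unless you ask for that. A minimal sketch, assuming the goal is to locate a literal character; the sample string x is made up for illustration:

```r
x <- "a.b.c"  # hypothetical input, not from the question

# As a regular expression, "." matches any character,
# so every position in the string is reported.
regex_hits <- unlist(gregexpr(".", x))

# With fixed = TRUE the pattern is matched literally,
# so only the positions of the dots are reported.
literal_hits <- unlist(gregexpr(".", x, fixed = TRUE))

print(regex_hits)    # 1 2 3 4 5
print(literal_hits)  # 2 4
```

For a plain character like '2' both forms give the same answer, so fixed = TRUE only matters when the character has a special meaning in a regex.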

edited Feb 29 '16 at 23:35

answered Jan 10 '13 at 01:47


how can we extract the integers from the lists/objects returned by your first 3 solutions? – 3pitt Feb 14 '18 at 19:00
Use `regexpr` instead of `gregexpr` to get the integers easily. Or use `unlist` on the output as indicated in another answer below. – Arani Oct 09 '18 at 08:46
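
To make the two suggestions in these comments concrete, here is a small base R sketch. Note that regexpr() only reports the first match in each string, so it is not a drop-in replacement when you want every position. The variable names are just for illustration:

```r
x <- "the2quickbrownfoxeswere2tired"

# gregexpr() returns a list with one element per input string;
# [[1]] picks the element for x and as.integer() drops the attributes.
all_pos <- as.integer(gregexpr("2", x, fixed = TRUE)[[1]])

# regexpr() returns an integer vector rather than a list,
# but it holds only the first match in each string.
first_pos <- as.integer(regexpr("2", x, fixed = TRUE))

print(all_pos)    # 4 24
print(first_pos)  # 4
```

The stringr and stringi results are lists of matrices with start and end columns (as in the printed output above), so indexing along the lines of [[1]][, "start"] should give the same positions.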

score 45 · Answer 2 · answered Oct 06 '13 at 23:16

Here's another straightforward alternative.

> which(strsplit(string, "")[[1]]=="2")
[1]  4 24

Jilber Urbina


Can you explain what the `[[1]]` does? – francoiskroll Apr 01 '19 at 12:56
@francoiskroll, [[1]] represents the first element of the list. – Prafulla Apr 20 '19 at 08:49
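
To expand on that a little: strsplit() is vectorised, so it always returns a list with one character vector per input string, even when you pass a single string. A short sketch with two made-up strings (the name strings is only for illustration):

```r
strings <- c("the2quick", "2tired2")  # hypothetical inputs

# One list element per input string, each a vector of single characters.
chars <- strsplit(strings, "")

first  <- which(chars[[1]] == "2")  # positions in "the2quick"
second <- which(chars[[2]] == "2")  # positions in "2tired2"

print(first)   # 4
print(second)  # 1 7
```

With a single input string there is only one list element, which is why the answer takes [[1]] before comparing.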

score 22 · Answer 3 · answered Sep 27 '15 at 03:32

You can make the output just 4 and 24 using unlist:

unlist(gregexpr(pattern ='2',"the2quickbrownfoxeswere2tired"))
[1]  4 24

score 4 · Answer 4 · answered Oct 08 '15 at 02:35

Find the position of the nth occurrence of str2 in str1 (same parameter order as Oracle SQL INSTR); returns 0 if not found.

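instr <- function(str1, str2, startpos = 1, n = 1) {
    # fixed = TRUE treats str2 as a literal string rather than a regex;
    # gregexpr also finds a match at the very end of str1, which strsplit drops
    pos <- gregexpr(str2, substring(str1, startpos), fixed = TRUE)[[1]]
    # gregexpr returns -1 when there is no match at all
    if (pos[1] == -1 || length(pos) < n) return(0)
    # shift back to a position in the original string
    pos[n] + startpos - 1
}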


instr('xxabcdefabdddfabx','ab')
[1] 3
instr('xxabcdefabdddfabx','ab',1,3)
[1] 15
instr('xxabcdefabdddfabx','xx',2,1)
[1] 0

score 2 · Answer 5 · answered Aug 12 '19 at 09:08

To only find the first locations, use lapply() with min():

my_string <- c("test1", "test1test1", "test1test1test1")

unlist(lapply(gregexpr(pattern = '1', my_string), min))
#> [1] 5 5 5

# or the readable tidyverse form (the %>% pipe comes from magrittr)
library(magrittr)
my_string %>%
  gregexpr(pattern = '1') %>%
  lapply(min) %>%
  unlist()
#> [1] 5 5 5

To only find the last locations, use lapply() with max():

unlist(lapply(gregexpr(pattern = '1', my_string), max))
#> [1]  5 10 15

# or the readable tidyverse form
my_string %>%
  gregexpr(pattern = '1') %>%
  lapply(max) %>%
  unlist()
#> [1]  5 10 15
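
As a possible variation on the above, regexpr() already reports only the first match per string, so the first locations can be had without lapply(). Both approaches return -1 for a string with no match, which is easy to overlook. A sketch, with a no-match string added to the example purely for illustration:

```r
my_string <- c("test1", "test1test1", "no digits")  # third element is made up

# regexpr() gives the first match in each string; -1 means no match.
first_pos <- as.integer(regexpr("1", my_string, fixed = TRUE))

# Last match per string via gregexpr(); the max of a lone -1 stays -1.
last_pos <- vapply(
  gregexpr("1", my_string, fixed = TRUE),
  function(p) max(as.integer(p)),
  integer(1)
)

# Optionally turn the -1 sentinel into NA.
first_pos[first_pos == -1L] <- NA_integer_

print(first_pos)  # 5 5 NA
print(last_pos)   # 5 10 -1
```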

score 2 · Answer 6 · answered Oct 03 '20 at 18:50

You could use grep as well:

grep('2', strsplit(string, '')[[1]])
#4 24

AlexB

