I'm trying to use str_locate_all to find the index of the third occurrence of '/' in a dplyr chain but it's not returning the correct index.

  ga.categoryViews.2016 <- ga.data %>%
    mutate(province = str_sub(pagePath,2,3),
           index = str_locate_all(pagePath, '/')[[1]][,"start"][3],
           category = str_sub(pagePath, 
                              str_locate_all(pagePath, '/')[[1]][,"start"][3] + 1,
                              ifelse(str_detect(pagePath,'\\?'), str_locate(pagePath, '\\?') - 1, str_length(pagePath))
                              )
             )

An example of what it's returning is in the image below; the first column is pagePath and the fourth is the index.

[Image: sample of the returned data frame]

It seems to be always returning an index of 12.

Any help is appreciated.

Thanks,

Joseph Noirre

1 Answer

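You need to use rowwise(). Without it, str_locate_all() is called on the whole column and returns a list with one matrix per row; [[1]] then picks only the first row's matrix, so every row gets the same index. With rowwise() the expression is evaluated one row at a time, i.e.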

library(dplyr)
library(stringr)

df %>% 
 rowwise() %>% 
 mutate(new = str_locate_all(v1, '/')[[1]][,2][3])

Source: local data frame [2 x 2]
Groups: <by row>

# A tibble: 2 x 2
#                              v1   new
#                           <chr> <int>
#1 /on/srgsfsfs-gfdgdg/dfgsdfg-df    20
#2        /on/sgsddg-dfgsd/dfg-dg    17
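To see why the ungrouped version gives the same index on every row, it helps to look at what str_locate_all() returns for a whole vector. The sketch below uses the two sample paths from this answer; `paths`, `locs` and `third_slash` are illustrative names, and the vapply() line is just one possible vectorised alternative to rowwise().

```r
library(stringr)

paths <- c('/on/srgsfsfs-gfdgdg/dfgsdfg-df', '/on/sgsddg-dfgsd/dfg-dg')

# str_locate_all() returns a list with one match matrix per input string
locs <- str_locate_all(paths, '/')
length(locs)
# 2

# [[1]] selects the matrix of the FIRST string only; inside mutate()
# this single value is recycled to every row
locs[[1]][, "start"][3]
# 20

# One value per string: index each matrix separately.
# Strings with fewer than three '/' give NA.
third_slash <- vapply(locs, function(m) as.integer(m[, "start"][3]), integer(1))
third_slash
# 20 17
```

The vapply() form avoids row-by-row evaluation of the whole mutate() call, which may matter on a large data frame.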

DATA

x <- c('/on/srgsfsfs-gfdgdg/dfgsdfg-df', '/on/sgsddg-dfgsd/dfg-dg')
df <- data.frame(v1 = x, stringsAsFactors = FALSE)

df
#                              v1
#1 /on/srgsfsfs-gfdgdg/dfgsdfg-df
#2        /on/sgsddg-dfgsd/dfg-dg
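
Applied to the pipeline in the question, the same fix might look like the sketch below. The `ga.data` frame here is made-up sample data (including the `?page=2` query string), and the helper column `query` is introduced only for illustration; it is not part of the original code.

```r
library(dplyr)
library(stringr)

# Hypothetical sample data standing in for the real ga.data
ga.data <- data.frame(
  pagePath = c('/on/srgsfsfs-gfdgdg/dfgsdfg-df?page=2',
               '/on/sgsddg-dfgsd/dfg-dg'),
  stringsAsFactors = FALSE
)

result <- ga.data %>%
  rowwise() %>%
  mutate(province = str_sub(pagePath, 2, 3),
         # position of the third '/', evaluated per row thanks to rowwise()
         index = str_locate_all(pagePath, '/')[[1]][, "start"][3],
         # position of the first '?', or NA when there is no query string
         query = str_locate(pagePath, '\\?')[, "start"],
         # text between the third '/' and the '?' (or the end of the path)
         category = str_sub(pagePath,
                            index + 1,
                            ifelse(is.na(query), str_length(pagePath), query - 1))) %>%
  ungroup()

result$category
# "dfgsdfg-df" "dfg-dg"
```

The ungroup() at the end drops the row-wise grouping so that later steps behave like ordinary column-wise dplyr verbs again.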
Sotos