Extracting the second-to-last word between the special characters "/"

Question

I would like to extract the second-to-last string delimited by the '/' symbol. For example:

url <- c('https://example.com/names/ani/digitalcod-org', 'https://example.com/names/bmc/ambulancecod.org')
df <- data.frame(url)

I want the second-to-last segment of each URL, i.e. the word between the last two '/' characters, which here would be 'ani' and 'bmc'.

So I tried this:

 library(stringr)
 df$name <- word(df$url, -2)

I need output as follows:

name 
ani
bmc

score 5 · Accepted Answer · answered Jan 31 '19 at 13:47

You can use word, but you need to specify the separator, because word splits on a single space by default (sep = fixed(" ")):

library(stringr)

word(url, -2, sep = '/')
#[1] "ani" "bmc"

Sotos

There should be more efficient ways. I was just continuing your train of thought – Sotos Jan 31 '19 at 13:54

score 1 · Answer 2 · answered Jan 31 '19 at 13:45

Try this:

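library(stringr)

# Extract every run of 2+ word characters that is followed by "/".
# For these URLs that gives one column per URL, and row 3 holds the
# second-to-last segment.
as.data.frame(sapply(str_extract_all(df$url, "\\w{2,}(?=\\/)"), "["))[3, ]
#   V1  V2
#3 ani bmc

# Rows 2:3 show the two segments before the last one.
as.data.frame(sapply(str_extract_all(df$url, "\\w{2,}(?=\\/)"), "["))[2:3, ]
#   V1    V2
#2 names names
#3   ani   bmc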
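
The fixed row index 3 works here because both URLs have the same number of segments before the last one. As a rough sketch (not part of the answer above), the same lookahead idea can be written in base R so that the last match is taken per URL, whatever the depth. The helper name last_before_slash and the pattern "[^/]+(?=/)" are assumptions made for illustration.

```r
# Hypothetical helper: return the last segment that is directly followed
# by "/", which is the second-to-last segment of the path.
last_before_slash <- function(x) {
  # perl = TRUE is required for the lookahead (?=/)
  m <- regmatches(x, gregexpr("[^/]+(?=/)", x, perl = TRUE))
  vapply(m, function(hits) {
    # no "/" in the input means no match at all, so return NA
    if (length(hits) == 0) NA_character_ else hits[length(hits)]
  }, character(1))
}

url <- c('https://example.com/names/ani/digitalcod-org',
         'https://example.com/names/bmc/ambulancecod.org')
last_before_slash(url)
# [1] "ani" "bmc"
```

Taking the last match instead of a fixed row means URLs of different depths can be mixed in one vector.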

NelsonGon

markus · Answer 3 · 2019-01-31T14:21:25.107

A non-regex approach using basename:

basename(mapply(sub, pattern = basename(url), replacement = "", x = url, fixed = TRUE))
#[1] "ani" "bmc"

basename(url) "removes all of the path up to and including the last path separator (if any)" and returns

[1] "digitalcod-org"   "ambulancecod.org"

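Use mapply to replace this result with "" in each element of url, which leaves a path ending in "/ani/" or "/bmc/", then call basename again. basename ignores the trailing separator, so it returns the second-to-last segment.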
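
A shorter variant of the same idea, as a sketch only: dirname drops the last component, so wrapping it in basename should give the segment before it. This treats the URL as if it were a file path, which is an assumption that holds for the example URLs but may not hold for every URL.

```r
url <- c('https://example.com/names/ani/digitalcod-org',
         'https://example.com/names/bmc/ambulancecod.org')

# dirname() removes the last path component; basename() then keeps the
# last component of what is left.
second_last <- basename(dirname(url))
second_last
```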

edited Jan 31 '19 at 14:21

answered Jan 31 '19 at 13:47

markus

score 0 · Answer 4 · answered Jan 31 '19 at 13:48

Use gsub with the pattern

.*?([^/]+)/[^/]+$

In R:

urls <- c('https://example.com/names/ani/digitalcod-org','https://example.com/names/bmc/ambulancecod.org' )
gsub(".*?([^/]+)/[^/]+$", "\\1", urls)

This yields

[1] "ani" "bmc"

See a demo on regex101.com.
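
To make the idea easier to follow, here is a sketch of an anchored, greedy variation with each piece commented. It is my own variation for illustration, not the pattern from this answer, and it uses sub since each URL can match at most once.

```r
urls <- c('https://example.com/names/ani/digitalcod-org',
          'https://example.com/names/bmc/ambulancecod.org')

# Variation on the pattern above (an assumption, not the answer's regex):
#   ^.*/      everything up to the slash before the wanted segment (greedy)
#   ([^/]+)   the wanted segment: one or more non-slash characters, captured
#   /[^/]+$   the final slash and the last segment, up to the end of the string
pattern <- "^.*/([^/]+)/[^/]+$"

second_last <- sub(pattern, "\\1", urls)
second_last
# [1] "ani" "bmc"

# sub() returns its input unchanged when nothing matches, so mark
# non-matching inputs as NA explicitly.
second_last[!grepl(pattern, urls)] <- NA_character_
```

The final line matters mainly for messy input: without it, a string that does not match the pattern would be passed through as if it were a result.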

Jan

score 0 · Answer 5 · answered Jan 31 '19 at 14:10

Here is a solution using strsplit

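words <- strsplit(url, '/')   # split each URL into its '/'-separated parts
L <- lengths(words)           # number of parts per URL
# pick the second-to-last part of each URL
vapply(seq_along(words), function(k) words[[k]][L[k] - 1], character(1))
# [1] "ani" "bmc"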
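
Note that words[[k]][L[k] - 1] assumes every URL splits into at least two parts. With a single part the index is 0, the result has length zero, and vapply stops with an error. A minimal sketch that returns NA in that case instead; second_last_segment is a hypothetical name, not an existing function.

```r
# Hypothetical helper: second-to-last part of each string after splitting
# on sep, or NA when there are fewer than two parts.
second_last_segment <- function(x, sep = "/") {
  parts <- strsplit(x, sep, fixed = TRUE)   # fixed = TRUE: sep is not a regex
  vapply(parts, function(p) {
    n <- length(p)
    if (n < 2) NA_character_ else p[n - 1]
  }, character(1))
}

url <- c('https://example.com/names/ani/digitalcod-org',
         'https://example.com/names/bmc/ambulancecod.org')
second_last_segment(url)
# [1] "ani" "bmc"
```

Iterating over the list directly also avoids the separate lengths vector and the seq_along index.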

niko
