Questions tagged [stringr]

The stringr package is a wrapper for the R stringi package that provides consistent function names and error handling for string manipulation. It is part of the Tidyverse collection of packages. Use this tag for questions involving the manipulation of strings specifically with the stringr package. For general R string manipulation questions use the R tag together with the generic string tag.

R's stringr package provides a more consistent user interface to base R's string manipulation and regular expression functions.
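
As a rough illustration of the "consistent interface" mentioned above, here is a minimal sketch assuming stringr is installed. The vector `x` and the result names are made up for the example. Each function takes the string vector as its first argument and the pattern second, and missing values are passed through as `NA`.

```r
library(stringr)

# Hypothetical input vector, including a missing value
x <- c("apple pie", "banana split", NA)

# String first, pattern second, for every function
has_an   <- str_detect(x, "an")           # does the pattern occur?
upper_a  <- str_replace_all(x, "a", "A")  # replace every match
first_wd <- str_extract(x, "[a-z]+")      # first run of lowercase letters
n_chars  <- str_length(x)                 # number of characters

has_an    # FALSE TRUE NA
first_wd  # "apple" "banana" NA
```

The base R counterparts (`grepl`, `gsub`, `regmatches`, `nchar`) differ from one another in argument order and in how they treat `NA`, which is the inconsistency stringr aims to smooth over.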

Other resources

Posts on R-bloggers

Related tags

regex
stringi

2501 questions

1 answer

r split a column in a data frame based on square brackets

I have a data frame: x <- data.frame(a = letters[1:7], b = letters[2:8], c = c("bla bla [ text1 ]", "bla bla [text2]", "how how [text3 ]", "wow wow [ text4a ] [ text4b ]", "ba ba [ text5a ][ text5b]", "my text A", "my text B"),…

r regex stringr square-bracket

asked Aug 08 '17 at 19:19

user3245256

0 answers

gsub seems to be replacing everything between the first and last character of the pattern, rather than repeatedly replacing the pattern

In the code given below I want to replace all occurrences of "" with "link", but gsub isn't recognizing the repeated occurrences: stringx<- " word1 word1 word1 /n word word word word word word abcd" gsub('<{1}.*…

r regex gsub stringr

asked Aug 03 '17 at 01:16

Mrinalini Garg

1 answer

R: simple keyword detection

I want to check if any of a set of "keywords" appear in a string. So, for "text" below, the result should be TRUE (or 1), and for text_2 it should be FALSE (or 0). keywords <- c("one", "two", "three", "four") #set of keywords text <- "Blah blah one…

r stringr grepl

asked Aug 02 '17 at 21:44

wimlouw

1 answer

Extract youtube video ID from url with R stringr regex

I'm looking to extract only the video id string from a column of youtube links. The stringr function I'm currently using is this: str_extract(data$link, "\\b[^=]+$") This works for most standard youtube links with the id at the end of the url…

r regex stringr

asked Aug 01 '17 at 15:27

Paul Campbell

3 answers

How to check if a string is made up entirely of certain string patterns

I have a vector of strings which I need to check to see if they fit certain criteria. For example, if a certain string, say "34|40|65", is made up entirely of these patterns: c("34", "35", "37", "48", "65"), then I want to return 1, if the string…

r string stringr stringi

asked Jul 24 '17 at 13:51

cgibbs_10

2 answers

Names string preparation for sex impute

I'm new at R and I need to prepare a column of names and then impute sex, but I'm having some problems with the preparation of the strings, specifically this is an example of what I have: Name example: "alberto eduardo etchegaray de la cerda…

r regex string stringr

asked Jul 21 '17 at 15:09

Pedro López

1 answer

Why am I unable to install the R package stringi?

Problem installing stringi package during R library installation. During the installation of the package, I get an error when I connect to the URL and receive "icudt551.zip". However, the current situation is that if you have the file "icudt551.zip"…

r stringr stringi rhadoop

asked Jul 19 '17 at 13:26

user8331724

2 answers

Finding Abbreviations in Data with R

In my data (which is text), there are abbreviations. Are there any functions or code that search for abbreviations in text? For example, detecting 3-4-5 capital letter abbreviations and letting me count how often they happen. Much appreciated!

r regex tidyr stringr tidytext

asked Jun 13 '17 at 18:20

Alex

1 answer

stringr str_locate_all not returning the proper index in a dplyr string

I'm trying to use str_locate_all to find the index of the third occurrence of '/' in a dplyr chain but it's not returning the correct index. ga.categoryViews.2016 <- ga.data %>% mutate(province = str_sub(pagePath,2,3), index =…

r dplyr stringr

asked Jun 08 '17 at 14:18

Joseph Noirre

1 answer

R: Extracting string if it is an element of a list

I want to dummy-code whether some string is contained in another (which is structured). For example: player <- c("Michael Jordan", "Steve Kerr", "Michael Jordan", "Toni Kukoc") bulls <- c("Jordan, Michael Jeffrey", "Pippen, Scottie; Harper, Ron", …

r string stringr

asked Jun 05 '17 at 20:00

user6550364

1 answer

TextMining in R - Extracting 2 gram for only few terms and 1 gram for rest

text = c('the nurse was extremely helpful', 'she was truly a gem','helping', 'no issue', 'not bad') I want to extract 1-gram tokens for most words and 2-gram tokens for words such as extremely, no, not. For example, when I get tokens they should be as…

r tm stringr rweka

asked May 17 '17 at 11:32

MysticRenge

1 answer

Alphabet conversion - Cyrillic to Latin

I have a list of names and surnames written in Cyrillic. head(text, n = 20) unique(clients$RODITEL) 1 2 ЃОРЃИ 3 ALEKSANDAR 4 000000000000 5 ТР4АЈЧЕ 6 …

r text-mining stringr

asked May 09 '17 at 12:28

Prometheus

3 answers

get last part of a string

I would like to get the last substring of a variable (the last part after the underscore), in this case: "myvar". x = "string__subvar1__subvar2__subvar3__myvar" my attempts result in a match starting from the first substring, e.g.…

r regex stringr

asked May 08 '17 at 15:08

Henk

2 answers

Extract segment of filename

I'm trying to extract a filename and save the dataframe with that same name. The problem I have is that if the filename for some reason is inside a folder with a similar word, stringr will return that word as well. filename <-…

r stringr

asked Apr 30 '17 at 10:42

FilipeTeixeira

2 answers

Extracting multiple strings from poorly defined user input data

I am looking to create a lookup table from data where entries in a column (user_entry) are in different formats and may contain more than one instance per row. # create example dataframe. id <- c(1111,1112,1113,1114) user_entry <-…

r regex stringr

asked Apr 18 '17 at 15:30

lapsel
