Questions tagged [stringr]

The stringr package is a wrapper for the R stringi package that provides consistent function names and error handling for string manipulation. It is part of the Tidyverse collection of packages. Use this tag for questions involving the manipulation of strings specifically with the stringr package. For general R string manipulation questions use the R tag together with the generic string tag.

R's stringr package provides a more consistent user interface to base R's string manipulation and regular expression functions.
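As a rough illustration of what "consistent interface" means here, the sketch below puts a few base R calls next to their stringr counterparts. The vector `x` and the result names are made up for the example; the point is only that stringr functions share the `str_` prefix and take the string as the first argument.

```r
library(stringr)

# Hypothetical sample data, used only for this illustration
x <- c("apple pie", "banana split", "cherry tart")

# Base R: the pattern comes first and the data last
base_detect  <- grepl("an", x)
base_replace <- sub("a", "A", x)
base_extract <- regmatches(x, regexpr("[a-z]+", x))

# stringr: the string always comes first, then the pattern
str_detect_res  <- str_detect(x, "an")
str_replace_res <- str_replace(x, "a", "A")     # replaces the first match only, like sub()
str_extract_res <- str_extract(x, "[a-z]+")     # first run of lowercase letters
```

Because the string is always the first argument, the stringr versions also slot into a pipe without extra argument juggling.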

2501 questions
11
votes
4 answers

Extract last 4-digit number from a series in R using stringr

I would like to flatten lists extracted from HTML tables. A minimal working example is presented below. The example depends on the stringr package in R. The first example exhibits the desired behavior. years <- c("2005-",…
Daniel
  • 691
  • 2
  • 8
  • 19
10
votes
6 answers

Matching words from vectors of strings in R

I'm trying to clean up a database by matching a messy list of site names with an approved list. As an example, the preferred site name might be 'Cotswold Water Park Pit 28' but the site has been entered into the database as: 'Pit 28', '28', 'CWP Pit…
James
  • 1,164
  • 2
  • 15
  • 36
10
votes
4 answers

How to abbreviate a string in R

I need to abbreviate department names by their first character, so that strDept="Department of Justice" becomes strDeptAbbr = "DoJ". How can I abbreviate a string using stringr? Thank you
IVIM
  • 2,167
  • 1
  • 15
  • 41
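
One possible approach to the abbreviation question above, offered as a minimal sketch rather than the accepted answer: split the name on spaces, take the first character of each word, and paste the characters back together. It assumes words are separated by single spaces; the variable names follow the question.

```r
library(stringr)

strDept <- "Department of Justice"

# Split on spaces, keep the first character of each word, then glue them together
words <- str_split(strDept, " ")[[1]]
strDeptAbbr <- str_c(str_sub(words, 1, 1), collapse = "")

strDeptAbbr
#> [1] "DoJ"
```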
10
votes
8 answers

Remove double quote \" symbol from string

I need to remove \" from a vector. This is my data: data <- c("\"https://click.linksynergy.com/link?id=RUxZriH*PWc&offerid=323058.1803224&type=2&murl=https%3A%2F%2Fwww.udemy.com%2Flinux-linux-security-masterclass-3-in-1%2F",…
antecessor
  • 2,688
  • 6
  • 29
  • 61
10
votes
5 answers

Using dplyr and stringr to replace all values starts with

my df > df <- data.frame(food = c("fruit banana", "fruit apple", "fruit grape", "bread", "meat"), sold = rnorm(5, 100)) > df food sold 1 fruit banana 99.47171 2 fruit apple 99.40878 3 fruit grape 99.28727 4 bread …
Tomas Ericsson
  • 347
  • 1
  • 2
  • 10
10
votes
3 answers

stringr, str_extract: how to do positive lookbehind?

Very simple problem. I just need to capture some strings using a regex positive lookbehind, but I don't see a way to do it. Here's an example, suppose I have some strings: library(stringr) myStrings <- c("MFG: acme", "something else", "MFG:…
Angelo
  • 2,936
  • 5
  • 29
  • 44
10
votes
2 answers

Extract last word in a string after comma if there are multiple words else the first word

I have data where the words are as follows: location<- c("xyz, sss, New Zealand", "USA", "Pris,France") id<- c(1,2,3) df<-data.frame(location,id) I would like to extract the country name from the data. The tricky part is if I extract just the last…
user3570187
  • 1,743
  • 3
  • 17
  • 34
10
votes
6 answers

parsing html containing &nbsp; (non-breaking space)

I am using rvest to parse a website. I'm hitting a wall with these little non-breaking spaces. How does one remove the whitespace that is created by the &nbsp; element in a parsed html document? library("rvest") library("stringr") minimal <-…
AndrewMacDonald
  • 2,870
  • 1
  • 18
  • 31
10
votes
2 answers

Removing Two Characters From A String

Related question here. So I have a character vector with currency values that contain both dollar signs and commas. However, I want to try and remove both the commas and dollar signs in the same step. This removes dollar signs = d = c("$0.00",…
ATMathew
  • 12,566
  • 26
  • 69
  • 76
9
votes
3 answers

How do I extract a file/folder_name only from a path?

Unfortunately I suck at regexp. If I have a path like so: /long/path/to/file, I just need to extract file. If someone supplies file/ I just need file. If someone supplies /file/, I still need just file. I've been using stringr functions as a crutch…
Maiasaura
  • 32,226
  • 27
  • 104
  • 108
9
votes
2 answers

dplyr filter condition to distinguish between unicode symbol and its unicode representation

I am trying to filter the Symbol column based on whether it's of the form \uxxxx. This is easy visually, that is, some look like $, ¢, £, and others like \u058f, \u060b, \u07fe. But I cannot seem to figure it out using stringi /…
stevec
  • 41,291
  • 27
  • 223
  • 311
9
votes
5 answers

Regex to remove leading zeros in R, unless the final (or only) character is zero

gsub("(?<![0-9])0+", "", c("005", "0AB", "000", "0"), perl = TRUE) #> [1] "5" "AB" "" "" gsub("(^|[^0-9])0+", "\\1", c("005", "0AB", "000", "0"), perl = TRUE) #> [1] "5" "AB" "" "" The regular expression above is from this SO thread…
Display name
  • 4,153
  • 5
  • 27
  • 75
9
votes
3 answers

R regex - extract words beginning with @ symbol

I'm trying to extract twitter handles from tweets using R's stringr package. For example, suppose I want to get all words in a vector that begin with "A". I can do this like so library(stringr) # Get all words that begin with…
Ben
  • 20,038
  • 30
  • 112
  • 189
9
votes
1 answer

">" is not matched by "[[:punct:]]" when using `stringr::str_replace_all`?

I find this really odd : pattern <- "[[:punct:][:digit:][:space:]]+" string <- "a . , > 1 b" gsub(pattern, " ", string) # [1] "a b" library(stringr) str_replace_all(string, pattern, " ") # [1] "a > b" str_replace_all(string,…
moodymudskipper
  • 46,417
  • 11
  • 121
  • 167
9
votes
3 answers

Use stringr in R to find the remaining string after last substring

How can I use str_match to extract the remaining string after the last substring? For example, for the string "apples and oranges and bananas with cream", I'd like to extract the remainder of this string after the last occurrence of " and " to…
James N
  • 315
  • 2
  • 9