Questions tagged [stringr]

The stringr package is a wrapper around the stringi package for R that provides consistent function names and error handling for string manipulation. It is part of the tidyverse collection of packages. Use this tag for questions about manipulating strings specifically with the stringr package. For general R string manipulation questions, use the R tag together with the generic string tag.

R's stringr package provides a more consistent user interface to base R's string manipulation and regular expression functions.
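As a rough illustration of what "consistent" means here (a minimal sketch, not taken from the package documentation; the vector `x` is made up for the example): base R functions such as `grepl()` take the pattern first, while stringr functions take the string vector first and share the `str_` prefix.

```r
library(stringr)

# Hypothetical example data, not from the tag description.
x <- c("apple", "banana", "cherry")

# Base R: the pattern is the first argument, the strings come second.
grepl("an", x)

# stringr: the strings are always the first argument, the pattern second.
str_detect(x, "an")

# The same argument order holds for other str_ functions.
str_replace(x, "a", "A")  # replaces the first "a" in each element
str_count(x, "a")         # counts matches of "a" in each element
```

Because the string vector always comes first, stringr calls tend to chain naturally in a pipe, which is one reason the package is often used alongside dplyr.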

2501 questions
8 votes, 2 answers

In regex, mystery Error: assertion 'tree->num_tags == num_tags' failed in executing regexp: file 'tre-compile.c', line 634

Assume 900+ company names pasted together to form a regex pattern using the pipe separator -- "firm.pat". firm.pat <- str_c(firms$firm, collapse = "|") With a data frame called "bio" that has a large character variable (250 rows each with 100+…
lawyeR
8 votes, 2 answers

R regex gsub separate letters and numbers

I have a string that's mixed letters and numbers: "The sample is 22mg" I'd like to split strings where a number is immediately followed by letter like this: "The sample is 22 mg" I've tried this: gsub('[0-9]+[[aA-zZ]]', '[0-9]+ [[aA-zZ]]', 'This…
screechOwl
7 votes, 3 answers

Removing second and subsequent occurrences of decimal point in string

I want to remove second and subsequent occurrences of decimal point in string. My attempt is below: library(stringr) str_remove(string = "3.99-0.13", pattern = "\\.") [1] "399-0.13" sub("\\.", "", "3.99-0.13") [1] "399-0.13" However, I want the…
MYaseen208
7 votes, 2 answers

How can I split a string "a\n\nb" into c("a", "", "", "b")?

When I split "\n\na\n" by "\n" with stringr::str_split, I obtain c("", "", "a", ""). I expected to obtain c("a", "", "", "b") when splitting "a\n\nb" by "\n" in the same way, but I obtained c("a", "", "b") instead. How can I obtain c("a", "", "", "b") by…
김용석
7 votes, 5 answers

Select a string ending at the first instance of character in Regular Expressions

Say I have the following string: pos/S881.LMG1810.QE009562.mzML And wish to select the beginning from that string: pos/S881. I can use the following regex expression to get the start of the string (^), then any character (.), any number of time…
Henry Holm
7 votes, 6 answers

Count how many times strings from one data frame appear in another data frame in R dplyr

I have two data frames that look like this: df1 <- data.frame(reference=c("cat","dog")) print(df1) #> reference #> 1 cat #> 2 dog df2 <- data.frame(data=c("cat","car","catt","cart","dog","dog","pitbull")) print(df2) #> data #> 1 …
LDT
7 votes, 1 answer

R package long time to install - Source or Binary type

I am trying to install a package called stringi using the command install.packages("stringi"). It doesn't throw any error message, but the installation has not finished yet. I see a lot of messages in my console, which keeps running for more…
The Great
7 votes, 2 answers

Remove part of a string based on overlapping patterns

I have the following data: dat <- data.frame(x = c("this is my example text", "and here is my other text example", "my other text is short"), some_other_cols = c(1, 2, 2)) Further, I have the following vector of…
deschen
7 votes, 1 answer

What's the difference between the str_detect function in stringr and grepl and grep?

I'm starting to do a lot of string matching in my work and I'm curious as to what the differences between the three functions are, and in what situations someone would use one over the other.
Jeffrey Brabec
7 votes, 8 answers

Getting the unique count of strings from a text string

I am wondering how to get the unique number of characters from the text string. Let's say I am looking for a count of repetitions of the words apples, bananas, pineapples, grapes in this string. A<- c('I have a lot of pineapples, apples and…
user3570187
7 votes, 5 answers

Split & extract part of string (between a "." and digit) in R

I have a character variable (companies) with observations that look like this: "612. Grt. Am. Mgt. & Inv. 7.33" "77. Wickes 4.61" "265. Wang Labs 8.75" "9. CrossLand Savings 6.32" "228. JPS Textile Group 2.00" I'm trying to split these strings…
Nina
7 votes, 2 answers

Efficient way to add numbers to alphanumeric strings in R

I have a data.frame with ids composed of sequences of alphanumeric characters (e.g., id = c(A001, A002, B013)). I was looking for an easy function in stringr or stringi that would do math with these strings (id + 1 should return c(A002,…
Matias Andina
7 votes, 5 answers

Split string every n characters new column

Suppose I have a data frame like this with a string vector, var2 var1 var2 1 abcdefghi 2 abcdefghijklmnop 3 abc 4 abcdefghijklmnopqrst What is the most efficient way to split var2 every n characters into new columns until the end…
Mikey
7 votes, 4 answers

How to replace an exact string using stringr functions?

I'm trying to replace exact strings in a column using stringr functions. The dataset I try it on is this: data <- data.frame( column = c("Value", "Values", "Value", "Values") ) data column 1 Value 2 Values 3 Value 4 Values I want to replace…
MetaPhilosopher
7 votes, 1 answer

Subset vector not containing word in piped operation in R (regex)

How do I subset a vector for elements that do not contain a word in a piped operation? (I'm really into piping) I'm hoping there's some way to invert str_subset. In the following example, I'd like to just return the second element of x instead of…
David Rubinger