Questions tagged [strsplit]

strsplit is a function in R and MATLAB that splits the elements of a character vector around a given delimiter.

strsplit is a function in R and MATLAB that splits the elements of a character vector into substrings:

# R
strsplit(x, split, fixed = FALSE)

% MATLAB
strsplit(x, split)

strsplit splits a character string, or each element of a vector of character strings, using a regular expression or a literal (fixed) string. The result is a list (R) or a cell array (MATLAB); in R, each list item holds the substrings of the corresponding element of x.

  • x: a character string or vector of character strings to split.
  • split: the character string at which to split x.
    In R, if split is an empty string (""), then x is split between every character.
  • [R only] fixed: whether the split argument should be treated as fixed (i.e. literally). The default is FALSE, which means that split is treated as a regular expression.
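
The arguments above can be illustrated with a minimal R sketch. The input strings and variable names (by_dash, regex_dot, literal_dot, chars) are made up for illustration and do not come from any question on this page.

```r
# Hypothetical input: a character vector with two elements.
x <- c("a-b-c", "d.e")

# One list item per element of x; "d.e" contains no "-" and stays whole.
by_dash <- strsplit(x, split = "-")

# With the default fixed = FALSE, split is a regular expression,
# so "." matches every character and only empty strings remain.
regex_dot <- strsplit("d.e", split = ".")

# With fixed = TRUE, "." is treated literally.
literal_dot <- strsplit("d.e", split = ".", fixed = TRUE)

# An empty split string separates x between every character.
chars <- strsplit("abc", split = "")

print(by_dash)
print(literal_dot)
print(chars)
```

Because the result is always a list, wrapping a call in unlist() or indexing with [[1]] is a common way to get a plain character vector back when x has a single element.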
702 questions
3
votes
1 answer

How to undo strsplit to put multiple characters into one

Let's say I have a string of words txt = "The licenses for most software" length(txt) 1 I can use strsplit to split it into its composite words t = unlist(strsplit(txt, split=" ")) length(t) 5 Now I want to undo what I did. How can I reconnect 5…
wen
  • 1,875
  • 4
  • 26
  • 43
3
votes
2 answers

expand.grid when one variable is really two columns

I have a data set with districts, counties and years. If a given district/county combination occurs in any year I want that combination to occur in every year. Below are two ways I have figured out to do this. The first approach uses a function…
Mark Miller
  • 12,483
  • 23
  • 78
  • 132
3
votes
4 answers

removing certain pattern from a string

I have a vector like below: t <- c("8466 W Peoria Ave", "4250 W Anthem Way", .....) I want to convert it into: t_mod <-c("Peoria Ave", "Anthem Way".....) That is I want to remove numbers and single characters from my vector of strings. Any help…
Ayush Raj Singh
  • 863
  • 5
  • 16
  • 20
3
votes
2 answers

R: I have to do Softmatch in String

I have to do softmatch in one column of data frame with the given input string, like col <- c("John Collingson","J Collingson","Dummy Name1","Dummy Name2") inputText <- "J Collingson" #Vice-Versa inputText <- "John Collingson" I want to retrieve…
user_az
  • 363
  • 2
  • 3
  • 17
3
votes
1 answer

Octave vectorize strsplit return value into separate variables

I have a file with a list of records which I parse one line at a time. Each record is newline delimited and each value is space delimited. This is just a simplified example, but it has a similar structure to the real data. Bob blue pizza Sally red…
Charity Leschinski
  • 2,886
  • 2
  • 23
  • 40
3
votes
1 answer

strsplit by variable separator

I have some strings of data separated by " " that needs to be split into columns. Is there an easy way to split the data by every nth separator. For example, the first value in x tells you that the first 4 values in y correspond to the first trial.…
jose
  • 103
  • 5
3
votes
3 answers

R: splitting a numeric string

I'm trying to split a numeric string of 40 digits (i.e. splitting 123456789123456789123456789 into 1 2 3 4 etc.) Unfortunately, strsplit doesn't work as it requires characters, and converting the string using as.character doesn't work as it is very…
rvrvrv
  • 881
  • 3
  • 9
  • 29
2
votes
1 answer

split string line by line and variablize i.e assign it to GITHUB_OUTPUT - workflow

Github action run invokes a powershell that returns as below: Powershell return function: return "$($psitem.Key)=$($psitem.Value)" The return is assigned to a github action variable returnvalue The return contains a list of key=value pairs…
Ashar
  • 2,942
  • 10
  • 58
  • 122
2
votes
4 answers

How to split a comma and colon separated column into respective columns in R?

Say for example I have a column that looks something like: name:Michael,Age:31,City:NYC How could I split this column into separate columns such that it would yield a result similar as a data frame to: name | Age | City 1 Michael | 31 |…
minimalbob
  • 51
  • 4
2
votes
3 answers

How to split characters in R while keeping the contents inside the brackets?

I have some amino acid modifications, something like: example <- c('_(Acetyl (Protein N-term))DDDIAAM(Oxidation (M))CK_') I would like to split such a sequence into a state similar to the following: example2 <- c('_','(Acetyl (Protein…
leelee
  • 77
  • 3
2
votes
2 answers

Get a vector which contains the string before one specific character from each element in a dataframe column

I have a dataframe that looks like this: df = data.frame(V1= c(1:3), V2 = c("abc-1-10", "def-2-19", "ghi-3-937")) I would like to be able to add a column V3 to this dataframe such that the column V3 contains the string before the first hyphen ("-")…
ag14
  • 867
  • 1
  • 8
  • 15
2
votes
3 answers

Split R string into individual characters

I think this should be simple, but I can't find another example that works for my purposes. I have many DNA sequences in 1 column in R, but I would like to split them into many columns with 1 base pair per column. For…
Doda
  • 285
  • 1
  • 9
2
votes
4 answers

A way to strsplit and replace all of one character with several variations of alternate strings?

I am sure there is a simple solution and I am just getting too frustrated to work through it but here is the issue, simplified: I have a string, ex: AB^AB^AB^^BAAA^^BABA^ I want to replace the ^s (so, 7 characters in the string), but iterate through…
2
votes
2 answers

R: Using STRSPLIT and GREP on vector elements on large dataset takes too long

(My first StackFlow question) My goal is to improve the ETL process for identifying which NetApp file shares are related to which AD permission distribution groups. Currently an application named 'TreeSize' scans a number of volumes and outputs a…
2
votes
2 answers

RegEx for matching all commas unless they are enclosed between parentheses or brackets

Consider the following code in R: x <- "A, B (C, D, E), F, G [H, I, J], K (L (M, N), O), P (Q (R, S (T, U)))" strsplit(x, split = "some regex here") I would like this to return something resembling a list containing the character vector "A" "B (C,…
Clarinetist
  • 1,097
  • 18
  • 46