Questions tagged [strsplit]

strsplit is a function in R and MATLAB which splits the elements of a character vector around a given delimiter.

strsplit is a function in R (documentation) and MATLAB (documentation), which splits the elements of a character vector into substrings:

# R:  
strsplit(x, split, fixed=FALSE)
% MATLAB
strsplit(x, split);

Splits a character string or vector of character strings using a regular expression or a literal (fixed) string. The strsplit function outputs a list (R) or cell array (MATLAB), where each list item corresponds to an element of x that has been split.

  • x: a character string or vector of character strings to split.
  • split: the character string on which to split x.
    In R, if split is an empty string (""), then x is split between every character.
  • fixed [R only]: whether the split argument should be treated as fixed (i.e. literally). The default is FALSE, which means that split is treated as a regular expression.
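
As a brief illustration of the arguments described above, the following R sketch contrasts regular-expression and literal splitting. The input vector and variable names are made up for the example; only base R's strsplit is used.

```r
# Hypothetical input: two strings using "." as the delimiter
x <- c("a.b.c", "d.e")

# By default split is a regular expression, so "." matches ANY character
# and every character of x is consumed as a delimiter.
as_regex <- strsplit(x, ".")

# fixed = TRUE treats split literally, so only real dots are delimiters.
literal <- strsplit(x, ".", fixed = TRUE)

# Escaping the dot in the regular expression gives the same result.
escaped <- strsplit(x, "\\.")

# An empty split string breaks the input between every character.
chars <- strsplit("abc", "")

# The result is a list with one element per element of x.
print(literal)
print(chars)
```

The same distinction applies to other regex metacharacters such as | and +: passing fixed = TRUE or escaping the character avoids having it interpreted as a pattern.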
702 questions
13
votes
1 answer

R: strsplit on backslash (\)

I am trying to extract the part of the string before the first backslash but I can't seem to get it to work properly. I have tried multiple ways of getting it to work, based on the manual page for strsplit and after searching online. In my actual…
Bert Neef
  • 743
  • 1
  • 7
  • 14
13
votes
2 answers

Split text based on dot in R

I have: "word1.word2" and I want: "word1" "word2" I know I have to use strsplit with perl=TRUE, but I can't find the regular expression for a period (to feed to the split argument).
Antoine
  • 1,649
  • 4
  • 23
  • 50
13
votes
1 answer

strsplit inconsistent with gregexpr

A comment on my answer to this question which should give the desired result using strsplit does not, even though it seems to correctly match the first and last commas in a character vector. This can be proved using gregexpr and regmatches. So why…
Simon O'Hanlon
  • 58,647
  • 14
  • 142
  • 184
12
votes
3 answers

Splitting a string by space except when contained within quotes

I've been trying to split a space delimited string with double-quotes in R for some time but without success. An example of a string is as follows: rainfall snowfall "Channel storage" "Rivulet storage" It's important for us because these are column…
downtowater
  • 145
  • 1
  • 6
10
votes
4 answers

Dataframe from a character vector where variable name and its data were stored jointly

I've this situation: foo <- data.frame("vars" = c("animal: mouse | wks: 12 | site: cage | PI: 78", "animal: dog | wks: 32 | GI: 0.2", "animal: cat | wks: 8 | site: wild | PI: 13")) where…
Borexino
  • 802
  • 8
  • 26
10
votes
2 answers

Split a string by a plus sign (+) character

I have a string in a data frame as: "(1)+(2)" I want to split with delimiter "+" such that I get one element as (1) and the other as (2), hence preserving the parentheses. I used strsplit but it does not preserve the parentheses.
Vasista B
  • 329
  • 1
  • 2
  • 10
10
votes
1 answer

strsplit with vertical bar (pipe)

Here, > r<-c("AAandBB", "BBandCC") > strsplit(as.character(r),'and') [[1]] [1] "AA" "BB" [[2]] [1] "BB" "CC" Working well, but > r<-c("AA|andBB", "BB|andCC") > strsplit(as.character(r),'|and') [[1]] [1] "A" "A" "|" "" "B" "B" [[2]] [1] "B" "B"…
ramesh
  • 1,187
  • 7
  • 19
  • 42
10
votes
5 answers

strsplit by row and distribute results by column in data.frame

So I have the data.frame dat = data.frame(x = c('Sir Lancelot the Brave', 'King Arthur', 'The Black Knight', 'The Rabbit'), stringsAsFactors=F) > dat x 1 Sir Lancelot the Brave 2 King…
dmvianna
  • 15,088
  • 18
  • 77
  • 106
9
votes
3 answers

Split column by multiple delimiters, keeping delimiters

How can I split a character column into 3 columns using %, -, and + as the possible delimiters, keeping the delimiters in the new columns? Example Data: data <- data.table(x=c("92.1%+100-200","90.4%-1000+200", "92.8%-200+100",…
Neal Barsch
  • 2,810
  • 2
  • 13
  • 39
9
votes
3 answers

How to get empty last elements from strsplit() in R?

I need to process some data that are mostly csv. The problem is that R ignores the comma if it comes at the end of a line (e.g., the one that comes after 3 in the example below). > strsplit("1,2,3,", ",") [[1]] [1] "1" "2" "3" I'd like it to be…
ceiling cat
  • 5,501
  • 9
  • 38
  • 51
9
votes
4 answers

Extracting nth element from a nested list following strsplit - R

I've been trying to understand how to deal with the output of strsplit a bit better. I often have data such as this that I wish to split: mydata <- c("144/4/5", "154/2", "146/3/5", "142", "143/4", "DNB", "90") #[1] "144/4/5" "154/2" "146/3/5"…
jalapic
  • 13,792
  • 8
  • 57
  • 87
9
votes
7 answers

R - Using str_split and unlist to create two columns

I have a dataset that has dates and interest rates in the same column. I need to split these two numbers into two separate columns, however when I use the following code: Split <- str_split(df$Dates, "[ ]", n = 2) Dates <- unlist(Split)[1] Rates…
j riot
  • 544
  • 3
  • 6
  • 16
9
votes
1 answer

R: split text with multiple regex patterns and exceptions

I would like to split a vector of character elements (text) into sentences. There is more than one pattern of splitting criteria ("and/ERT", "/$"). There are also exceptions (:/$., and/ERT then, ./$. Smiley) to the patterns. The try: Match the cases…
alex
  • 1,103
  • 1
  • 14
  • 25
9
votes
3 answers

removing particular character in a column in r

I have a table called LOAN containing a column named RATE in which the observations are given as percentages, for example 14.49%. How can I format the table so that all values in RATE are edited and the % is removed from the entries so that I can use plot…
8
votes
4 answers

How to split a string on first number only

So I have a dataset with street addresses; they are formatted very differently. For example: d <- c("street1234", "Street 423", "Long Street 12-14", "Road 18A", "Road 12 - 15", "Road 1/2") From this I want to create two columns. 1. X: with the…
Jesse
  • 103
  • 1
  • 6