Questions tagged [strsplit]

strsplit is a function in R and MATLAB which splits the elements of a character vector around a given delimiter.

strsplit is a function in R and MATLAB that splits the elements of a character vector into substrings:

# R:
strsplit(x, split, fixed = FALSE)
% MATLAB:
strsplit(x, split);

strsplit splits a character string or vector of character strings at matches of a regular expression or a literal (fixed) string. It returns a list (R) or cell array (MATLAB); in R, each list element holds the pieces of the corresponding element of x.

  • x: a character string or vector of character strings to split.
  • split: the character string at which to split x.
    In R, if split is an empty string (""), then x is split between every character.
  • fixed [R only]: whether split should be treated as a fixed (i.e. literal) string. The default is FALSE, which means that split is treated as a regular expression.
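As a minimal sketch of the R behaviour described above (the input values are invented for illustration and are not taken from any question on this page), the following compares the default regular-expression split with fixed = TRUE and with an empty split string:

```r
# Illustrative input; these values are made up for this sketch.
x <- c("a.b.c", "d.e")

# Default (fixed = FALSE): split is a regular expression, so "." matches
# every character and only empty strings are left over.
regex_split <- strsplit(x, ".")

# fixed = TRUE: split is taken literally, so x is split at each dot.
literal_split <- strsplit(x, ".", fixed = TRUE)

# Empty split string: the input is split between every character.
char_split <- strsplit("abc", "")

# The result is a list with one element per element of x.
print(literal_split)
```

When the delimiter contains regular-expression metacharacters such as "." or "|", fixed = TRUE (or escaping the delimiter) is usually the safer choice.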
702 questions
0
votes
1 answer

lapply not iterating through list in R

I have a list of character strings where there are repeats in some of the strings. For example: [[1]] [1] "gr gal gr gal" [[2]] [1] "gr gal" [[3]] [1] "gr gal ir ol" [[4]] [1] "gr gal gr gal" [[5]] [1] "gr gal" My…
MeeraWhy
  • 93
  • 6
0
votes
1 answer

Count unique string in a character column

I have a column that has string observations and I need to count the unique words in that column. For example, I would like my final output to look like this- The words in the column are separated by blanks, so that is another challenge in my case.…
0
votes
2 answers

R remove string before delimiter

I have a column in a data frame and I would like to remove the part of the string before the 5th delimiter "." and the last "." for .txt and I don't know how to do…
Peter Chung
  • 1,010
  • 1
  • 13
  • 31
0
votes
3 answers

String splitting a dataframe with a vector as the pattern in R

I have a dataframe that consists of multiple rows, and I would like to split every row into two components based off of elements of a vector (essentially run strsplit with a vector as the 'pattern') in R. The dataframe (only one column) looks…
maria
  • 15
  • 3
0
votes
2 answers

Maximum exposure of customer - splitting reported balance by reported month and dynamically assigning it

I have a string of reported balances and reported months from a credit information bureau. I want to calculate the consumer's exposure by reported month. I have ~2 million records to process and I am looking for a solution in R. Input data: df <- data.frame("id" =…
0
votes
3 answers

How to split letters with bracket and numbers in R?

The string is s = '[12]B1[16]M5'. I want to split it into the following results with the strsplit function in R: let <- c('[12]B', '[16]M') num <- c(1, 5) Thanks a lot
Wang
  • 1,314
  • 14
  • 21
0
votes
0 answers

Split string according to ambiguous delimiter in R

I have pairs of strings included in a data frame: df <- data.frame(str = c("L_V1_ROI-L_MST_ROI", "L_V6_ROI-L_V2_ROI", "L_V3_ROI-L_V4_ROI", "L_V8_ROI-L_4_ROI", …
Andrej
  • 3,719
  • 11
  • 44
  • 73
0
votes
0 answers

Separating Categorical Data Variable into Multiple Variables

I have this type of data (name: pmt4651) and want to separate and keep only the answers as independent elements (maybe in individual columns so that I can create a function to count and analyze them): 2 …
SVogels
  • 3
  • 2
0
votes
3 answers

R Split delimited strings in a column and insert as new column (in binary)

I have a data frame as below:

+---+-----------+
|lot|Combination|
+---+-----------+
|A01|A,B,C,D,E,F|
|A01|A,B,C      |
|A02|B,C,D,E    |
|A03|A,B,D,F    |
|A04|A,C,D,E,F  |
+---+-----------+

Each letter is a character separated by a comma, I…
yc.koong
  • 175
  • 2
  • 10
0
votes
1 answer

Split text string in a data.table columns by location

I am trying to figure out how I can use the tstrsplit() function from data.table to split a text by location number. I am aware of the Q1, Q2 & Q3 but these do not address my question. As an example: DT2 <- data.table(a =…
Daniel
  • 1,202
  • 2
  • 16
  • 25
0
votes
2 answers

Split a list whose elements are multiple element lists

Say I have a list a which is defined as: a <- list("aaa;bbb", "aaa", "bbb", "aaa;ccc") I want to split this list by semicolon ;, get only unique values, and return another list. So far I have split the list using str_split(): a <- str_split(a, ";")…
Kyle Weise
  • 869
  • 1
  • 8
  • 29
0
votes
0 answers

Produce vector of TRUE grepl variables

I have this dataset: http://media.sdsoybean.org/images/Uploads/2016%20South%20Dakota%20Soybean%20Yield%20Contest%20Results.pdf I want to make a new vector called Method. This will consist of the values No-till, Non-irrigated or Irrigated. I can…
0
votes
1 answer

How do I index through a split string contained in a vector?

I've got a section of code, let's assume it's x <- c("10/05/1997 00:00:00", "11/05/1997 00:00:00", "12/05/1997 00:00:00") x <- strsplit(as.character(x), " ", fixed=TRUE)[1] The issue I'm running into is this: I want to take the first index of the…
0
votes
1 answer

How do I transform Corpus content to vector after newline "\n"

When I try to use strsplit on plain text, it has the desired property that the value stored is transformed from a string of characters to a vector with strings of characters. For example, txt = "The fox is Brown.\nThe Fox has a tail." strsplit(txt,…
Aaron
  • 379
  • 5
  • 14
0
votes
4 answers

Splitting numeric column data by given number of characters

I am trying to split one column into three columns so I can give it a date format. Currently the data set looks like this:

YYYYMMDD  Number
20020101  0.21
20020102  0.34
20020103  1.22

I want it to look like this: Year …
Fosulli
  • 13
  • 4