Questions tagged [strsplit]

strsplit is a function in R and MATLAB which splits the elements of a character vector around a given delimiter.

strsplit is a function in R and MATLAB that splits the elements of a character vector into substrings:

# R
strsplit(x, split, fixed = FALSE)
% MATLAB
strsplit(x, split)

strsplit splits a character string or vector of character strings using a regular expression or a literal (fixed) string. In R it returns a list in which each element holds the pieces of the corresponding element of x; in MATLAB it returns a cell array holding the pieces of the input string.

  • x: a character string or vector of character strings to split.
  • split: the delimiter at which to split x.
    In R, if split is an empty string (""), then x is split between every character.
  • fixed [R only]: whether the split argument should be treated as fixed (i.e. literally). The default is FALSE, which means that split is treated as a regular expression.
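
As a minimal R sketch of the arguments described above (the input vector and all variable names are made up for illustration, not taken from the documentation):

```r
# Hypothetical input; any character vector behaves the same way.
x <- c("a-b-c", "d.e")

# The result is a list with one element per element of x.
parts <- strsplit(x, "-")
first <- parts[[1]]    # the pieces of "a-b-c"
second <- parts[[2]]   # "d.e" contains no "-", so it comes back whole

# By default split is a regular expression, so "." matches every character
# and only empty strings are left over.
as_regex <- strsplit("d.e", ".")[[1]]

# fixed = TRUE treats split literally, so "." only matches an actual dot.
as_literal <- strsplit("d.e", ".", fixed = TRUE)[[1]]

# An empty split breaks the string between every character.
chars <- strsplit("abc", "")[[1]]
```

Characters with a special meaning in regular expressions, such as "." or "|", are a common source of surprises unless fixed = TRUE is set or the character is escaped.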
702 questions
0
votes
1 answer

String split: need to account for space before characters?

I am adapting some code for my own needs, and it has problems. I've been able to address most of the issues but am stuck on this current step. I've uploaded a pdf into R and have done a series of steps to manipulate the file for text mining. I'm now…
Tammy
  • 173
  • 9
0
votes
2 answers

Combine vector and list in R

I am splitting a column in a dataset by using strsplit and wish to map one column to the split data. Here is a sample dataset: https://drive.google.com/file/d/1jtrn6Htezz6iRhJN0HaxXowT5JZW52ai/view?usp=sharing My code is as…
Matt
  • 45
  • 4
0
votes
2 answers

Find runs of identical characters in a vector (not whitespace, not tabs) and leave only one such character in R

From a run of identical characters standing in a row (not whitespace, not tabs), how can I leave only one such character? Here is what I mean. Input data: 5 NO 58AA WOD~05293_NODC~58AA 005450 WOD~NO005450 6246.630096435547 418.6500072479248…
d-max
  • 167
  • 13
0
votes
1 answer

Regular expression to separate string containing upper and lower case

I can't get a regular expression task working; it would be great if someone could help. I need to separate gene names from descriptions that are attached to them. Using a term that appeared in 99% of cases involved separating it from "GeneCards…
Sebastian Hesse
  • 542
  • 4
  • 16
0
votes
2 answers

strsplit() output as a dataframe in r

I have some results from a model in Python which I have saved as a .txt file to render in RMarkdown. The .txt is this. precision recall f1-score support 0 0.71 0.83 0.77 1078 1 0.76 …
der_radler
  • 549
  • 1
  • 6
  • 17
0
votes
1 answer

Using regexp to find a reoccurring pattern in MATLAB

input = ' 12Z taj 20501 jfdjda OCNL jtjajd ptpa 23Z jfdakdkf tjajdfk OCNL fdkadja 02Z fdjafsdk fkdsafk OCNL fdkafk dksakj = ' using regexp regexp(input,'\s\d{2,4}Z\s.*(OCNL)','match') I'm trying to get the output [1,1] = 12Z taj 20501 jfdjda…
John
  • 157
  • 1
  • 10
0
votes
0 answers

Split string with no delimiter based on countrycode and countryname

I don't know if the question has been asked already, but I could not find the right answer. I have strings in a column with country codes, country names and a date, with no delimiters: tst <- c("NLNETHERLAND2018-01-19","IRQIRAQ1912-02-28") How could I…
C Visser
  • 1
  • 2
0
votes
2 answers

Split string in column and create new columns with the output (r)

How do I split the first column into 2 components (e.g., 01 & run1) and create 2 other columns to store that information? P = c('01_run1', '01_run2', '02_run1', '02_run2') Score = c(1, 2, 3, 4) df = data.frame(P, Score) P Score 1 01_run1 …
TYL
  • 1,577
  • 20
  • 33
0
votes
1 answer

Matlab, Comma-separated string to Cell when it has blank e.g. 1,2,3,[blank],[blank]

In Matlab, 1. strsplit('a,b,c,', ',') 2. strsplit('a,b,c,,,', ',') the results of 1 and 2 are the same, {{'a'}, {'b'}, {'c'}, {0×0 char}} However I want to get {{'a'}, {'b'}, {'c'}, {0×0 char}, {0×0 char}, {0×0 char}} from a…
J. Kim
  • 95
  • 7
0
votes
1 answer

simple character splitting is baffling me

The split shown below is driving me crazy... I need some help to spot where the problem is > p5<-Data$poorcoverageusers[5] > p5 [1] "405874050693761|405874004853834|405874056470063|405874055308702" > strsplit(p5,"|") [[1]] [1] "4" "0" "5" "8" "7" "4"…
rajibc
  • 93
  • 10
0
votes
1 answer

How to split strings at the end of last letter?

Could you please show me how to split strings at the end of the last letter? I have some data read from xlsx. Please take a look at the picture. The strings consist of an English word and some other characters (could be numbers or ?, !, etc.). There is…
shy zhan
  • 35
  • 6
0
votes
0 answers

Use of lapply after splitting a string in R

I wanted to split a column in R into two new columns. I came across the following code on the internet which worked quite well: as.numeric(lapply(strsplit(data$Coord, split=","), "[", 1)) My question is, how exactly does this work, in particular…
0
votes
1 answer

R: Combine strsplit and rbind in dataframe

Although I see several similar issues on Stack Overflow regarding this problem, I cannot get my syntax to work. I want to split comma-separated values into new columns in my dataframe. When I use the following syntax the resulting dataframe does not…
Joep_S
  • 481
  • 4
  • 22
0
votes
3 answers

split string and concatenating to remove a portion of string

I am trying to remove a portion of a string. The best I can come up with is to strsplit and then concatenate (maybe there is an easier way). list<-as.character(c("joe_joe_ID1000", "bob_bob_ID20000")) list<-strsplit(list, "_") I would like my output…
0
votes
2 answers

R: strsplit based on two conditions, keeping delimiter

I am trying to split sentences based on different criteria. I am looking to split some sentences after "traction" and some after "ramasse". I looked up the grammar rules for grepl but didn't really understand. A data frame called export has a column…
Makoto Miyazaki
  • 1,743
  • 2
  • 23
  • 39