Questions tagged [strsplit]

strsplit is a function in R and MATLAB which splits the elements of a character vector around a given delimiter.

strsplit is a function in R (documentation) and MATLAB (documentation), which splits the elements of a character vector into substrings:

# R
strsplit(x, split, fixed = FALSE)

% MATLAB
strsplit(x, split)

strsplit splits a character string or vector of character strings at each match of a regular expression or of a literal (fixed) string. It returns a list (R) or a cell array (MATLAB). In R, the list has one element per element of x, and each element holds the substrings obtained from the corresponding input string.

  • x: a character string or vector of character strings to split.
  • split: the character string (delimiter) at which x is split.
    In R, if split is an empty string (""), then x is split between every character.
  • fixed [R only]: whether split should be treated as a fixed (literal) string. The default is FALSE, which means that split is interpreted as a regular expression.
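
As a minimal sketch of how the R arguments above interact (the input vector and the variable names are made up for illustration), the following compares a literal delimiter, the default regular-expression behaviour, and an empty split:

```r
# Two example input strings (hypothetical data); each is split independently
x <- c("a.b.c", "d.e")

# fixed = TRUE: "." is taken literally, so each string is split at periods
parts_fixed <- strsplit(x, ".", fixed = TRUE)
# parts_fixed[[1]] is c("a", "b", "c"); parts_fixed[[2]] is c("d", "e")

# Default fixed = FALSE: "." is a regular expression that matches any
# character, so every character acts as a delimiter and only empty
# strings are left
parts_regex <- strsplit(x, ".")

# Escaping the period restores the literal meaning in regex mode
parts_escaped <- strsplit(x, "\\.")

# An empty split breaks the string into single characters
chars <- strsplit("abc", "")[[1]]
# chars is c("a", "b", "c")
```

Because the result is a list with one element per input string, the pieces of a single string are usually extracted with [[1]], as in the last line.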
702 questions
2
votes
4 answers

Remove from vector elements containing a number in R

I have a few files that are named after rural properties like the following: v1 <- c("Badger", "Hill", "Farm", "1.json") v2 <- c("Buffalo", "Pass", "Farm", "2.json") > v1 [1] "Badger" "Hill" "Farm" "1.json" > v2 [1] "Buffalo" "Pass" "Farm" …
thiagoveloso
2
votes
1 answer

Split a string of names and transpose

I have a list of names (famous directors) that is in format of First, (possible middle), and Last Name which I need to rearrange to have Last Name, First (possible middle). I can't just split all of these by the first space, or even second space…
data_life
2
votes
2 answers

strsplit returning nested list with backslashes and quotes added \"

I'm using R to split a messy string of gene names and as a first step am simply attempting to break the string into a list by spaces between characters using strsplit and regex but have been coming across this weird bug: string <- ' " "KPNA2" …
2
votes
1 answer

Split string with delimiter except when in parentheses, and keep the delimiter

I'd like to split an arbitrary string such as x <- "(((K05708+K05709+K05710+K00529) K05711),K05712),K05713 K05714 K02554" # [1] "(((K05708+K05709+K05710+K00529) K05711),K05712),K05713 K05714 K02554" at delimiter(s) (here a space and a comma) except…
yuppe
2
votes
4 answers

string split values in two columns, and then concatenate them into a new column

I am trying to call the str_split function for both columns (Proteins and Positions.within.proteins), and then concatenate the corresponding values in a new column called ID. df <- data.frame(Proteins = c("Q99755;A2A3N6", "O00329", "O00444",…
2
votes
4 answers

R:: reverse a string from "x,y" to "y,x" in r

I have a table like this aa<-tribble( ~"a",~"b",~"c",~"d", " 78.1445111, 9.9365072", "78.1444646, 9.9365044", " 78.1445111, 9.9365072", "78.1444646, 9.9365044", "78.1444197, 9.9365166", "78.1443816, 9.9365422", "78.142359, 9.9365748",…
Betel
2
votes
2 answers

Formatting a txt file

I have a TXT file whose contents look like: 123451234512345 123451234512345 I want to format the file with PHP in this format: 12345-12345-12345 12345-12345-12345 This is what I have tried: $codes = "codes.txt"; $unformatedCode = file($codes,…
OUD
2
votes
4 answers

R Changing Elements in a Dataframe

I'm trying to, as the title says, change elements from my dataframe from one character to another. The dataframe is as follows: g1=c("CC","DD","GG") g2=c("AA","BB","EE") g3=c("HH","II","JJ") df=data.frame(g1,g2,g3) I wish to convert the elements…
Amomynous
2
votes
2 answers

R - Merge unique values from two lists using stringr::str_split

I have a function, that when given a list of strings, should return a vector of all unique strings of N size. get_unique <- function (input_list, size = 3) { output = c() for (input in input_list) { current = stringr::str_replace(input,…
Vaune_
2
votes
2 answers

Splitting values in different columns in R

One of the columns in my dataset contains values like utm_source=google&utm_medium=cpc&utm_campaign=1234567&utm_term=brand%20&utm_content=Brand&gclid=ERtyuiipotf_YTj How should I split this into different columns with its values in R? utm_source…
anonymus
2
votes
4 answers

Split a string, tokenize substrings, and convert tokens to numeric vectors

I have a character string: String <- "268.1,271.1,280.9,294.7,285.6,288.6,384.4\n124.8,124.2,116.2,117.7,118.3,122.0,168.3\n18,18,18,18,18,18,18" I would like to split it into three substrings based on \n. I did that using the following…
Mo.ms
2
votes
1 answer

rsplit() is not working to split columns using regex

Original df import pandas as pd df = pd.DataFrame({ 'Ref':['CU12','SE00', 'RLA1234', 'RLA456', 'LU00', 'RLA1234MA12','RLA1234MA13', 'CU00','LU00'] } ) Ref 0 CU12 1 SE00 2 RLA1234 3 12345 4 RLA456 5 LU00 6 RLA1234MA12 7 …
Lavina Khushlani
2
votes
2 answers

How can I give sequential names to items in a list?

I have a character string that I have split into a list of smaller strings using strsplit. For example: > full.seq <- "FZpcgK3VdAQzEFZpcAVdV8QM8ZpsEFZpacgGKi3VdVSQzEFZpcgGKAVdVRpEFKGIZpg13" > full.seq [1]…
2
votes
2 answers

I'm getting NA's applying separate() function over column of characters in R

I'm trying to split a column whose values are formatted very differently. For example: pharma <- c("DOXORUBICINA CLORH. FAM 50MG POL O LIOF", "DROSPIRENONA/ETINILESTR. 3/0,02MG CM REC", "DROSPIRENONA/ETINILESTR.…
2
votes
4 answers

how to split a string column in R based on equal length and get them in different rows

library(tidyr) library(dplyr) mydf V1 V2 2 1 abcdef 3 2 abcd 4 3 bghj 5 4 kl 6 5 uilm I want a data frame in which the V2 column is split into pieces of length 2, each on a separate row V1 V2 1 1 ab 2 1 cd 3 …