
I have the following string in a list:

"[4]  {PIVOTAL GEMFIRE}                             => {HIBERNATE}                    0.005952381 1.0000000  168.000000 2    \r"

and I need to get it as:

 4,{PIVOTAL GEMFIRE},{HIBERNATE},0.005952381,1.0000000,168.000000,2

and I need to place it row-wise in a data frame, one row per string.

zx8754
Ellanti Kishore
  • Hello and welcome! Please explain what you did and what you got, so we can understand your problem and try to help you fix it. We are not a free development center :P (That said, I would try a regular expression like `([^\s]+)\s*` to get the output (something like a list in some languages) and then add commas to separate the values.) – NatNgs Feb 13 '18 at 13:39
  • When I tried a regular expression (splitting on spaces), {PIVOTAL and GEMFIRE} came out as two separate values, but I want them as one value: {PIVOTAL GEMFIRE}. – Ellanti Kishore Feb 13 '18 at 13:53
  • Rather than explaining what you did in words, add the exact code to the question. – G. Grothendieck Feb 13 '18 at 14:35

2 Answers

3

You can do the following (with string holding your input):

gsub("((\\S+)|(\\{[^{}]+\\}))\\s", "\\1,", trimws(gsub("[^[:alnum:].{}]+", " ", string)))

Explanation:

  • gsub("[^[:alnum:].{}]+", " ", string): replace each run of one or more characters that are not alphanumeric, a dot, or a curly bracket with a single space
  • trimws(...): remove the leading and trailing spaces from the modified string you just got
  • gsub("((\\S+)|(\\{[^{}]+\\}))\\s", "\\1,", ...): in the result so far, capture each token that precedes a whitespace character (either a run of non-space characters or a group enclosed in curly brackets) and replace the whitespace after it with a comma.

Then you can read the resulting vector with read.table, using sep=",", to get a data.frame.
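
The steps above can also be run one at a time to inspect the intermediate strings. This is only a sketch that unrolls the same calls; the names step1, step2, step3 and df are made up for illustration.

```r
# Sample input copied from the question
string <- "[4]  {PIVOTAL GEMFIRE}                             => {HIBERNATE}                    0.005952381 1.0000000  168.000000 2    \r"

# Step 1: collapse every run of characters other than letters, digits,
# ".", "{" and "}" into a single space (this removes "[", "]", "=>" and "\r")
step1 <- gsub("[^[:alnum:].{}]+", " ", string)

# Step 2: drop the leading and trailing space left over from step 1
step2 <- trimws(step1)

# Step 3: keep each token (captured as \\1) and replace the whitespace
# character after it with a comma
step3 <- gsub("((\\S+)|(\\{[^{}]+\\}))\\s", "\\1,", step2)

# Step 4: parse the comma-separated line into a one-row data.frame
df <- read.table(text = step3, sep = ",")

print(step2)
print(step3)
```

Note that the pattern in step 3 is used with R's default regex engine, as in the answer; it has not been checked here with perl = TRUE.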

Test:

string <- "[4]  {PIVOTAL GEMFIRE}                             => {HIBERNATE}                    0.005952381 1.0000000  168.000000 2    \r"

read.table(text=gsub("((\\S+)|(\\{[^{}]+\\}))\\s", "\\1,", trimws(gsub("[^[:alnum:].{}]+", " ", string))), sep=",")

#  V1                V2          V3          V4 V5  V6 V7
#1  4 {PIVOTAL GEMFIRE} {HIBERNATE} 0.005952381  1 168  2
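
To place several such lines row-wise, the same call can be given a character vector, because read.table(text = ...) treats each element as one line. A minimal sketch, assuming the list holds one string per line; the helper name to_csv_line and the column names are hypothetical guesses and are not stated in the question.

```r
# Hypothetical helper wrapping the two gsub() calls from the answer
to_csv_line <- function(s) {
  cleaned <- trimws(gsub("[^[:alnum:].{}]+", " ", s))
  gsub("((\\S+)|(\\{[^{}]+\\}))\\s", "\\1,", cleaned)
}

# Two copies of the sample line stand in for a longer list
string <- "[4]  {PIVOTAL GEMFIRE}                             => {HIBERNATE}                    0.005952381 1.0000000  168.000000 2    \r"
rules <- list(string, string)

# unlist() turns the list into a character vector; each element becomes one row
df <- read.table(
  text = to_csv_line(unlist(rules)),
  sep = ",",
  stringsAsFactors = FALSE,
  # Assumed column names, guessed from the shape of the data
  col.names = c("id", "lhs", "rhs", "support", "confidence", "lift", "count")
)

print(df)
```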
Cath
  • What is the purpose of "\\1," in gsub? – Ellanti Kishore Feb 14 '18 at 03:38
  • @EllantiKishore `\\1` inserts what was captured (between the parentheses in the regex) into the result. So you capture what comes before the space you want to replace, and you put it back in the result, followed by a comma. – Cath Feb 14 '18 at 08:13
1

Replace each =>, [ and ] with the empty string, giving s1, and then replace any digit or } that is followed by a space with that same character followed by a comma, giving s2. Then read it in using commas as the separator. If a comma can appear in the content, use a different separator character.

s1 <- gsub("=>|[][]", "", DF$x)
s2 <- gsub("([0-9}]) ", "\\1,", s1)
read.table(text = s2, as.is = TRUE, strip.white = TRUE, sep = ",")[-8]

giving:

  V1                V2          V3          V4 V5  V6 V7
1  4 {PIVOTAL GEMFIRE} {HIBERNATE} 0.005952381  1 168  2
2  4 {PIVOTAL GEMFIRE} {HIBERNATE} 0.005952381  1 168  2

Note

Test data used:

x <- "[4]  {PIVOTAL GEMFIRE}                             => {HIBERNATE}                    0.005952381 1.0000000  168.000000 2    \r"
DF <- data.frame(x = c(x, x), stringsAsFactors = FALSE)
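
The [-8] in the read.table call may look arbitrary. Splitting s2 by hand suggests where the extra column comes from: the final "2" is also followed by a space, so it gets a comma too and leaves an empty trailing field. A small sketch on one test string; the name fields is made up for illustration.

```r
x <- "[4]  {PIVOTAL GEMFIRE}                             => {HIBERNATE}                    0.005952381 1.0000000  168.000000 2    \r"

# Same two substitutions as in the answer, applied to a single string
s1 <- gsub("=>|[][]", "", x)
s2 <- gsub("([0-9}]) ", "\\1,", s1)

# Split on the commas and strip surrounding whitespace, roughly mimicking
# what read.table does with sep = "," and strip.white = TRUE
fields <- trimws(strsplit(s2, ",", fixed = TRUE)[[1]])

print(length(fields))  # 8: seven real values plus one empty trailing field
print(fields)
```

The same rule would also put a comma after any digit followed by a space inside the braces, so item names containing such a sequence would need a different pattern.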

EDIT: Added missing }.

G. Grothendieck