R - split file from gnuplot into 2 parts

Question

I am trying to modify data which were originally used for gnuplot. Data looks like this:

1 2 3 4 5
6 7 8 9 10



1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11
1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11

1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11
1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11

1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11
1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11

I want to read only the second part which contains 11 columns. Gnuplot does it in a smart way by indicating ind 0 or ind 1. Any idea how to split the file in R?

if you know how many lines to skip, use the `skip` argument of `read.table`. Otherwise, you could use `readLines` and filter the results that have length 11. — baptiste, Aug 29 '14 at 15:20
Actually for different files the number of lines to skip differs. — Wiolos, Aug 29 '14 at 15:29
@Wiolos, then you can put those differing `skip` numbers into a list and `lapply(skips, ...)` with `read.table`. — Rich Scriven, Aug 29 '14 at 17:32

score 0 · Answer 1 · answered Aug 29 '14 at 15:45

scan() is one option

txt = "1 2 3 4 5
6 7 8 9 10



1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11
1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11

1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11
1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11

1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11
1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11"

tc = textConnection(txt)
d = scan(tc, blank.lines.skip = FALSE)
close(tc)
nas = which(is.na(d)) # delimiters
d = d[-unique(c(seq.int(nas[1]), nas))] # remove NAs and first items
matrix(d, ncol=11, byrow=TRUE)

score 0 · Answer 2 · answered Aug 29 '14 at 16:21

You could also try:

 lines <- readLines(n=25)
 1 2 3 4 5
 6 7 8 9 10



 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11
 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11

 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11
 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11

 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11
 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 1.10 1.11
 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 2.10 2.11
 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 3.10 3.11

 library(stringr)
 d1 <- read.table(text=lines[str_count(lines, "\\s+")==10], sep="", header=F)
 dim(d1)
#[1] 18 11

R - split file from gnuplot into 2 parts

2 Answers2