
I have multiple text files stored in different folders. Every .txt file has the same structure: it starts with a few lines of information, and from about the 5th line onward there is a */ marker, after which the data starts. The file looks like this:

*/
Date  AirTemp  Pres  Wind ....
2021-03-01 27  1017  10....
2021-03-02..... 

I want to read only the data in these .txt files, not the information stored before */. To do this, I am using the following R code, which is meant to treat */ as the last line before the actual data starts.

setwd("D:/Test")
library(stringr)
con <- file("D:/Test/Data1.txt", "r")
lines <- c()
while(TRUE) {
 line = readLines(con, 1)
 if(length(line) == 0) break
 else if(grepl("^\\s*/{1}", line)) lines <- c(lines, line)
}

When I run the code, it returns nothing and 'lines' is empty. Could anyone help me with this?

    You can use the `skip` argument in `read.table` to ignore some lines. Try `read.table("D:/Test/Data1.txt", skip = 5)` (or `skip = 6`, depending on the actual number of lines prior to the column header) – benson23 May 03 '23 at 13:54

1 Answer

If for some reason the */ marker isn't always on the fifth line, you could do something like this:

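library(stringr)  # str_remove() and regex(); also re-exports the %>% pipe

s <- 'hi hello
hi
*/
Date     AirTemp  Pres  Wind
2021-03-01 27     1017  10
2021-03-02 27     1999  21'

# Drop everything up to and including the */ marker, then parse the rest
df <- s %>% str_remove(regex('.*\\*\\/', dotall = TRUE)) %>%
  read.table(text = ., header = TRUE)

df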


#         Date AirTemp Pres Wind
# 1 2021-03-01      27 1017   10
# 2 2021-03-02      27 1999   21
Juan C