
I have a folder with many .txt files from which I have to extract specific data. The problem is that the file format changed at one point, and the position of the data I need changed with it. So I need to deal with files in different formats.

To make it clearer: column 4 holds the name of the variable and column 5 holds its value, but the row they appear in varies between files. Is there a way to find the row that contains the variable name and then extract its value?

Thanks in advance

EDIT

In some files I will have the data like this:

Column 1-------Column 2.

Device ID------A.

Voltage------- 500.

Current--------28

But at some point the software was changed to add another variable, and the new files look like this:

Column 1-------Column 2.

Device ID------A.

Voltage------- 500.

Error------------5.

Current--------28

So I need to deal with these two types of file, extracting the same variables even though they are in different rows.

Gabriela
  • Please give some example data. Maybe this helps: https://stackoverflow.com/questions/4220631/how-do-i-grep-in-r – AStieb Mar 14 '18 at 14:45

1 Answer

If these files can't be read with read.table, use readLines and then find the lines that start with the keyword you need.

For example:

Sample file 1 (with the dashes included and extra line breaks):

Column 1-------Column 2.

Device ID------A.

Voltage------- 500.

Error------------5.

Current--------28

Sample file 2 (with a comma as separator):

Column 1,Column 2.
Device ID,A.
Current,555
Voltage, 500.
Error,5.

For both cases do:

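# Read the whole file as a character vector, one element per line
text <- readLines("your filename here")
# Keep only the lines that start with "Current" (case-insensitive)
curr <- text[grepl("^Current", text, ignore.case = TRUE)]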

Which returns:

for file 1:

[1] "Current--------28"

for file 2:

[1] "Current,555"

Then use gsub to remove everything that is not a digit, and convert the result with as.numeric.
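As a rough sketch of that last step, the lookup and the cleanup could be wrapped in one small function. The name extract_value and the vectors file1 and file2 are hypothetical, made up for illustration; the vectors stand in for what readLines would return for the two sample files (without the blank lines). The sketch assumes the values are non-negative numbers, since a dash is used as a separator in the first format, and that the variable name contains no regex metacharacters.

```r
# Hypothetical helper (not part of the original answer): return the numeric
# value found on the first line that starts with `name`, or NA if none matches.
extract_value <- function(lines, name) {
  hit <- lines[grepl(paste0("^", name), lines, ignore.case = TRUE)]
  if (length(hit) == 0) return(NA_real_)
  # Drop everything that is not a digit or a dot ...
  digits <- gsub("[^0-9.]", "", hit[1])
  # ... then strip a trailing dot such as the one in "500." and convert
  as.numeric(sub("\\.$", "", digits))
}

# Stand-ins for readLines() output on the two sample files
file1 <- c("Column 1-------Column 2.", "Device ID------A.",
           "Voltage------- 500.", "Error------------5.", "Current--------28")
file2 <- c("Column 1,Column 2.", "Device ID,A.", "Current,555",
           "Voltage, 500.", "Error,5.")

extract_value(file1, "Current")  # 28
extract_value(file2, "Current")  # 555
extract_value(file2, "Voltage")  # 500
```

Because the match is on the start of the line rather than on a row number, the same call should work whether or not the extra Error row is present. For a whole folder, something like sapply over list.files, calling readLines on each path, would apply it to every file.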

R. Schifini