How do we read a specific file line by line while skipping some columns in it?

For example, I have a text file with data arranged in 5 columns, but I need to read only two of them. They could be the first two or any other combination, so I need a solution that works with any pair of columns (for example, the first and third only).

My code looks something like this:

        open(1, file=data_file)
        read (1,*) ! to skip first line, with metadata
        lmax = 0
        do while (.true.)
                ! read column 1 and 3 here, either write
                ! that to an array or just loop through each row
        end do
99      continue        
        close (1)

Any explanation or example would help a lot.

francescalus
Indigo

2 Answers

2

High Performance Mark's answer gives the essentials of simple selective column reading: one still reads the column but transfers it to a then-ignored variable.

To extend that answer, suppose we want to read the second and fourth columns of a five-column line:

read(*,*) junk, x, junk, y

The first value is transferred into junk, then the second into x, then the third (replacing the one just acquired a moment ago) into junk and finally the fourth into y. The fifth is ignored because we've run out of input items and the transfer statement terminates (and the next read in a loop will go to the next record).

Of course, this is fine when we know it's those columns we want. Let's generalize to when we don't know in advance:

integer col1, col2   ! The columns we require, defined somehow (assume col1<col2)
<type>, dimension(nrows) :: x, y, junk(3)  ! For the number of rows
integer i

do i=1,nrows
  read(*,*) junk(:col1-1), x(i), junk(:col2-col1-1), y(i)
end do

Here, we transfer a number of values (which may be zero) up to just before the first column of interest, then the value of interest. After that, more to-be-ignored values (possibly zero), then the final value of interest. The rest of the row is skipped.
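As a minimal sketch of the generalized read above, here it is wrapped in a routine. The module and subroutine names (two_column_reader, read_two_columns) are made up for illustration, the data are assumed to be default real, and the unit is assumed to be already open and positioned at the first data row, with col1 < col2:

```fortran
module two_column_reader
  implicit none
contains
  ! Read columns col1 and col2 (with col1 < col2) from each of the next
  ! nrows records of an already-open unit, using list-directed input.
  subroutine read_two_columns(unit, col1, col2, nrows, x, y)
    integer, intent(in) :: unit, col1, col2, nrows
    real, intent(out) :: x(nrows), y(nrows)
    real :: junk(col2)   ! scratch space for skipped values (automatic array)
    integer :: i

    do i = 1, nrows
      ! junk(:0) is a zero-sized section, so nothing is skipped when
      ! col1 == 1 or when the two columns are adjacent.
      read(unit, *) junk(:col1-1), x(i), junk(:col2-col1-1), y(i)
    end do
  end subroutine read_two_columns
end module two_column_reader
```

Sizing junk from col2 means the scratch array is always large enough, since neither skipped run can be longer than col2-1 values.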

This is still very basic and ignores many complications that real requirements could bring. In fact, the approach is basic enough that one may as well just read the whole row and pick out the wanted values:

do i=1,nrows
  read(*,*) allofthem(:5)
  x(i) = allofthem(col1)
  y(i) = allofthem(col2)
end do

(where that variable is a row-by-row temporary) but variety and options are good.
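The whole-row approach extends naturally to any number of wanted columns, and to a column count that is only known at run time. A rough sketch follows; the names (column_picker, read_columns, values) are hypothetical, the data are assumed to be default real, and every record is assumed to hold at least ncols values:

```fortran
module column_picker
  implicit none
contains
  ! Read nrows records of ncols values each from an already-open unit and
  ! keep only the columns listed in cols (any order, any count).
  subroutine read_columns(unit, ncols, cols, nrows, values)
    integer, intent(in) :: unit, ncols, nrows
    integer, intent(in) :: cols(:)
    real, intent(out) :: values(nrows, size(cols))
    real :: row(ncols)   ! row-by-row temporary holding every column
    integer :: i

    do i = 1, nrows
      read(unit, *) row
      values(i, :) = row(cols)   ! vector subscript selects the wanted columns
    end do
  end subroutine read_columns
end module column_picker
```

For a 72-column file, a call such as call read_columns(u, 72, [3, 17, 40], nrows, values) would keep three columns without naming the other 69. One caveat: if a line has fewer than ncols values, list-directed input carries on reading from the next record, so short lines are not detected.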

francescalus

This is very easy. You simply read 5 variables from each line and ignore the ones you have no further use for. Something like

do i = 1, 100
    read(*,*) a(i), b, c(i), d, e
end do

This will overwrite the values in b, d, and e at every iteration.

Incidentally, your line

99 continue

is redundant; it's not used as the closing line for the do loop and you're not branching to it from anywhere else. If you are branching to it from unseen code you could just attach the label 99 to the next line and delete the continue statement. Generally, continue is redundant in modern Fortran; specifically it seems redundant in your code.
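For what it's worth, the loop in the question can be written with no label at all by testing iostat. This is only a sketch under some assumptions: the names (read_until_eof, read_cols_1_and_3) are invented for illustration, the values are default real, the first line is a header, and the caller supplies arrays large enough for the data:

```fortran
module read_until_eof
  implicit none
contains
  ! Read columns 1 and 3 from every data line of a file whose first line
  ! is a header, stopping at end of file or when the arrays are full.
  ! n returns the number of rows actually stored.
  subroutine read_cols_1_and_3(filename, a, c, n)
    character(len=*), intent(in) :: filename
    real, intent(out) :: a(:), c(:)
    integer, intent(out) :: n
    integer :: u, ios
    real :: b, va, vc

    n = 0
    open(newunit=u, file=filename, status='old', action='read')
    read(u, *, iostat=ios)            ! skip the header line
    do while (ios == 0 .and. n < min(size(a), size(c)))
      read(u, *, iostat=ios) va, b, vc   ! b is read and then ignored
      if (ios /= 0) exit                 ! end of file (or a read error)
      n = n + 1
      a(n) = va
      c(n) = vc
    end do
    close(u)
  end subroutine read_cols_1_and_3
end module read_until_eof
```

Reading into the scalars va and vc first means a failed read at end of file never touches the output arrays.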

High Performance Mark
  • I don't see how one can modify this to make it work when the number of columns is not known in advance. – Peaceful Sep 20 '16 at 18:26
  • Is there a neater solution when you have a file with many columns? For example, I am currently looking at a file with 72 columns, and a solution like this one is quite cumbersome when I am only interested in the data contained in a few of the columns. – Mead Jan 04 '19 at 19:35