
I have a large text file I want to process in Clojure.
I need to process it 2 lines at a time.

I settled on using a for loop so I could pull 2 lines for each pass with the following binding (rdr is my reader):

[[line-a line-b] (partition 2 (line-seq rdr))]

(I would be interested in knowing other ways to get 2 lines for each loop iteration but that is not the point of my question).

When trying to get the loop to work (using a simpler binding for these tests), I am seeing the following behavior that I can't explain:

Why does

(with-open [rdr (reader "path/to/file")]
  (for [line (line-seq rdr)]
    line))

trigger a Stream closed exception

while

(with-open [rdr (reader "path/to/file")]  
    (doseq [line (line-seq rdr)]  
        (println line)))

works?

ppbitb

1 Answer


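for is lazy: it returns a sequence that only reads from the file as it is consumed. By the time your REPL prints the result of the for, with-open has already closed the file, hence the Stream closed exception. doseq, by contrast, is eager and does all of its work inside with-open, which is why the second version works. You can fix the first version by wrapping the for in a doall: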

(with-open [rdr (reader "path/to/file")]
  (doall (for [line (line-seq rdr)]
           line)))

Note that doall realizes the whole sequence, so every line is held in memory at once.
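Applied to the two-lines-at-a-time binding from the question, the same idea can be sketched as follows. This is only an illustration: sample-reader is a hypothetical in-memory stand-in for (reader "path/to/file"), and the function names are made up for the example.

```clojure
(import '(java.io BufferedReader StringReader))

;; Hypothetical stand-in for (reader "path/to/file"): five lines held in memory.
(defn sample-reader []
  (BufferedReader. (StringReader. "a\nb\nc\nd\ne")))

;; Lazy: the seq built by `for` is returned unrealized, so the pairs are
;; only read after with-open has already closed rdr.
(defn lazy-pairs []
  (with-open [rdr (sample-reader)]
    (for [[line-a line-b] (partition 2 (line-seq rdr))]
      [line-a line-b])))

;; Eager: doall forces every pair while rdr is still open.
(defn eager-pairs []
  (with-open [rdr (sample-reader)]
    (doall
      (for [[line-a line-b] (partition 2 (line-seq rdr))]
        [line-a line-b]))))

;; Side effects only: doseq is eager and does not build a result sequence.
;; partition-all keeps a trailing unpaired line (line-b is then nil).
(defn print-pairs []
  (with-open [rdr (sample-reader)]
    (doseq [[line-a line-b] (partition-all 2 (line-seq rdr))]
      (println line-a line-b))))
```

Calling (doall (lazy-pairs)) fails because the reader is already closed, while (eager-pairs) returns the pairs. Note that partition 2 silently drops a final unpaired line; partition-all 2 is the variant to use if the file may have an odd number of lines.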

Here is a function from my misc.clj that reads a file lazily and closes it once the end is reached:

(defn byte-seq
  "Create a lazy seq of bytes in a file and close the file at the end."
  [rdr]
  (let [result (.read rdr)]        ; -1 signals end of stream
    (if (= result -1)
      (do (.close rdr) nil)        ; close only once the stream is exhausted
      (lazy-seq (cons result (byte-seq rdr))))))
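One caveat with byte-seq is that the reader is closed only if the sequence is consumed all the way to the end and no read throws. A possible alternative, sketched here under the assumption that the processing can be expressed as a function of the line sequence, is to keep the reader scoped to a callback. process-lines is a hypothetical helper, not something from clojure.java.io.

```clojure
(require '[clojure.java.io :as io])

;; Hypothetical helper: open `source`, hand the lazy line seq to `f`,
;; and let with-open close the reader when `f` returns or throws.
;; `f` must realize everything it needs before returning; a lazy seq
;; over the lines that escapes from `f` would hit the closed reader.
(defn process-lines [source f]
  (with-open [rdr (io/reader source)]
    (f (line-seq rdr))))

;; Usage with an in-memory reader standing in for a file path.
(defn count-sample-lines []
  (process-lines (java.io.StringReader. "a\nb\nc") count))
```

Because the reader never outlives the call, it is closed even when f stops early or an exception is thrown partway through the file.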
Arthur Ulfeldt
  • `(byte-seq)` looks nice. Have you considered exceptions from `(.rdr read)`? Could you get, say, an encoding exception and leak the open stream? – sw1nn Aug 17 '12 at 09:09
  • yes, that's completely correct. It also leaks the handle if you don't read the whole sequence. – Arthur Ulfeldt Aug 17 '12 at 18:15