I currently use the following method of reading a file line by line:

(for ([line (in-lines)]) ...)

However, right now my code is too slow. Is there a "faster" way to read the input line by line?

GAD3R
Wallace
    Hi there. Can you post some info on how "slow" it is and what you expected it to be? Are you just loading the file, or also printing something? Printing could be the culprit. You could also buffer differently instead of reading line by line: for example, read the whole file into memory if it fits and then split it, or read in fixed-size blocks and split those. Check out [this answer for ideas](https://stackoverflow.com/questions/42028610/how-to-load-a-huge-file-into-a-string-or-list-in-racket) – ramrunner Feb 01 '19 at 19:35
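
The whole-file approach suggested in this comment can be sketched as follows. This is a minimal sketch, assuming the file fits in memory; the temporary file and the names `path`, `lines-a`, and `lines-b` are made up for illustration. `file->lines`, `file->string`, and `display-to-file` come from `racket/file`, which `#lang racket` already provides.

```racket
#lang racket

;; Hypothetical sample file, created only so the sketch runs on its own.
(define path (make-temporary-file))
(display-to-file "first\nsecond\nthird\n" path #:exists 'truncate)

;; Option 1: let the library read and split in one call.
(define lines-a (file->lines path))

;; Option 2: read the whole file as one string, then split on newlines.
(define lines-b (string-split (file->string path) "\n"))

;; Clean up the sample file.
(delete-file path)
```

Note that `string-split` trims leading and trailing separators by default, so the two options can differ on files that begin or end with blank lines.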

1 Answer

Like ramrunner, I suspect that the problem is somewhere else. Here's a short program I wrote that generates a 10-megabyte text file and then reads it back in using `in-lines`.

#lang racket

(define chars (list->vector (string->list "abcde ")))
(define charslen (vector-length chars))

(define (random-line)
  (list->string
   (for/list ([i (in-range 80)])
     (vector-ref chars (random charslen)))))

(define linecount (ceiling (/ (* 10 (expt 10 6)) 80)))

(time
 (call-with-output-file "/tmp/sample.txt"
   (λ (port)
     (for ([i (in-range linecount)])
       ;; displayln (unlike write) adds a newline and no quotes around each line
       (displayln (random-line) port)))
   #:exists 'truncate))

;; cpu time: 2512 real time: 2641 gc time: 357

;; okay, let's time reading it back in:

(time
 (call-with-input-file "/tmp/sample.txt"
   (λ (port)
     (for ([l (in-lines port)])
       'do-something))))

;; cpu time: 161 real time: 212 gc time: 26
;; cpu time: 141 real time: 143 gc time: 23
;; cpu time: 144 real time: 153 gc time: 22

The times here are all in milliseconds, so reading in all the lines of a 10-megabyte file takes about a sixth of a second.

Does this match what you're seeing?
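
One way to make that comparison on your own data is to time the bare `in-lines` loop separately from the loop that does the real work. The sketch below is an illustration under assumptions, not the asker's code: `count-lines`, `timed`, the temporary file, and the 1000-line sample are all made up here.

```racket
#lang racket

;; Count lines without doing any per-line work, so the measurement
;; reflects the cost of in-lines alone.
(define (count-lines path)
  (call-with-input-file path
    (λ (port)
      (for/sum ([l (in-lines port)]) 1))))

;; Run a thunk; return its result and the elapsed real time in milliseconds.
(define (timed thunk)
  (define-values (results cpu real gc) (time-apply thunk '()))
  (values (first results) real))

;; Hypothetical sample input: 1000 short lines in a temporary file.
(define path (make-temporary-file))
(call-with-output-file path
  (λ (port)
    (for ([i (in-range 1000)])
      (displayln "abcde abcde" port)))
  #:exists 'truncate)

(define-values (n ms) (timed (λ () (count-lines path))))
(printf "~a lines read in ~a ms\n" n ms)

;; Clean up the sample file.
(delete-file path)
```

If the bare loop is fast on the real file but the full program is slow, the per-line work (or the output) is the more likely bottleneck.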

John Clements
  • Nice example. I changed `random-line` to `(build-string 80 (λ (i) (vector-ref chars (random charslen))))` to get a little bit of speed improvement. – Ludovic Kuty Dec 19 '20 at 06:54
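
The change described in this comment would look roughly like the sketch below. `chars` and `charslen` are the definitions from the answer, repeated so the snippet runs on its own; no timing is claimed here.

```racket
#lang racket

;; Same alphabet as in the answer: five letters plus a space.
(define chars (list->vector (string->list "abcde ")))
(define charslen (vector-length chars))

;; build-string fills an 80-character string directly, calling the
;; procedure once per index, so no intermediate list is allocated.
(define (random-line)
  (build-string 80 (λ (i) (vector-ref chars (random charslen)))))
```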