1

Language: Racket (with SQL query code/pointer) Libraries: db, racket/stream, racket/sequence

Goal: lazily process the value of sql queries using streams in Racket.

Question 1: how do you manipulate SQL query stream objects in Racket? (I can get the stream-first value of the stream but not the rest of the stream!)

#lang racket/base
(require db
         racket/sequence
         racket/stream)

(define db_sql_local
  (mysql-connect
   #:user "<my-username>"
   #:database "<my-database>"
   #:server "<my-server>"
   #:port <my-port>
   #:password "<my-password>"))

;; PROBLEM 1 HERE
(define test-stream
  (sequence->stream
   (in-query
    chembl_sql_local
    "SELECT * FROM <table-name>"
    #:fetch +inf.0)))

(stream-first test-stream)

;; stream-first of test-stream returns the first-row of the table as a '(#vector). 


Any advice or comments would be greatly appreciated - Thank you!

Triage
  • 21
  • 1
  • 3

2 Answers2

2

First, the sequence returned by in-query does not contain vectors; each "element" of the sequence contains multiple values, one per column returned. See the paragraph in the Sequence docs starting "Individual elements of a sequence..." about multiple-valued elements.

Second, using #:fetch +inf.0 (the default behavior) means that all rows are fetched before the sequence is returned. So there's nothing lazy about the code above; you could use query-rows instead and get a list, which would be easier to work with (and query-rows does represent each row as a vector).

Finally, use stream-rest to get the rest of a stream. For example:

(require db racket/stream racket/sequence)
(define c (sqlite3-connect #:database 'memory))
(define qseq (in-query c "SELECT 1, 2 UNION SELECT 3, 4" #:fetch 1))
qseq
;; => #<sequence>
(define qstream (sequence->stream qseq))
qstream
;; => #<stream>
(stream-first qstream)
;; => 1 2
(stream-rest qstream)
;; => #<stream>
(stream-first (stream-rest qstream))
;; => 3 4
Ryan Culpepper
  • 10,495
  • 4
  • 31
  • 30
1

thanks for your quick reply. The #:fetch 1 arg was definitely what I was looking for to make it lazy. I've attached updated code that should lazily stream sql queries to export tsv files.

(define sql_server
  (mysql-connect
   #:user <username>
   #:database <db-name>
   #:server <server>
   #:port <port-num>
   #:password <password>))

(define query-->stream
  (lambda (db-conn query)
    (sequence->stream
     (in-query
      db-conn
      query
      #:fetch 1))))

(define print-table-row-to-tsv
  (lambda (ls port)
    (cond
      ((null? ls)
       (fprintf port "~c" #\newline)
       (void))
      ((sql-null? (car ls))
       (fprintf port "~a~c" "NULL" #\tab)
       (print-table-row-to-tsv (cdr ls) port))
      ((null? (cdr ls))
       (fprintf port "~a" (car ls))
       (print-table-row-to-tsv (cdr ls) port))
      (else
       (fprintf port "~a~c" (car ls) #\tab)
       (print-table-row-to-tsv (cdr ls) port)))))

(define get-table-col-names
  (lambda (db-conn tbl-name)
    (map (lambda (x) (vector-ref x 0))
       (query-rows db-conn (string-append "DESCRIBE " tbl-name)))))

(define export-query-result-to-tsv
  (lambda (db-conn tbl-name query)
    (let* ((tbl-col-names (get-table-col-names db-conn tbl-name))
           (output-file (open-output-file (format "~achembl_~a_table.tsv" (find-system-path 'home-dir) tbl-name) #:exists 'replace))        
           (stream (query-->stream db-conn query)))
      (begin
        (print-table-row-to-tsv tbl-col-names output-file)
        (process-stream-to-tsv stream output-file)
        (close-output-port output-file)))))

(define process-stream-to-tsv
  (lambda (stream port)
    (cond
      ((stream-empty? stream)
       (void))
      (else
       (begin
         (print-table-row-to-tsv (call-with-values (lambda () (stream-first stream)) list) port)
         (process-stream-to-tsv (stream-rest stream) port))))))


(export-query-result-to-tsv sql_server "<table-name>" "SELECT * FROM <table-name>;")

Triage
  • 21
  • 1
  • 3