0

I am implementing a streaming source with ScalikeJDBC and need this to run over multiple DB types incl. Oracle, Sybase etc.

The documentation is slightly confusing and not sure whether this is an option:

At the current moment, scalikejdbc-streams natively supports MySQL and PostgreSQL. When using SQL#iterator factory method normally, ScalikeJDBC automatically enables required settings to use cursor feature. If you don’t prefer the behavior, you can customize adjusting DBSession attributes instead

Can streaming reads be handled over other DBs than MySQL and PostgreSQL?

Evan M.
  • 403
  • 5
  • 13

1 Answers1

1

(Since your question is about creating a streaming source, this answer addresses only the publisher side of the streaming support and ignores the subscriber side.)

Support for streaming requires that the database return query results a few rows at a time, typically based on a cursor, instead of all at once. Different databases have different ways to enable this. ScalikeJDBC natively supports the use of the streaming iterator method for the MySQL and PostgreSQL drivers. That is, with the MySQL and the PostgreSQL driver, the following works:

import scalikejdbc._
import scalikejdbc.streams._

// set up a connection pool

import scala.concurrent.ExecutionContext.Implicits.global

val publisher: DatabasePublisher[Int] = DB.readOnlyStream {
  sql"select id from users order by id".map(r => r.get[Int]("id")).iterator
}

The above works for MySQL and PostgreSQL because of this:

/**
 * Forcibly changes the database session to be cursor query ready.
 */
val defaultDBSessionForceAdjuster: DBSessionForceAdjuster = (session) => {

  // setup required settings to enable cursor operations
  session.connectionAttributes.driverName match {
    case Some(driver) if driver == "com.mysql.jdbc.Driver" && session.fetchSize.exists(_ > 0) =>
      /*
       * MySQL - https://dev.mysql.com/doc/connector-j/5.1/en/connector-j-reference-implementation-notes.html
       *
       * StreamAction.StreamingInvoker prepares the following required settings in advance:
       *
       * - java.sql.ResultSet.TYPE_FORWARD_ONLY
       * - java.sql.ResultSet.CONCUR_READ_ONLY
       *
       * If the fetchSize is set as 0 or less, we need to forcibly change the value with the Int min value.
       */
      session.fetchSize(Int.MinValue)

    case Some(driver) if driver == "org.postgresql.Driver" =>
      /*
       * PostgreSQL - https://jdbc.postgresql.org/documentation/94/query.html
       *
       * - java.sql.Connection#autocommit false
       * - java.sql.ResultSet.TYPE_FORWARD_ONLY
       */
      session.conn.setAutoCommit(false)

    case _ =>
  }
}

Notice that the last case clause means that ScalikeJDBC does not by default support a streaming iterator with drivers other than the ones for MySQL and PostgreSQL.

This doesn't mean that one can't use other drivers for streaming. The section of the documentation that you quoted has the following code sample:

val publisher: DatabasePublisher[Int] = DB readOnlyStream {
  sql"select id from users".map(r => r.int("id"))
    .iterator
    .withDBSessionForceAdjuster(session => {
      session.conn.setAutoCommit(true)
    })
}

What the documentation is saying is that to enable streaming for databases other than MySQL and PostgreSQL, you need to customize the DBSession attributes, like in the above example, such that cursor support is enabled. What exactly this customization entails (e.g., adjusting the fetchSize or disabling autoCommit on the connection), depends on the driver (assuming that the driver supports the retrieval of query results a small number of rows at a time).

Jeffrey Chung
  • 19,319
  • 8
  • 34
  • 54