
According to Datomic's Connection documentation:

Datomic connections do not adhere to an acquire/use/release pattern. They are thread-safe, cached, and long lived. Many processes (e.g. application servers) will never call release.

I'm interested in how this is achieved in practice, specifically for SQL connections. From the client/user perspective this is great: you don't need to worry about the underlying connection pool at all, which simplifies client code and significantly reduces what you need to reason about. It's something I'd love to replicate in other applications that use SQL connections.
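
For context, the contrast I have in mind looks roughly like this. It is only a sketch: the JDBC side uses a generic DataSource, the Datomic URI and the users table are placeholders, and the Datomic side uses the Java peer API, where connect returns a cached, shareable handle.

```java
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

import datomic.Peer;

public class ConnectionStyles {
    // Typical JDBC: acquire a connection from a pool, use it, release it.
    static int countUsers(DataSource ds) throws Exception {
        try (Connection c = ds.getConnection();                  // acquire
             Statement s = c.createStatement();
             ResultSet rs = s.executeQuery("select count(*) from users")) {
            rs.next();
            return rs.getInt(1);                                 // use
        }                                                        // release (close)
    }

    // Datomic: connect once, keep the handle forever, share it across threads.
    static final datomic.Connection CONN = Peer.connect(
            "datomic:sql://my-db?jdbc:postgresql://localhost:5432/datomic?user=datomic&password=datomic");

    static datomic.Database currentDb() {
        return CONN.db();  // no acquire/release around individual uses
    }
}
```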

Breaking down the question into smaller parts:

  • What challenges needed to be considered when treating Datomic connections as long-lived?
  • Is the approach suitable in general when dealing with JDBC connections, or is it only suitable for a subclass of problems (including Datomic's)?
  • I can see that Tomcat's JDBC connection pool is used under the hood; how is this pooling used to achieve long-lived connections from the Datomic connection's perspective?
  • In practice, when do you use separate JDBC connections behind the scenes, e.g. do you use separate connections for reads vs. writes?
Matthew Gretton
  • The first thing I'd investigate is why the *connection pool is used at all* when *many processes ... will never call release* of a connection... – Marmite Bomber Nov 16 '20 at 13:40

1 Answer


"It depends" :)

For example, if you have memcached set up, the Datomic peer potentially doesn't have to talk to the SQL database at all. Datomic uses SQL to fetch chunks of encoded data, not to do structured queries, so if a chunk is present in memcached, no SQL access is needed. Also, the peer will get new chunks sent to it by the transactor, so if you're particularly lucky, everything is already available in your peer before you run your first query.
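
Conceptually, that read path amounts to a tiered lookup. The sketch below is not Datomic's actual code; every class and method name in it is made up, and it only illustrates the ordering (peer cache, then memcached, then SQL):

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Conceptual sketch only: the names are hypothetical, not Datomic's internals.
class ChunkStore {
    private final Map<String, byte[]> localCache = new ConcurrentHashMap<>(); // in-peer cache
    private final MemcachedLookup memcached;   // hypothetical memcached wrapper
    private final SqlChunkFetcher sql;         // hypothetical JDBC-backed fetcher

    ChunkStore(MemcachedLookup memcached, SqlChunkFetcher sql) {
        this.memcached = memcached;
        this.sql = sql;
    }

    byte[] fetch(String chunkId) {
        // 1. Already in the peer? No network access at all.
        byte[] chunk = localCache.get(chunkId);
        if (chunk != null) return chunk;

        // 2. In memcached? No SQL needed.
        Optional<byte[]> cached = memcached.get(chunkId);
        if (cached.isPresent()) {
            localCache.put(chunkId, cached.get());
            return cached.get();
        }

        // 3. Fall back to the SQL storage, then populate the caches.
        chunk = sql.fetch(chunkId);
        memcached.put(chunkId, chunk);
        localCache.put(chunkId, chunk);
        return chunk;
    }

    interface MemcachedLookup {
        Optional<byte[]> get(String key);
        void put(String key, byte[] value);
    }

    interface SqlChunkFetcher {
        byte[] fetch(String chunkId);
    }
}
```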

If a chunk is not already in the peer, and is not already in memcached, the peer needs to connect to the SQL database to fetch chunks. But this all happens under the hood, and is managed by the Tomcat connection pool as you mentioned. Generally, the idea is that for a query to run successfully, the index has to pull any missing chunks from storage (memcached, SQL, ...), and this happens in a lazy fashion. But the Datomic connection itself "lives forever", i.e. this is all managed for you, and you don't have to create N connections depending on how much traffic your peer has.
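
If you want to replicate that "long-lived handle on the outside, pooled JDBC connections on the inside" shape yourself, one way to do it with Tomcat's pool looks roughly like this. The table and column names are invented for the example; the point is that callers hold one object forever while JDBC connections are borrowed and returned per call.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.tomcat.jdbc.pool.DataSource;
import org.apache.tomcat.jdbc.pool.PoolProperties;

// Long-lived, thread-safe object you create once and share everywhere.
public class LongLivedStorage {
    private final DataSource pool;

    public LongLivedStorage(String jdbcUrl, String user, String password) {
        PoolProperties p = new PoolProperties();
        p.setUrl(jdbcUrl);
        p.setDriverClassName("org.postgresql.Driver");
        p.setUsername(user);
        p.setPassword(password);
        p.setInitialSize(2);
        p.setMaxActive(16);            // bounded, regardless of caller count
        p.setTestOnBorrow(true);
        p.setValidationQuery("select 1");
        this.pool = new DataSource(p);
    }

    // Callers never see JDBC connections; each call borrows one from the
    // pool and returns it immediately, so this facade can live forever.
    public byte[] fetchChunk(String chunkId) throws SQLException {
        try (Connection c = pool.getConnection();
             PreparedStatement ps = c.prepareStatement(
                     "select val from chunks where id = ?")) {
            ps.setString(1, chunkId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getBytes(1) : null;
            }
        }
    }

    public void close() {
        pool.close();  // only needed on full shutdown, analogous to release
    }
}
```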

As for writes, those go through the transactor; the peer does not connect directly to storage for them. Writes are represented as the EDN data structure we're all familiar with (the list of lists with :db/add etc.), and are shipped off to the transactor, queued, and processed in sequence. The transactor then connects directly to storage when it needs to, but that's obviously a separate concern that does not affect the peer in any way.
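
For completeness, here is roughly what the write path looks like from the peer's point of view with the Java peer API. The URI and the :person/name attribute are placeholders; the attribute would have to already be installed in the schema.

```java
import java.util.List;
import java.util.Map;

import datomic.Connection;
import datomic.Peer;
import datomic.Util;

public class WritePath {
    public static void main(String[] args) throws Exception {
        Connection conn = Peer.connect(
                "datomic:sql://my-db?jdbc:postgresql://localhost:5432/datomic?user=datomic&password=datomic");

        // Transaction data is just data: a list of lists (or maps).
        List tx = Util.list(
                Util.list(":db/add", Peer.tempid(":db.part/user"),
                          ":person/name", "Ada Lovelace"));

        // The peer ships this to the transactor's queue; the transactor is
        // the only process that writes to storage.
        Map result = conn.transact(tx).get();
        System.out.println(result.get(Connection.TX_DATA));
    }
}
```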

I hope this was clarifying :)

August Lilleaas