Datomic: How do I query across any number of databases inside of a query?

Question

I'm using Datomic and would like to pull entire entities from any number of points in time based on my query. The Datomic docs have some decent examples about how I can perform queries from two different database instances if I know those instances before the query is performed. However, I'd like my query to determine the number of "as-of" type database instances I need and then use those instances when pulling the entities. Here's what I have so far:

(defn pull-entities-at-change-points [entity-id]
  (->>
    (d/q
      '[:find ?tx (pull ?dbs ?client [*])
        :in $ [?dbs ...] ?client
        :where
        [?client ?attr-id ?value ?tx true]
        [(datomic.api/ident $ ?attr-id) ?attr]
        [(contains? #{:client/attr1 :client/attr2 :client/attr3} ?attr)]
        [(datomic.api/tx->t ?tx) ?t]
        [?tx :db/txInstant ?inst]]
      (d/history (d/db db/conn))
      (map #(d/as-of (d/db db/conn) %) [1009 1018])
      entity-id)
    (sort-by first)))

I'm trying to find all transactions in which certain attributes on a :client entity changed and then pull the entity as it existed at those points in time. The line (map #(d/as-of (d/db db/conn) %) [1009 1018]) is my attempt to create a sequence of database instances at two specific transactions where I know the client's attributes changed. Ideally, I'd like to do all of this in one query, but I'm not sure if that's possible.

Hopefully this makes sense, but let me know if you need more details.

score 4 · Accepted Answer · answered Oct 21 '15 at 18:27

I would split out the pull calls to be separate API calls instead of using them in the query. I would keep the query itself limited to getting the transactions of interest. One example solution for approaching this would be:

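(defn pull-entities-at-change-points
  [db eid]
  (let [hdb (d/history db)
        ;; transactions in which one of the listed attributes was asserted on eid
        txs (d/q '[:find [?tx ...]
                   :in $ [?attr ...] ?eid
                   :where
                   [?eid ?attr _ ?tx true]]
                 hdb
                 [:person/firstName :person/friends]
                 eid)
        ;; one as-of database value per transaction of interest
        as-of-dbs (map #(d/as-of db %) txs)
        ;; pair each database's as-of point with the entity pulled from it
        pull-w-t (fn [as-of-db]
                   [(d/as-of-t as-of-db)
                    (d/pull as-of-db '[*] eid)])]
    (map pull-w-t as-of-dbs)))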

This function against a db I built with a toy schema would return results like:

([1010
  {:db/id 17592186045418
   :person/firstName "Gerry"
   :person/friends [{:db/id 17592186045419} {:db/id 17592186045420}]}]
 [1001
  {:db/id 17592186045418
   :person/firstName "Jerry"
   :person/friends [{:db/id 17592186045419} {:db/id 17592186045420}]}])
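The pairs above come back in no particular order, whereas the code in the question sorted them with (sort-by first). As a minimal sketch, using only the sample data shown above and plain Clojure (attr-timeline is a hypothetical helper name, not part of Datomic), the snapshots could be ordered oldest-first and reduced to the history of a single attribute like this:

```clojure
;; Sample data copied from the result above: [t pulled-entity] pairs.
(def results
  [[1010 {:db/id 17592186045418
          :person/firstName "Gerry"
          :person/friends [{:db/id 17592186045419} {:db/id 17592186045420}]}]
   [1001 {:db/id 17592186045418
          :person/firstName "Jerry"
          :person/friends [{:db/id 17592186045419} {:db/id 17592186045420}]}]])

;; Order the snapshots oldest-first by their t value.
(def sorted-results (sort-by first results))

;; Hypothetical helper: the value of one attribute at each change point.
(defn attr-timeline [attr pairs]
  (mapv (fn [[t entity]] [t (get entity attr)]) pairs))

(attr-timeline :person/firstName sorted-results)
;; => [[1001 "Jerry"] [1010 "Gerry"]]
```

Because the function returns ordinary Clojure data, this kind of post-processing needs no further queries.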

A few points I'll comment on:

The function above takes a database value as an argument instead of getting databases from the ambient/global conn.
We map pull over the as-of database for each time t of interest.
Using the pull API as the entry point, rather than query, is appropriate when we already have the entity id and other information on hand and just want attributes or to traverse references.
The impetus to get everything done in one big query doesn't really exist in Datomic, since the relevant segments will already have been realized in the peer's cache. In other words, you're not saving a round trip by using one query.
The collection binding form ([?attr ...]) is preferred over contains? and leverages query caching.
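To illustrate the last point: a query is plain data, so the difference can be seen without a database. The sketch below is only an illustration, and it assumes the query cache is keyed on the query data itself. predicate-query is a hypothetical builder that splices the attribute set into the query the way the contains? version does, while binding-query is the query from the answer.

```clojure
;; Hypothetical builder: the attribute set is embedded in the query itself,
;; so every distinct set produces a distinct query value.
(defn predicate-query [attrs]
  [:find '[?tx ...]
   :in '$ '?eid
   :where
   '[?eid ?a _ ?tx true]
   '[(datomic.api/ident $ ?a) ?attr]
   [(list 'contains? attrs '?attr)]])

;; With a collection binding the query is one constant value;
;; the attributes arrive as an ordinary input.
(def binding-query
  '[:find [?tx ...]
    :in $ [?attr ...] ?eid
    :where [?eid ?attr _ ?tx true]])

;; Two attribute sets give two different queries with the predicate form...
(= (predicate-query #{:person/firstName})
   (predicate-query #{:person/friends}))
;; => false

;; ...whereas binding-query is reused unchanged, e.g.
;; (d/q binding-query hdb [:person/firstName] eid)
;; (d/q binding-query hdb [:person/friends] eid)
```

The two commented d/q calls assume the hdb and eid names from the answer's function and are not executed here.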

Thank you for the response. That's very helpful! – Stephen Hopper, Oct 21 '15 at 19:21
