1

First up, I have a large Sqlite database. I want to copy a subset of the data in this database to another Sqlite database in a streaming fashion using the Haskell persistent library.

Here's the only I managed to do it, but its very slow (and uses more memory) compared to loading the data in memory and then writing it to the new database.

import Database.Persist.Sqlite (SqliteConf(..), runSqlite, runMigration, runSqlPersistMPool)
import Database.Persist (entityVal, Entity, createPoolConfig)
import Data.Pool (Pool)
import Data.Conduit (ConduitT, (.|), runConduit)
import qualified Data.Conduit.List as CL

main :: RIO App ()
main = do
  ipool <- liftIO $ createPoolConfig (SqliteConf "mydb.db" 1)
  opool <- liftIO $ createPoolConfig (SqliteConf "mydbsmall.db" 1)

  liftIO $ flip runSqlPersistMPool opool $ do
    runMigration migrateAll

  liftIO $ flip runSqlPersistMPool ipool $ do
    runConduit $ selectSource (distinct $ from $ \pays -> return pays) .| (migrate opool)

migrate :: (Monad m, MonadIO m)
        => Pool SqlBackend
        -> ConduitT (Entity Pays) Void m ()
migrate pool = CL.mapM_ (\e -> liftIO $ flip runSqlPersistMPool pool $ insertKey (entityKey e) (entityVal e))

The problem is in the migrate function since I'm calling runSqlPersistMPool on every single data in my conduit stream.

What is the right way to do it ?

duplode
  • 33,731
  • 7
  • 79
  • 150
kolam
  • 731
  • 4
  • 17

0 Answers0