First up, I have a large Sqlite database. I want to copy a subset of the data in this database to another Sqlite database in a streaming fashion using the Haskell persistent library.
Here's the only I managed to do it, but its very slow (and uses more memory) compared to loading the data in memory and then writing it to the new database.
import Database.Persist.Sqlite (SqliteConf(..), runSqlite, runMigration, runSqlPersistMPool)
import Database.Persist (entityVal, Entity, createPoolConfig)
import Data.Pool (Pool)
import Data.Conduit (ConduitT, (.|), runConduit)
import qualified Data.Conduit.List as CL
main :: RIO App ()
main = do
ipool <- liftIO $ createPoolConfig (SqliteConf "mydb.db" 1)
opool <- liftIO $ createPoolConfig (SqliteConf "mydbsmall.db" 1)
liftIO $ flip runSqlPersistMPool opool $ do
runMigration migrateAll
liftIO $ flip runSqlPersistMPool ipool $ do
runConduit $ selectSource (distinct $ from $ \pays -> return pays) .| (migrate opool)
migrate :: (Monad m, MonadIO m)
=> Pool SqlBackend
-> ConduitT (Entity Pays) Void m ()
migrate pool = CL.mapM_ (\e -> liftIO $ flip runSqlPersistMPool pool $ insertKey (entityKey e) (entityVal e))
The problem is in the migrate
function since I'm calling runSqlPersistMPool
on every single data in my conduit stream.
What is the right way to do it ?