13

I want to fetch all records from a very big table using Slick. If I try to do this with `foreach`, a `for` comprehension, or `list`, I get an out-of-memory error.

Is there any way to use "cursors" or lazy loading with Slick, so that each object is fetched only when needed and memory use stays low?

Adrián
Octavio Luna
  • Not sure why foreach would result in an OOM, it should only proceed one element at a time. You can instead try elements(), which will return a CloseableIterator. If that also results in an OOM, post the rest of your code. – Saish Mar 21 '13 at 15:47

3 Answers

5

Not sure what you mean by cursors, but you can fetch partial data using pagination:

`query.drop(0).take(1000)` takes the first 1000 records.

`query.drop(1000).take(1000)` takes records 1001 to 2000 of the table.

How efficient these queries are depends on your database: whether it supports this kind of pagination and whether the table is properly indexed.
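To walk the whole table this way, the offset can be advanced in a loop until a page comes back empty. Below is a minimal sketch of that idea, not Slick code: `Paging`, `pages`, and `fetchPage` are hypothetical names, and `fetchPage(offset, limit)` stands in for running `query.drop(offset).take(limit)` and materialising only that page. The demo simulates the table with an in-memory `Vector`.

```scala
object Paging {
  /** Lazily yields one page at a time until a page comes back empty.
    * `fetchPage(offset, limit)` is a hypothetical stand-in for executing
    * `query.drop(offset).take(limit)` and loading just that page. */
  def pages[A](pageSize: Int)(fetchPage: (Int, Int) => Seq[A]): Iterator[Seq[A]] = {
    require(pageSize > 0, "pageSize must be positive")
    Iterator
      .from(0, pageSize)                          // offsets 0, pageSize, 2 * pageSize, ...
      .map(offset => fetchPage(offset, pageSize)) // nothing is fetched until the page is requested
      .takeWhile(_.nonEmpty)                      // stop at the first empty page
  }

  def main(args: Array[String]): Unit = {
    // Simulated table; a real fetchPage would query the database instead.
    val table = (1 to 2500).toVector
    val allPages = pages(1000)((offset, limit) => table.slice(offset, offset + limit))
    allPages.foreach(page => println(s"got a page of ${page.size} rows"))
  }
}
```

Only one page is held in memory at a time, as long as each page is processed and dropped before the next one is requested. With a real database the query should also have a stable sort order, otherwise consecutive pages may overlap or skip rows.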

dirceusemighini
  • Yeah, I thought of that. The thing is that I wanted something like a `fetchNext` just to grab the next record in the result set. It seems that `foreach` fetches all the results into a list (running out of memory in the process). Maybe I can implement my own function to fetch part of the result set using the pagination technique. THANKS – Octavio Luna Jan 18 '13 at 01:51
  • Why not use the slick [iterator](http://slick.typesafe.com/doc/1.0.1/api/index.html#scala.slick.util.CloseableIterator)? – matanster Dec 13 '14 at 20:43
  • @matt At some point in the past the `PositionedResult` iterator was causing an out-of-memory error, but I think that issue has been solved. Iterator also has the `take` and `drop` methods, but I think you mean using the `next` and `hasNext` methods, right? – dirceusemighini Dec 18 '14 at 17:23
1

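You could combine `iterator`, which returns an iterator over the results instead of loading them all:

val objects = Objects.where(...).map(w => w).iterator()

with `grouped`, which splits that iterator into chunks:

val chunkSize = 1000
val groupedObjects = objects.grouped(chunkSize)
// each chunk holds at most chunkSize rows and is processed in parallel
groupedObjects.foreach { chunk => chunk.par.map(h => doJob(h)) }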

as suggested in this answer.
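The reason chunking helps is that `grouped` on a Scala `Iterator` is itself lazy: it pulls at most `chunkSize` elements from the underlying iterator for each chunk. Here is a small self-contained sketch of that behaviour, with a plain `Iterator.range` standing in for the iterator a Slick query would return. `ChunkedIteration`, `processInChunks`, and `handle` are hypothetical names, and the chunks are processed sequentially here because `.par` needs the separate parallel-collections module on Scala 2.13 and later.

```scala
object ChunkedIteration {
  /** Processes `rows` in chunks of `chunkSize` and returns how many rows were handled.
    * `rows` stands in for a query result iterator; `handle` is a hypothetical per-row job. */
  def processInChunks[A](rows: Iterator[A], chunkSize: Int)(handle: A => Unit): Long = {
    var processed = 0L
    rows.grouped(chunkSize).foreach { chunk => // each chunk has at most chunkSize rows
      chunk.foreach(handle)
      processed += chunk.size
    }
    processed
  }

  def main(args: Array[String]): Unit = {
    // Simulated result set: elements are produced on demand, not stored up front.
    val rows = Iterator.range(0, 2500)
    val total = processInChunks(rows, 1000)(_ => ())
    println(s"processed $total rows")
  }
}
```

This only keeps memory bounded if the underlying iterator really streams rows from the database; if the driver or the query layer buffers the full result set first, chunking on top of it will not help.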

Community
Mermoz
0

dirceusemighini's answer is correct. I ran into a similar issue a few days ago because of a wrong assumption about `Query.list()`, so I can give some more context. From the Slick reference:

"Queries are executed using methods defined in the Invoker trait (or UnitInvoker for the parameterless versions). There is an implicit conversion from Query, so you can execute any Query directly. The most common usage scenario is reading a complete result set into a strict collection with a specialized method such as `list` or the generic method `to` which can build any kind of collection."

So `Query.list()` does load the complete result set into memory. With this in mind, there are several ways to approach your problem.

oVo