Grails CSV Plugin - Concurrency

Question

I am using the Grails CSV Plugin in my application with Grails 2.5.3. I need to add concurrency, for example with GPars, but I don't know how to do it.

At the moment, the processing is sequential. A fragment of my code:

Thanks.

score 1 · Answer 1 · answered May 09 '16 at 19:39

Implementing concurrency in this case may not give you much of a benefit; it really depends on where the bottleneck is. For example, if the bottleneck is reading the CSV file, there is little to gain, because the file can only be read sequentially. With that out of the way, here's the simplest example I could come up with:

import groovyx.gpars.GParsPool

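// Read every row up front (sequential), skipping the header line.
def tokens = csvFileLoad.inputStream.toCsvReader(['separatorChar': ';', 'charset': 'UTF-8', 'skipLines': 1]).readAll()
// Run the pipeline on a GPars thread pool and count the failed saves.
def failedSaves = GParsPool.withPool {
    tokens.parallel
        .map { it[0].trim() }                                    // first column holds the department name
        .filter { !Department.findByName(it) }                   // keep only names that are not stored yet
        .map { new Department(name: it) }
        .map { customImportService.saveRecordCSVDepartment(it) } // result is treated as truthy on success
        .map { it ? 0 : 1 }                                      // 1 for each failed save
        .sum()
}

// Roll back the whole import if any save failed.
if (failedSaves > 0) transactionStatus.setRollbackOnly()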

As you can see, the entire file is read first, which is the main bottleneck. Most of the processing is then done concurrently by the map(), filter(), and sum() methods. At the very end, the transaction is rolled back if any of the Departments failed to save.
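To try the shape of this pipeline outside Grails, here is a minimal sketch that swaps the GORM calls for in-memory stand-ins. The names `rows`, `existing`, and `save` are hypothetical and not part of the original code, and the `@Grab` line assumes GPars 1.2.1 can be fetched from Maven Central.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.GParsPool

// Hypothetical stand-ins for the parsed CSV rows and the database lookup.
def rows = [['Sales '], [' HR'], ['IT'], ['']]
def existing = ['HR'] as Set                    // pretend these departments already exist
def save = { String name -> !name.isEmpty() }   // pretend save: fails on blank names

def failedSaves = GParsPool.withPool {
    rows.parallel
        .map { it[0].trim() }             // take and clean the first column
        .filter { !(it in existing) }     // drop names that already exist
        .map { save(it) ? 0 : 1 }         // 1 for each failed save
        .sum()
}

println "failed saves: ${failedSaves}"
```

With these stand-ins, `HR` is filtered out as already existing and the blank name is the only failed save, so `failedSaves` is 1.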

Note: I chose a map()-sum() pair over anyParallel() to avoid converting the parallel array produced by map() to a regular Groovy collection, only for anyParallel() to build a new parallel array from it and convert the result back.
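For comparison, the anyParallel() variant mentioned in the note might look roughly like this. It is only a sketch with hypothetical stand-ins (`rows`, `save`), again assuming GPars 1.2.1; it shows the extra conversion step through `.collection` that the map()-sum() pair avoids.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.GParsPool

// Hypothetical stand-ins; not part of the original code.
def rows = [['Sales '], [''], ['IT']]
def save = { String name -> !name.isEmpty() }   // pretend save: fails on blank names

def anyFailed = GParsPool.withPool {
    // .collection turns the parallel array back into a regular list...
    def names = rows.parallel.map { it[0].trim() }.collection
    // ...and anyParallel() then works on that regular list.
    names.anyParallel { !save(it) }
}

println "any failed: ${anyFailed}"
```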

Improvements

As I already mentioned, in my example the CSV file is read completely before the concurrent execution begins. The example also attempts to save all of the Department instances, even if one fails to save. That may be what you want (it is what you demonstrated) or it may not.
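If stopping at the first failed save is preferred, one simple option is to give up concurrency for the save step and use Groovy's sequential every(), which stops iterating at the first falsy result. A minimal sketch with hypothetical stand-ins (`names`, `save`, `attempted`), not the original service code:

```groovy
// Hypothetical stand-ins; not part of the original code.
def names = ['Sales', '', 'IT']
def attempted = []                      // records which saves were actually tried
def save = { String name ->
    attempted << name
    !name.isEmpty()                     // pretend save: fails on blank names
}

// every() returns false as soon as one closure call returns a falsy value,
// so 'IT' is never attempted after the blank name fails.
def allSaved = names.every { save(it) }

println "all saved: ${allSaved}, attempted: ${attempted}"
```

The trade-off is that the saves run one at a time; whether that matters depends on where the bottleneck is.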

answered May 09 '16 at 19:39

Emmanuel Rosa


OK, thanks. I did not copy all of the code, only the main section. I will try it with this example. PS: When a row fails, execution stops and setRollbackOnly is called as well. – Jesús Iglesias May 09 '16 at 20:07
What is the difference between eachParallel, collectParallel and anyParallel? – Jesús Iglesias May 09 '16 at 21:17
They are the concurrent equivalent of Groovy's each, collect, and any. – Emmanuel Rosa May 09 '16 at 21:20
OK, allow me to elaborate. Let's take the methods `collect()`, `collectParallel()` and `map()`. Each of these methods does the same thing: it executes a closure for each item in a list, and builds a new list from the return value of the closure. The difference is that `collect()` iterates through the list in sequential order. `collectParallel()` breaks up the list and hands it off (indirectly) to threads, and then joins the list back together so it looks like the output of `collect()`. `map()` is like `collectParallel()` except it doesn't do the join step... – Emmanuel Rosa May 09 '16 at 22:17
For example, `map { it[0].trim() }` is able to execute the closure and immediately pass the result down to the next step without having to wait to process the entire list. On the other hand, `collectParallel { it[0].trim() }` must wait for the entire list to process before it can proceed to the next step. – Emmanuel Rosa May 09 '16 at 22:19
I added the full sequential version of my controller and services. I also wrote a parallel version, but it sometimes has side effects. Please help me. – Jesús Iglesias May 10 '16 at 17:08
Neither `collectParallel()` nor `collect()`, for that matter, is supposed to be used in that way. Why did you choose not to follow the lead I set in my example? – Emmanuel Rosa May 10 '16 at 17:29
You can look at my code and see all the checks that I do. Your example is too simple; besides, the "no session found" error appears in the threads. I need to cover all use cases. – Jesús Iglesias May 10 '16 at 17:46
I never intended my code to be used as-is. What I'm referring to is doing the process, including validation, as steps, using `map()`, `filter()` and friends. Yes, you'd have to create a session in each step when using GORM. And when needed, you can do some of the steps sequentially. My example is purposely simple. – Emmanuel Rosa May 10 '16 at 17:56
But I don't know how to implement it the way your example does. I need help. – Jesús Iglesias May 10 '16 at 18:08
I don't see the solution on your website. – Jesús Iglesias May 10 '16 at 19:27
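The difference between `collect()`, `collectParallel()` and `map()` described in the comments above can be tried with a small sketch. The `rows` data is hypothetical and the `@Grab` line assumes GPars 1.2.1; the sketch only shows that all three produce the same values in the same order, while the `map()` chain stays a parallel array until `.collection` is called at the end.

```groovy
@Grab('org.codehaus.gpars:gpars:1.2.1')
import groovyx.gpars.GParsPool

// Hypothetical rows, shaped like parsed CSV lines.
def rows = [[' a '], ['b '], [' c']]

// Plain Groovy: iterates the list in sequential order.
def sequential = rows.collect { it[0].trim() }

def joined, piped
GParsPool.withPool {
    // Concurrent, but returns a regular list once every item is done.
    joined = rows.collectParallel { it[0].trim() }
    // Stays a parallel array between steps; converted back only at the end.
    piped = rows.parallel
        .map { it[0].trim() }
        .map { it.toUpperCase() }
        .collection
}

println "sequential=${sequential} joined=${joined} piped=${piped}"
```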
