0

I need to update a single property on a large set of data in Oracle database with grails 2.5

Right now my code looks similarly to this:

    List booksToUpdate = []
    boolean moreBooks = true
    int offset = 0

    while (moreBooks) {
        def fetchedBooks = []
        if('Europe'){
            fetchedBooks = Book.findAllByAuthorInList(getEuropeanAuthors(),
                    [max: BATCHSIZE, offset: offset])
        } else {
            fetchedBooks = Book.findAllByAuthorInListAndPublishYear(getEnglishAuthors(), '1999',
                    [max: BATCHSIZE, offset: offset])
        }

        booksToUpdate.addAll(fetchedBooks)

        moreBooks = fetchedBooks.size() == BATCHSIZE
        offset += BATCHSIZE
    }

    booksToUpdate.each { book ->
        book.copyright = '2020'
        book.save(flush: true, failOnError: true)
    }

I would like to batch my updates for performance. Also, findAll query differs very slightly and it would be nice to build search criteria condifitionally. Ideally I want something like this:

    while (moreBooks) {
        def fetchedBooks = []

        def criteria = new DetachedCriteria(Book)
        criteria.build [max: BATCHSIZE, offset: offset] {
            List relevantAuthors = []
            if('Europe') {
                relevantAuthors = getEuropeanAuthors()
                eq 'publishYear', '1999'
            } else {
                relevantAuthors = getEnglishAuthors()
            }
            inList 'author', relevantAuthors
        }
        criteria.updateAll(copyright:'2020') //batch update

        moreBooks = fetchedBooks.size() == BATCHSIZE
        offset += BATCHSIZE
    }

Is there a way to do this? Doesn't have to be with DetachedCriteria. I looked at the guide but I can't find anything about passing max and offset. Is there a better way to make the code more elegant without compromising its performance?

  • I think according to groovy truth this `if('Europe')` will always evaluate to true and consequently the code block inside else won't be executed. Maybe you need to compare the value with the value of a variable https://groovy-lang.org/semantics.html#Groovy-Truth – Mario Jun 17 '20 at 21:09
  • This is pseudo-code `if('Europe')` represents some, non-hardcoded condition – iKnowNothing Jun 17 '20 at 22:43
  • ok, I understand – Mario Jun 17 '20 at 23:55

1 Answers1

0

updateAll updates everything, even if default max and offset are set. And I realized I need to get a list of updated instances back. So trying to force everything into a single updateAll that returns only count didn't make sense. Ended up with this:

List updateCopyright(String geo) {
    DetachedCriteria<Book> findCr = getQueryFromGeo(geo)
    
    List<String> updatedBookIds = []
    boolean moreBooks = true
    int offset = 0
    
    while (moreBooks) {
        List<Book> fetchedBooks = findCr.list([max: BATCHSIZE, offset: offset])
        List currentBatchIds = fetchedBooks*.id

        def updateCr = Book.where {
            inList 'id', currentBatchIds
        }
        int updatedCount = updateCr.updateAll(copyright: '2020')

        moreBooks = updatedCount == BATCHSIZE
        offset += BATCHSIZE

        updatedBookIds.addAll(currentBatchIds)
    }

    updatedBookIds
}


private DetachedCriteria<Book> getQueryFromGeo(String geo) {
    switch (geo) {
        case 'Europe':
            return Book.where {
                author in getEuropeanAuthors()
            }
        case 'England':
            return Book.where {
                author in getEnglishAuthors() &&
                        publishYear == '1999'
            }
        default:
            return Book.where {}
    }
}

Also played with building my query condifionally with createCriteria but readability suffered and no compile-time check that comes with where queries