
I use Hibernate Search 5.11 in my Spring Boot 2 application to provide full-text search. This library requires documents to be indexed.

Once my app is running, I manually re-index the data of an indexed entity (MyEntity.class) every five minutes (for reasons specific to my server context).

MyEntity.class has an attachedFiles property, a HashSet populated through a lazily loaded @OneToMany association:

@OneToMany(mappedBy = "myEntity", cascade = CascadeType.ALL, orphanRemoval = true)
private Set<AttachedFile> attachedFiles = new HashSet<>();

I wrote the indexing code, but an exception is thrown on "fullTextSession.index(result);" when the attachedFiles property of a given entity contains one or more items:

org.hibernate.TransientObjectException: The instance was not associated with this session

In debug mode, the HashSet value of such an entity shows a message like "Unable to load [...]".

If the HashSet is empty (not null, just empty), no exception is thrown.

My indexing method:

private void indexDocumentsByEntityIds(List<Long> ids) {

    final int BATCH_SIZE = 128;

    Session session = entityManager.unwrap(Session.class);

    FullTextSession fullTextSession = Search.getFullTextSession(session);
    fullTextSession.setFlushMode(FlushMode.MANUAL);
    fullTextSession.setCacheMode(CacheMode.IGNORE);

    CriteriaBuilder builder = session.getCriteriaBuilder();
    CriteriaQuery<MyEntity> criteria = builder.createQuery(MyEntity.class);
    Root<MyEntity> root = criteria.from(MyEntity.class);
    criteria.select(root).where(root.get("id").in(ids));

    TypedQuery<MyEntity> query = fullTextSession.createQuery(criteria);

    List<MyEntity> results = query.getResultList();

    int index = 0;

    for (MyEntity result : results) {
        index++;
        try {
            fullTextSession.index(result); //index each element
            if (index % BATCH_SIZE == 0 || index == ids.size()) {
                fullTextSession.flushToIndexes(); //apply changes to indexes
                fullTextSession.clear(); //free memory since the queue is processed
            }
        } catch (TransientObjectException toEx) {
            LOGGER.info(toEx.getMessage());
            throw toEx;
        }
    }
}

Does anyone have an idea?

Thanks!

j.2bb

2 Answers


This is probably caused by the "clear" call you have in your loop.

In essence, what you're doing is:

  • load all entities to reindex into the session
  • index one batch of entities
  • remove all entities from the session (fullTextSession.clear())
  • try to index the next batch of entities, even though they are not in the session anymore... ?

What you need to do is load each batch of entities only after clearing the session, so that you're sure they are still in the session when you index them.
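To make the sequence above concrete, here is a deliberately simplified toy model in plain Java. ToySession and ClearDemo are hypothetical names made up for this illustration; they are not Hibernate classes, and real Hibernate behavior (lazy collections in particular) is more subtle. The sketch only assumes that indexing needs the entity to still be attached to the session.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy stand-in for a persistence session; NOT Hibernate's API.
class ToySession {
    private final Set<Long> managedIds = new HashSet<>();

    // Simulates running a query: the returned entities become managed.
    List<Long> load(List<Long> ids) {
        managedIds.addAll(ids);
        return new ArrayList<>(ids);
    }

    // Simulates index(): only works for entities still attached to the session.
    void index(Long id) {
        if (!managedIds.contains(id)) {
            throw new IllegalStateException("Entity " + id + " is not associated with this session");
        }
    }

    // Simulates clear(): detaches every entity.
    void clear() {
        managedIds.clear();
    }
}

class ClearDemo {
    // Failing pattern: load everything up front, clear after each batch.
    // Returns how many entities were indexed before the first failure.
    static int loadOnceThenClear(List<Long> ids, int batchSize) {
        ToySession session = new ToySession();
        List<Long> entities = session.load(ids);
        int indexed = 0;
        try {
            for (Long entity : entities) {
                session.index(entity);
                indexed++;
                if (indexed % batchSize == 0) {
                    session.clear();
                }
            }
        } catch (IllegalStateException e) {
            // The first entity of the second batch was detached by clear().
        }
        return indexed;
    }

    // Working pattern: load each batch only after the previous clear.
    static int loadPerBatch(List<Long> ids, int batchSize) {
        ToySession session = new ToySession();
        int indexed = 0;
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            for (Long entity : session.load(ids.subList(start, end))) {
                session.index(entity);
                indexed++;
            }
            session.clear();
        }
        return indexed;
    }
}
```

With five IDs and a batch size of 2, the first method stops after the first batch, while the second one gets through all five.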

There's an example of how to do this in the documentation, using a scroll and an appropriate batch size: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#search-batchindex-flushtoindexes

Alternatively, you can just split your ID list into smaller lists of 128 elements, and for each of these lists, run a query to get the corresponding entities, reindex all these 128 entities, then flush and clear.
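As a rough sketch of the splitting part only, the helper below walks an ID list in chunks and hands each chunk to a callback. IdBatches, forEachBatch and batchSizes are hypothetical names chosen for this example; in a real application the callback would presumably run the query, index the results, then flush and clear.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class IdBatches {
    // Calls 'action' once per consecutive chunk of at most 'batchSize' ids.
    static <T> void forEachBatch(List<T> ids, int batchSize, Consumer<List<T>> action) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            action.accept(ids.subList(start, end));
        }
    }

    // Helper for checking batch boundaries: returns the size of each batch.
    static List<Integer> batchSizes(int idCount, int batchSize) {
        List<Long> ids = new ArrayList<>();
        for (long i = 1; i <= idCount; i++) {
            ids.add(i);
        }
        List<Integer> sizes = new ArrayList<>();
        forEachBatch(ids, batchSize, batch -> sizes.add(batch.size()));
        return sizes;
    }
}
```

For 300 IDs and a batch size of 128, this yields batches of 128, 128 and 44 elements, so the last, shorter batch still gets its own flush and clear.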

yrodiere

Thanks for the explanation, @yrodiere, it helped me a lot!

I chose your alternative solution:

Alternatively, you can just split your ID list into smaller lists of 128 elements, and for each of these lists, run a query to get the corresponding entities, reindex all these 128 entities, then flush and clear.

...and everything works perfectly!

Well spotted!

Here is the code for the solution:

private <T> List<List<T>> splitList(List<T> list, int subListSize) {

    List<List<T>> splittedList = new ArrayList<>();

    if (!CollectionUtils.isEmpty(list)) {

        int nbItems = list.size();

        // Walk the list in steps of subListSize; the last sub-list may be shorter
        for (int i = 0; i < nbItems; i += subListSize) {
            int lastSubListIndex = Math.min(i + subListSize, nbItems);
            splittedList.add(list.subList(i, lastSubListIndex));
        }
    }

    return splittedList;
}


private <T> void indexDocumentsByEntityIds(Class<T> clazz, String entityIdPropertyName, List<Object> ids) {

    Session session = entityManager.unwrap(Session.class);

    FullTextSession fullTextSession = Search.getFullTextSession(session);
    fullTextSession.setFlushMode(FlushMode.MANUAL);
    fullTextSession.setCacheMode(CacheMode.IGNORE);

    List<List<Object>> splittedIdsLists = splitList(ids, 128);

    for (List<Object> splittedIds : splittedIdsLists) {

        Transaction transaction = fullTextSession.beginTransaction();

        // Load only the current batch, so its entities are still in the session when indexed
        CriteriaBuilder builder = session.getCriteriaBuilder();
        CriteriaQuery<T> criteria = builder.createQuery(clazz);
        Root<T> root = criteria.from(clazz);
        criteria.select(root).where(root.get(entityIdPropertyName).in(splittedIds));

        TypedQuery<T> query = fullTextSession.createQuery(criteria);

        List<T> results = query.getResultList();

        try {
            for (T result : results) {
                fullTextSession.index(result); //index each element
            }
            // Flush and clear once per batch, even if the query returned fewer entities than IDs
            fullTextSession.flushToIndexes(); //apply changes to indexes
            fullTextSession.clear(); //free memory since the queue is processed
        } catch (TransientObjectException toEx) {
            LOGGER.info(toEx.getMessage());
            throw toEx;
        }

        transaction.commit();
    }
}
j.2bb