0

I am fetching data from a large table (~1 million entries) with hibernate via scrollable resultset. The issue I am facing now is very strange and right now I don't seem to find the answer.

What I do is load the data with a pretty simple statement, scroll through the resultset and write XML strings into a zip file. I am batching the queries as Hibernate has issues when the amount of parameters within a "WHERE IN - clause" reaches a certain amount. Setting it as regular ParameterList is even worse. Thats why I do it the way below.

Problem now: I am generating file names from the unique key defined in the SQL (business key + type) . During the process I get a zipException which says, that the filename already exists. This should be impossible as the unique key prevents duplicate entries. Is it possible that I somehow wrongly use the scrollableresults ? Is the approach to fetch the data in batches causing this issue ? I am really left clueless here...

CREATE TABLE "lmh_repository"
(
   ID_REPOSITORY decimal(19,0) PRIMARY KEY IDENTITY (1,1) NOT NULL,
   ENTITY_TYPE varchar(255) NOT NULL,
   ENTITY_ID varchar(255) NOT NULL,
   ENTITY_VERSION int DEFAULT ((0)) NOT NULL,
   XML_DATA varbinary(MAX) NOT NULL
);

ALTER TABLE "lmh_repository"
ADD CONSTRAINT UQ_ENTITY_TYPE_ID UNIQUE(ENTITY_TYPE, ENTITY_ID)

Now the code that fetches the data.

 private void fetchXMLDataFromDB(Set<String> idList, ArchiveOutputStream aos, EntityTypeEnum type) {

    // not more than 5000 to avoid hibernate issues in 'WHERE IN - clause'
    Iterable<List<String>> partition = Iterables.partition(idList, BATCH_SIZE);

    StatelessSession session = sessionFactory.openStatelessSession();

    try {

        for (List<String> listPartition : partition) {
            String ids = getWhereInClause(listPartition);

            ScrollableResults scrollableResults = session.createQuery("FROM " + LMHRepository.class.getSimpleName() +
                    " WHERE entityType = :type AND entityId IN (" + ids + ")")
                    .setParameter("type", type)
                    .scroll(ScrollMode.FORWARD_ONLY);

            while (scrollableResults.next()) {

                LMHRepository repoFile = (LMHRepository) scrollableResults.get()[0];
                String fileName = getFileName(repoFile);

                aos.putArchiveEntry(new ZipArchiveEntry(fileName));
                IOUtils.write(repoFile.getXmlStringData(), aos);
                aos.closeArchiveEntry();
            }
        }
    } catch (IOException e) {
        //error handling
    } finally {
        session.close();
    }
m0rb
  • 63
  • 10
  • Are you sure you do not have duplicate in your database? " filename already exists" is filename generated from primary key? – Amer Qarabsa Nov 07 '17 at 10:07
  • As I stated in my description, I generate the filename with ENTITY_ID + ENTITY_TYPE which is defined to be a unique constraint. So no, there are no duplicates in the database. And I also hope that there are no duplicate primary key entries in the resultset...why would this happen? – m0rb Nov 07 '17 at 10:39
  • you should not have any duplicates , the exception is saying that the file already exist, if you run your code twice then you will face this issue, you need to make sure the file does not exist\ – Amer Qarabsa Nov 07 '17 at 12:18
  • As I am iterating through a resultset which was fetched with those two identifiers I expect there not to be any dupplicates. My getFilename() method does no magic than just concatenating those two identifiers. Even if the id-List would contain duplicates, the DBMS takes care of it. So a select * from foo where id in (1,2,1,2,1) would only deliver 2 results with id 1 and 2. – m0rb Nov 07 '17 at 13:20
  • I just found out, that this is seriously connected to using a "WHERE IN" clause. It seems that ScrollableResults are somehow behaving very strangely. I couldn't find anything in the documentation that mentions not to use anything else besides 'select * from table' ... – m0rb Nov 07 '17 at 15:53

0 Answers0