
My requirement is to encrypt the personal-identification columns of a table. For this I have written a small piece of code that selects the data in batches and inserts it into a new table with a few extra columns.

The problem is that when I run my code, it works fine in the beginning but then stops inserting records into the DB. It does not print any exception either.

This is my connection pool config.

private BlockingQueue<EntityManager> getConnectionPool(int poolSize) throws InterruptedException {
    BlockingQueue<EntityManager> queue = new ArrayBlockingQueue<>(poolSize);
    for (int i = 0; i < poolSize; i++) {
        EntityManager entityManager = connectionService.getEm();
        queue.put(entityManager);
    }
    return queue;
}

This is the class where everything starts. It calculates the total number of batches and submits one task per batch to the executor service.

public void insertData() throws InterruptedException {

    key = hash(key);
    EntityManager entityManager = connectionService.getEm();
    EntityTransaction entityTransaction = entityManager.getTransaction();
    BlockingQueue<EntityManager> queue = getConnectionPool( 200);
    try {
        int batchSize= 1000;
        BigInteger totalResults = partnerRepository.getCountCustomerLedger(entityManager);

        double totalPages = Math.ceil((double) totalResults.longValue() / batchSize); // cast before dividing, otherwise the integer division truncates and the trailing partial batch is lost

        int maxResult = batchSize;
        CountDownLatch latch = new CountDownLatch(((Double)totalPages).intValue());
        for(int i =1 ; i <= totalPages; i++) {

            int firstResult = (i - 1) * batchSize;
            if (i == totalPages)
            {
                batchSize = totalResults.intValue() - firstResult;

            }
            exectueTask(queue, firstResult, batchSize, latch, i);

        }
        System.out.println("waiting for latch to finish");
        latch.await();
        System.out.println("latch exited");
    }catch (Exception e) {
        e.printStackTrace();
        if (entityTransaction.isActive()) {
            entityTransaction.rollback();
        }
        entityManager.close();
    }
    finally {
        int i = poolSize;
        do
        {
            queue.take().close();
            i--;
        }
        while (i != 0);
    }
    entityManager.close();

}

This calls the executor method

private void exectueTask(BlockingQueue<EntityManager> queue, int firstResult, int batchSize, CountDownLatch latch, int batchNumber) {
    taskExecutor.execute(() -> {
        try {
            run(queue, firstResult, batchSize, latch, batchNumber);
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        }
    });
}

Here I am executing the queries in batches and inserting the data into the DB:

private void run(BlockingQueue<EntityManager> queue, int firstResult, int batchSize, CountDownLatch latch, int batchNumber) throws InterruptedException, IOException {

    logger.info("batchNumber " + batchNumber + " batchNumber called " + " at " + new Date());
    EntityManager entityManager = queue.take();
    logger.info("batchNumber " + batchNumber + " batchNumber took  " + " following time to get entitymanager " + new Date());
    EntityTransaction entityTransaction = entityManager.getTransaction();

    List<CustomerLedger> customerLedgerList = partnerRepository.getAllCustomerLedger(entityManager, firstResult, batchSize);
    //List<Object[]> customerLedgerList = partnerRepository.getAllCustomerLedgerNative(entityManager, firstResult, batchSize);

    entityTransaction.begin();
    for (CustomerLedger old : customerLedgerList) {
        String customerLedgerJson = objectMapper.writeValueAsString(old);
        CustomerLedgerNew ledgerNew = customerLedgerToCustomerLedgerNew(customerLedgerJson);

        ledgerNew.setFirstName(convertToDatabaseColumn(old.getFirstName(), key));
        ledgerNew.setMiddleName(convertToDatabaseColumn(old.getMiddleName(), key));
        ledgerNew.setLastName(convertToDatabaseColumn(old.getLastName(), key));
        ledgerNew.setAddressLine1(convertToDatabaseColumn(old.getAddressLine1(), key));
        ledgerNew.setAddressLine2(convertToDatabaseColumn(old.getAddressLine2(), key));
        ledgerNew.setAddressLine3(convertToDatabaseColumn(old.getAddressLine3(), key));
        ledgerNew.setAddressLine4(convertToDatabaseColumn(old.getAddressLine4(), key));
        ledgerNew.setHomePhone(convertToDatabaseColumn(old.getHomePhone(), key));
        ledgerNew.setWorkPhone(convertToDatabaseColumn(old.getWorkPhone(), key));
        ledgerNew.setEmail1(convertToDatabaseColumn(old.getEmail1(), key));
        ledgerNew.setMobile(convertToDatabaseColumn(old.getMobile(), key));
        ledgerNew.setMobileSha(sha256Hash(old.getMobile()));
        ledgerNew.setMobileChecksum(getMD5Hash(old.getMobile()));
        ledgerNew.setEmailSha(sha256Hash(old.getEmail1()));
        ledgerNew.setEmailChecksum(getMD5Hash(old.getEmail1()));
        entityManager.persist(ledgerNew);
    }
    logger.info("batchNumber " + batchNumber + " batchNumber started commiting data at   " + new Date());
    entityTransaction.commit();
    logger.info("batchNumber " + batchNumber + " batchNumber finished commiting data at   " + new Date());
    queue.put(entityManager);
    latch.countDown();
    logger.info("batchNumber " + batchNumber + " latch count   " + latch.getCount());
}

What I noticed from the logs: at one point it prints only

**batchNumber 615 batchNumber started commiting data at Wed Dec 11 17:22:54 IST 201**

but it never prints the next log line that marks the end of the commit. I am really unable to find the reason for this.

Thread pool config class

@Configuration
public class ThreadPoolConfiguration {

    private final static org.slf4j.Logger LOGGER = org.slf4j.LoggerFactory.getLogger(ThreadPoolConfiguration.class);
    private final int defaultCorePoolSize = 200;
    private final int defaultMaxPoolSize = 300;
    private final int defaultQueueCapacity = 20000;
    private final int defaultKeepAlive = 10;

    @Bean
    @Qualifier("TaskExecutor")
    public TaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(defaultCorePoolSize);
        executor.setMaxPoolSize(defaultMaxPoolSize);
        executor.setQueueCapacity(defaultQueueCapacity);
        executor.setKeepAliveSeconds(defaultKeepAlive);
        executor.setAllowCoreThreadTimeOut(true);
        executor.setWaitForTasksToCompleteOnShutdown(true);
        executor.setThreadNamePrefix("encryption-DEFAULT-");
        executor.initialize();
        return executor;
    }
}

Please forgive me if I have not framed the question properly.

Muddassir Rahman

1 Answer


There are multiple possible sources for your problem:

* MySQL table locking deadlock
* MySQL running out of connections (check the MySQL logs)
* Out-of-memory situations due to overfull EntityManager buffers
* Deadlock in the EntityManager

Since your problem arises when you call entityTransaction.commit(), I'd presume you have a problem with table locking deadlocks. Take a look at this article to analyse possible MySQL deadlock problems.

Your approach looks to me as if you have been working for quite some time on the performance of this batch update. Working with multiple threads gives the database a hard time with table/record locking while not gaining much performance.

I'd recommend doing big batches of work not with an EntityManager, but with plain JDBC.
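A minimal single-threaded sketch of what I mean: one prepared statement, rows accumulated with addBatch() and flushed in chunks, one transaction per chunk. The table and column names (customer_ledger_new, first_name, email1) are illustrative assumptions, not taken from your schema:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class JdbcBulkCopy {

    // Illustrative target table/columns; adapt to the real schema.
    private static final String INSERT_SQL =
        "INSERT INTO customer_ledger_new (first_name, email1) VALUES (?, ?)";

    // Rows are passed as [firstName, email1] pairs for brevity.
    static void insertBatch(Connection conn, List<String[]> rows, int batchSize)
            throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            int count = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
                if (++count % batchSize == 0) {
                    ps.executeBatch(); // send a full chunk to the server
                }
            }
            ps.executeBatch();         // flush the trailing partial chunk
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }

    // Pure helper: number of chunks for a given row count (ceiling division).
    static int batchCount(int totalRows, int batchSize) {
        return (totalRows + batchSize - 1) / batchSize;
    }
}
```

A single writer like this avoids the lock contention between your 200 worker threads entirely, and the driver can turn each executeBatch() into one round trip.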

If you have to use JPA, better optimize the batch size. Take a look at The best way to do batch processing with JPA and Hibernate for more inspiration.
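With Hibernate as the JPA provider, that mostly means enabling JDBC batching and keeping the persistence context small. A configuration sketch (these are standard Hibernate property names; the value 50 is an assumption you should tune):

```properties
# hibernate.properties / application.properties
hibernate.jdbc.batch_size=50
hibernate.order_inserts=true
hibernate.order_updates=true
```

Then, inside the insert loop, call entityManager.flush() followed by entityManager.clear() every batch_size entities, so inserts actually go out in batches and the persistence context does not grow unbounded.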

s.fuhrm