0

i have the following code. it fetch data from a table in database.then for each item in that data list it checks a condition and base on that updates a column in that table. i know batch updates will help in such situations. and also async programming will be useful. so i created 10 threads by a ThreadPoolExecutor and did batch updates using them:


    @Inject
    private EntityManager em;


    @Resource
    UserTransaction utx;

   
    public Response determineOwnership() throws SystemException, NotSupportedException, HeuristicRollbackException, HeuristicMixedException, RollbackException {

        List<Object[]> list = em.createNativeQuery("SELECT u.usr_mobile,u.national_code from usr_mobile_verification u").getResultList();
        utx.begin();
        ThreadPoolExecutor executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(10);
        List<Future<Object>> futures = new ArrayList<>(0);
        AtomicInteger i = new AtomicInteger();
        list.forEach(t -> {
            i.getAndIncrement();
            UpdateCallable callable = null;
            try {
                callable = new UpdateCallable(String.valueOf(t[0]), String.valueOf(t[1]), infoService);
            } catch (NamingException e) {
                e.printStackTrace();
            }
            if ((i.get() % 30) == 0) {
                em.flush();
                em.clear();
            }
            Future<Object> result = executor.submit(callable);
            futures.add(result);
        });
        executor.shutdown();
        utx.commit();
        for (Future f : futures) {
            try {
                f.get();
            } catch (Exception ex) {
                ex.printStackTrace();
                return ResponseHelper.serverError();
            }
        }
        return ResponseHelper.ok("operation finished");
}

the UpdateCallable class:

public class UpdateCallable implements Callable<Object> {

    private String mobile;
    private String nid;
    private EntityManager em =  Persistence.createEntityManagerFactory("primary").createEntityManager();
    private MobileIdentityInfoService infoService;
    private UserTransaction utx = ((UserTransaction) new InitialContext().lookup("java:comp/UserTransaction"));


    public UpdateCallable(String mobile, String nid, MobileIdentityInfoService infoService) throws NamingException {
        this.mobile = mobile;
        this.nid = nid;
        this.infoService = infoService;
    }

    public UpdateCallable() throws NamingException {
    }


    @Override
    public Exception call() throws Exception {
        try {
            try {
                utx.begin();
                em.joinTransaction();
                if (isContactInfoValid(mobile, nid))
                    em.createNativeQuery("update usr_mobile_verification m set m.is_owner='1' where m.national_code=?").setParameter(1, nid).executeUpdate();
                else
                    em.createNativeQuery("update usr_mobile_verification m set m.is_owner='0' where m.national_code=?").setParameter(1, nid).executeUpdate();
            } catch (RegistrationException e) {
                e.printStackTrace();
            }
        } catch (Exception ex) {
            return ex;
        }
        return null;
    }

with this approach i reduced time from 1 hour to 2 minutes!!

but the problem is at the end most of the records in the table don't get updated. i guess same iterations are getting processed by more than one thread and overlapped. what should i do? how can i ensure that iterations are not repeated and thus all records are updated

reza
  • 27
  • 7
  • 1
    Normally a batch means "multiple" updates at once - i.e. 1 query execution with multiple rows to update - key(s) for each row, value(s) for each row. The code doesn't seem to do that?? – Mr R Mar 10 '21 at 21:10
  • the if statement simply tells if the loops reach 30k (k=1,2,...) iteration then execute the query on database. 30 is the batch size – reza Mar 11 '21 at 09:40
  • 1
    This code `em.createNativeQuery("update usr_mobile_verification m set m.is_owner='1' where m.national_code=?").setParameter(1, nid).executeUpdate();` doesn't look like it's doing 30 of anything .. – Mr R Mar 11 '21 at 20:10
  • What are you really trying to do with "f ((i.get() % 30)"? I don't think flush/clear serves any purpose - but you should turn on logging to see what is being executed, when and what effect changes like this really have. Once you execute the select, you are getting all data from the database and can just close that entityManager as your threads are getting new ones entirely. Secondly, are you sure there are ANY changes saved into the database? call utx.begin, but no commit, so I don't see how any of the changes made by update statements get committed. – Chris Mar 12 '21 at 14:58
  • A different approach might get you even more time shaved off - as each and every callable has to connect to the db, start its own transaction, get JPA resources and commit, you could execute the isContactInfoValid in threads but keep the results to save to the database when done. Have the futures return the results, which you can correlate into 2 update statements using "where m.national_code in (?, ?)" logic. You then get 2 statements and one transaction instead of N. Most DBs will allow sending a large number of parameters to an IN clause, so there would be N/1000 statements – Chris Mar 12 '21 at 15:06
  • @Chris as u can see i begin transaction in the second line of ```determineOwnership``` method and commit it after ```executor.shutdown()``` in ```UpdateCallable``` class i wrote another ```utx.begin()``` and also ```em.joinTransaction()``` cos it kept throwing no active transaction is available for this thread. and didn't commit it there cos there was no need to do so. after all i didn't understand much from your comments. would be better if u show me by code. thanks in advance – reza Mar 12 '21 at 16:51
  • You are calling utx.begin() without an associated utx.commit() - that is why you are missing changes. Nothing within that utx context will get saved in the database without a commit. As for the rest - it is just a suggestion. You'll need to learn about batch updates and database resources to really understand my proposal and how it might help. – Chris Mar 12 '21 at 22:32

0 Answers0