I'm trying to instantiate tasks in an ExecutorService that need to write to a file in order, so if there are 33 tasks, they need to write in order...

I've tried to use a LinkedBlockingQueue and a ReentrantLock to guarantee the order, but as far as I understand, in fair mode it unlocks to the youngest of the x threads the ExecutorService has created.

private final static Integer cores = Runtime.getRuntime().availableProcessors();
private final ReentrantLock lock = new ReentrantLock(false);
private final ExecutorService taskExecutor;

In the constructor:

taskExecutor = new ThreadPoolExecutor
        (cores, cores, 1, TimeUnit.MINUTES, new LinkedBlockingQueue<Runnable>());

and so I process a quota of the input file per task:

if(s.isConverting()){
   if(fileLineNumber%quote > 0) tasks = (fileLineNumber/quote)+1;
   else tasks = (fileLineNumber/quote);
   for(int i = 0 ; i<tasks || i<1 ; i++){
      taskExecutor.execute(new ConversorProcessor(lock,this,i));
   }
}

The task does:

public void run() {
    getFileQuote();
    resetAccumulators();
    process();
    writeResult();
}

and my problem occurs here:

private void writeResult() {
    // The lock makes the writes mutually exclusive, but it does not decide
    // which task gets to write first.
    lock.lock();
    // try-with-resources closes the writer even when a write fails
    try (BufferedWriter bw = new BufferedWriter(new FileWriter("/tmp/conversion.txt", true))) {
        if (i == 0) {
            bw.write("ano dia tmin tmax tmed umid vento_vel rad prec\n");
        }
        int last = getResult().size() - 1;
        // every line but the last, which is handled below
        for (int index = 0; index < last; index++) {
            bw.write(getResult().get(index) + "\n");
        }
        // no trailing newline when i equals controller.getTasksNumber()
        if (i == controller.getTasksNumber()) {
            bw.write(getResult().get(last));
        } else {
            bw.write(getResult().get(last) + "\n");
        }
    } catch (IOException ex) {
        Logger.getLogger(ConversorProcessor.class.getName()).log(Level.SEVERE, null, ex);
    } finally {
        lock.unlock();
    }
}
RomuloPBenedetti
  • If you need to write things in order, why do you need multiple threads? Why not use a single threaded executor? – vanza Sep 28 '14 at 00:18
  • @vanza: because he still needs concurrency with most of his code except the write to file code. – Hovercraft Full Of Eels Sep 28 '14 at 01:09
  • That's a much longer discussion though. Disk is generally much slower, so unless he's doing a whole lot of computation, multiple threads won't help (Amdahl etc etc). He can still use a separate, single threaded executor just for the I/O part if he really wants to. – vanza Sep 28 '14 at 01:33
  • (Oh, and if his concurrent tasks are actually accessing the disk during the computation, they'll probably hurt more than help due to seeks.) – vanza Sep 28 '14 at 01:41
  • @vanza all data is obtained in getFileQuote; each thread gets a bunch of sequential lines from the file and works with them, so for 33 tasks the file is read 33 times, each time to get a sequential part of it. Anyway, I can't work with the complete file in all scenarios; I'm interested in being able to convert GB-sized files on systems without abundant RAM. – RomuloPBenedetti Sep 28 '14 at 02:14
  • As counterintuitive as it may seem, if your file is on a single disk and your computation is trivial compared to the I/O costs, your code will probably perform better single-threaded. But if you really want to use multiple threads, use Hovercraft's solution, or write to multiple output files and build the final result in a later stage. – vanza Sep 28 '14 at 02:33
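The "multiple output files" idea from the last comment could be sketched roughly as follows. This is only an illustration under assumptions: `PartFileMerger` and `processChunk` are hypothetical names (the latter stands in for getFileQuote() plus process()), and the part-file naming is made up for the example.

```java
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PartFileMerger {

    // Hypothetical stand-in for getFileQuote() + process().
    static String processChunk(int index) {
        return "chunk " + index;
    }

    static void convert(int tasks, Path output) throws Exception {
        Path dir = Files.createTempDirectory("parts");
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Future<Path>> parts = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int index = i;
                // Each task spills its result to its own file, in any order.
                Callable<Path> job = () -> {
                    Path part = dir.resolve("part-" + index + ".txt");
                    Files.write(part, Collections.singletonList(processChunk(index)));
                    return part;
                };
                parts.add(pool.submit(job));
            }
            // Later stage: concatenate the parts in submission order.
            try (OutputStream out = Files.newOutputStream(output)) {
                for (Future<Path> part : parts) {
                    Path file = part.get(); // waits until this part exists
                    Files.copy(file, out);
                    Files.delete(file);
                }
            }
            Files.deleteIfExists(dir);
        } finally {
            pool.shutdown();
        }
    }
}
```

Only file paths are held in memory here, not the results themselves, at the price of writing every byte to disk twice.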

1 Answer

It appears to me that everything needs to be done concurrently except the writing of the output to file, and this must be done in the object creation order.

I would take the code that writes to the file, the writeResult() method, out of your threading code, and instead create Futures that return the Strings created by the process() method, and load the Futures into an ArrayList<Future<String>>. You could then iterate through the ArrayList in a for loop, calling get() on each Future and writing the result to your text file with your BufferedWriter or PrintWriter.
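A minimal sketch of that idea, assuming each task can hand back its output as a single String. `OrderedFutureWriter` and `processChunk` are hypothetical names; `processChunk` stands in for the asker's getFileQuote() and process() steps.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OrderedFutureWriter {

    // Hypothetical stand-in for getFileQuote() + process(): text for one chunk.
    static String processChunk(int index) {
        return "chunk " + index;
    }

    static void convert(int tasks, Path output)
            throws IOException, InterruptedException, ExecutionException {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            // Submit in task order; the list remembers that order.
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final int index = i;
                Callable<String> job = () -> processChunk(index);
                futures.add(pool.submit(job));
            }
            // Only the calling thread writes, so no lock is needed.
            try (BufferedWriter bw = Files.newBufferedWriter(output)) {
                for (Future<String> future : futures) {
                    bw.write(future.get()); // blocks until this chunk is ready
                    bw.newLine();
                }
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path out = Files.createTempFile("conversion", ".txt");
        convert(33, out);
        System.out.println(Files.readAllLines(out).size() + " lines written to " + out);
    }
}
```

The processing still runs on all pool threads; only the writing is serialized. Note that a finished result stays in memory until every earlier chunk has been written.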

Hovercraft Full Of Eels
  • but this would increase the memory footprint; writing directly from the thread does not keep all results in memory – RomuloPBenedetti Sep 28 '14 at 02:08
  • If you don't want to maintain all results in memory, you're gonna have to spill them to disk (in separate files) and then assemble the final file from the intermediate ones in the order you need. – vanza Sep 28 '14 at 02:18
  • @user2884025: No, it would not increase the memory footprint at all. Even if it were written to file in thread, since you have to write all the data in order, you'd still have to hold the data in memory while awaiting the completion of writing from the threads that were started earlier. – Hovercraft Full Of Eels Sep 28 '14 at 12:05
  • @HovercraftFullOfEels Correct me if I'm wrong, but when writing in threads I will only hold the results of the running threads in memory; otherwise I will need the memory for all tasks... In my scenario I have only 4 threads on my machine and there can be more than 30 tasks. – RomuloPBenedetti Sep 28 '14 at 15:43
  • I used @HovercraftFullOfEels's idea of using Futures, but so that I could still write inside the threads: each Future is used to block the next thread so it waits to write at the correct time. I will analyze alternatives further if needed, but for the moment the answer solved my problem. Thanks! – RomuloPBenedetti Sep 30 '14 at 00:47
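The asker's final code is not shown, so the following is only one possible reading of that last comment: each task receives the Future of the task submitted before it and calls get() on it before appending its own result. `ChainedWriter` and `processChunk` are hypothetical names.

```java
import java.io.BufferedWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ChainedWriter {

    // Hypothetical stand-in for getFileQuote() + process().
    static String processChunk(int index) {
        return "chunk " + index;
    }

    static void convert(int tasks, Path output) throws Exception {
        Files.deleteIfExists(output); // start empty; every task appends
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            Future<Void> previous = null;
            for (int i = 0; i < tasks; i++) {
                final int index = i;
                final Future<Void> predecessor = previous;
                Callable<Void> job = () -> {
                    String result = processChunk(index); // runs concurrently
                    if (predecessor != null) {
                        predecessor.get(); // wait until the earlier task has written
                    }
                    try (BufferedWriter bw = Files.newBufferedWriter(output,
                            StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
                        bw.write(result);
                        bw.newLine();
                    }
                    return null;
                };
                previous = pool.submit(job);
            }
            if (previous != null) {
                previous.get(); // the last write finishing implies all earlier ones did
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

This relies on the pool starting tasks in submission order (a fixed pool backed by a FIFO queue does), so a waiting task's predecessor is always already running or finished. At most one result per pool thread is held in memory at a time, which matches the asker's concern about 4 threads and 30 or more tasks.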