I am making a download manager and I want multiple threads, each downloading a different segment of a file, to write to that file at different positions at the same time. To be clear, I don't want the file to be locked, because that would defeat the purpose of having different threads write at the same time. I am using the Apache HttpClient library and FileChannel transferFrom(). The current code only downloads the first segment and simply ignores the other segments.

Code explanation: The startDownload method creates a new file and checks whether the link supports partial content. If it does, it starts a thread for each segment; otherwise a single thread downloads the whole file. getFileName is a function that extracts the file name from the URI. The Download method contains the code that actually downloads the file using HttpClient and then writes it using transferFrom.

public void startDownload() {
    Thread thread = new Thread(() -> {
        try {
            String downloadDirectory = "/home/muhammad/";
            URI uri = new URI("http://94.23.204.158/JDownloader.zip");
            int segments = 2;
            // Create a HttpClient for checking file for segmentation.
            CloseableHttpClient Checkingclient = HttpClients.createDefault();
            // get request for checking size of file.
            HttpGet checkingGet = new HttpGet(uri);
            CloseableHttpResponse checkingResponse = Checkingclient.execute(checkingGet);
            long sizeofFile = checkingResponse.getEntity().getContentLength();
            // Create a new file in downloadDirectory with name extracted from uri.
            File file = new File(downloadDirectory + getFileName(uri));
            if (!file.exists()) {
                file.createNewFile();
            }
            // set range header for checking server support for partial content.
            checkingGet.setHeader("Range", "bytes=" + 0 + "-" + 1);
            checkingResponse = Checkingclient.execute(checkingGet);
            // Check if response code is 206 (partial content response code).
            if (checkingResponse.getStatusLine().getStatusCode() == 206) {
                //find size of each segment.
                final long sizeOfEachSegment = sizeofFile / segments;
                //Download each segment independently.
                for (int i = 0; i < segments; i++) {
                    Download(i * sizeOfEachSegment, (i + 1) * sizeOfEachSegment, sizeOfEachSegment, file, uri);
                }
                // Thread used for last few Bytes and EOF.
                Download(sizeOfEachSegment * segments, sizeofFile, Long.MAX_VALUE, file, uri);
            } else {
                System.err.println("server dont support partial content");
                System.out.println(checkingResponse.getStatusLine().getStatusCode());
                // Download complete file using single thread.
                Download(0, 0, Long.MAX_VALUE, file, uri);
            }
        } catch (IOException | URISyntaxException ex) {
            Logger.getLogger(Downloader.class.getName()).log(Level.SEVERE, null, ex);
        }
    });
    thread.start();
}
public void Download(long start, long end, long sizeOfEachSegment, File file, URI uri) {
    Thread thread = new Thread(() -> {
        try {
            FileChannel fileChannel = new FileOutputStream(file).getChannel();
            CloseableHttpClient client = HttpClients.createDefault();
            HttpGet get = new HttpGet(uri);
            // Range header for defining which segment of file we want to receive.
            if (end != 0) {
                String byteRange = start + "-" + end;
                get.setHeader("Range", "bytes=" + byteRange);
            }
            CloseableHttpResponse response = client.execute(get);
            ReadableByteChannel inputChannel = Channels.newChannel(response.getEntity().getContent());
            fileChannel.transferFrom(inputChannel, start, sizeOfEachSegment);
            response.close();
            client.close();
            fileChannel.close();
        } catch (IOException | IllegalStateException exception) {
            Logger.getLogger(Downloader.class.getName()).log(Level.SEVERE, null, exception);
        }
    });
    thread.start();
}
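
For reference, the segment arithmetic described above could be sketched as follows. This is a minimal sketch, not the code from the question: `SegmentRanges` and `split` are hypothetical names, and it assumes the file size is known and at least as large as the number of segments. It relies on the fact that both ends of an HTTP `Range: bytes=start-end` header are inclusive.

```java
import java.util.ArrayList;
import java.util.List;

public class SegmentRanges {
    /**
     * Returns inclusive {start, end} byte ranges that cover the whole file.
     * Assumes fileSize >= segments (hypothetical helper for illustration).
     */
    static List<long[]> split(long fileSize, int segments) {
        List<long[]> ranges = new ArrayList<>();
        long segmentSize = fileSize / segments;
        for (int i = 0; i < segments; i++) {
            long start = i * segmentSize;
            // The last segment absorbs the remainder; Range ends are inclusive, hence -1.
            long end = (i == segments - 1) ? fileSize - 1 : start + segmentSize - 1;
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] range : split(10, 3)) {
            System.out.println("Range: bytes=" + range[0] + "-" + range[1]);
        }
    }
}
```

With this split, no byte is requested twice and no separate request is needed for the trailing bytes.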

A fix to the existing code that lets multiple threads write to the same file at the same time without waiting would be nice, but I am also interested in other, more efficient techniques that accomplish the same task. If writing to a file without waiting is impossible, any other efficient solution is more than welcome. Thanks in advance :)

  • If you download from *one* source, I can't see any advantage of doing the download in parallel. Networking will probably be much slower than local IO. –  Jul 23 '14 at 19:25
  • @Tichodroma: Thanks for commenting, you are absolutely right. Can you point out any method by which data read by different threads gets written right away to a single file, without creating multiple files or waiting? – Mohsin Niazi Jul 23 '14 at 19:34

2 Answers

Instead of multiple threads writing to the same file, you could have one thread writing to the file, with multiple threads producing the data but storing it in some kind of buffer for the file writer thread.

NESPowerGlove
  • Thanks NESPowerGlove, can you guide me on how to do that in this situation? – Mohsin Niazi Jul 23 '14 at 19:35
  • Create a BlockingQueue and pass it to the file-writing thread as well as to each of the data-producing threads. Look up the producer-consumer problem online; that's what you have. You can make this pretty easy using an ExecutorService. – NESPowerGlove Jul 23 '14 at 19:40
  • Thanks, I am searching for it. Just to satisfy my curiosity, is there any way to write to a single file from multiple threads directly? I cannot understand why Java does not let me open multiple output streams to the same file and write to different, non-overlapping places in parallel. – Mohsin Niazi Jul 23 '14 at 19:47
  • I believe that is because you are restricted I/O-wise. Your hard disk drive can only write one thing at a time. Or are you talking about an exception complaining that the file is already being used for writing? – NESPowerGlove Jul 23 '14 at 19:51
  • This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post. – nietonfir Jul 23 '14 at 20:12
  • @nietonfir I think it does. In his question he asks for other solutions that can efficiently handle what he's trying to do. In addition, the multiple producer threads, while not writing to the file themselves, are in the end part of the process of writing to the file together (without introducing locking at the file-writing level, as requested). – NESPowerGlove Jul 23 '14 at 20:16
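
On the question of writing directly from several threads, here is a minimal sketch, assuming Java 8+ and NIO. The `FileChannel` documentation describes the channel as safe for use by multiple concurrent threads, and `write(ByteBuffer, long)` writes at an explicit position without moving the channel's own position. Two details are worth knowing when reading the question's code: `new FileOutputStream(file)` truncates an existing file when it is opened, and `transferFrom` transfers no bytes when the given position is greater than the file's current size. The names `PositionalWriteDemo`, `writeAt`, and `demo`, as well as the sample data, are hypothetical.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalWriteDemo {
    /** Writes all of data at the given absolute offset; the channel position is untouched. */
    static void writeAt(FileChannel channel, long offset, byte[] data) throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap(data);
        while (buffer.hasRemaining()) {
            // write() may write fewer bytes than requested, so loop until the buffer is drained.
            offset += channel.write(buffer, offset);
        }
    }

    /** Two threads write non-overlapping parts through one shared channel. */
    static String demo() throws IOException, InterruptedException {
        Path path = Files.createTempFile("segments", ".bin");
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.WRITE)) {
            Thread first = new Thread(() -> run(channel, 0, "HELLO"));
            Thread second = new Thread(() -> run(channel, 5, "WORLD"));
            second.start(); // started first on purpose: the order does not matter
            first.start();
            first.join();
            second.join();
        }
        String content = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        Files.delete(path);
        return content;
    }

    private static void run(FileChannel channel, long offset, String text) {
        try {
            writeAt(channel, offset, text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Whether this is faster than a single writer thread depends on the operating system and the storage device, so it is worth measuring before choosing it.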

Writing to the same file from different threads is not going to help you at all - it would probably even dramatically harm throughput.

You should use one thread to write to the file and feed it from a queue.

Something like:

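import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// One chunk of downloaded data and the file position where it belongs.
class WriteBlock {
    long offset;  // absolute byte offset in the target file
    byte[] data;  // bytes to write at that offset
}

BlockingQueue<WriteBlock> writeQueue = new LinkedBlockingQueue<>();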

Now each downloading thread should read a block from the download, create a WriteBlock and post it into the queue.

Meanwhile the writing thread sucks WriteBlocks out of the queue and writes them as fast as it can.

There may be optimizations to resequence the blocks while in the queue (perhaps with a PriorityBlockingQueue) but do it the simple way first.
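
The queue-based approach described above could look roughly like this. It is a sketch under assumptions, not a definitive implementation: the download threads are replaced by stand-in threads that post fixed strings, the `POISON` end marker and the names `QueueWriterDemo`, `drain`, and `demo` are hypothetical, and the writer uses the positional `FileChannel.write(ByteBuffer, long)` rather than `transferFrom`.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueWriterDemo {
    static final class WriteBlock {
        final long offset;
        final byte[] data;
        WriteBlock(long offset, byte[] data) { this.offset = offset; this.data = data; }
    }

    // Hypothetical end-of-stream marker that tells the writer to stop.
    static final WriteBlock POISON = new WriteBlock(-1, new byte[0]);

    /** Writer loop: takes blocks until the marker arrives and writes each at its offset. */
    static void drain(BlockingQueue<WriteBlock> queue, FileChannel channel)
            throws IOException, InterruptedException {
        for (WriteBlock block = queue.take(); block != POISON; block = queue.take()) {
            ByteBuffer buffer = ByteBuffer.wrap(block.data);
            long position = block.offset;
            while (buffer.hasRemaining()) {
                position += channel.write(buffer, position);
            }
        }
    }

    static String demo() throws Exception {
        Path path = Files.createTempFile("queued", ".bin");
        BlockingQueue<WriteBlock> queue = new LinkedBlockingQueue<>();
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.WRITE)) {
            Thread writer = new Thread(() -> {
                try { drain(queue, channel); } catch (Exception e) { throw new RuntimeException(e); }
            });
            writer.start();
            // Stand-ins for download threads: each posts the bytes it "downloaded".
            Thread a = new Thread(() -> queue.add(new WriteBlock(0, "abc".getBytes(StandardCharsets.UTF_8))));
            Thread b = new Thread(() -> queue.add(new WriteBlock(3, "def".getBytes(StandardCharsets.UTF_8))));
            b.start();
            a.start();
            a.join();
            b.join();
            queue.add(POISON); // all producers are done, so let the writer finish
            writer.join();
        }
        String content = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        Files.delete(path);
        return content;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

In a real download manager each producer would post many small blocks as data arrives, and a bounded queue (for example an `ArrayBlockingQueue`) would keep memory use in check when the network outpaces the disk.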

OldCurmudgeon
  • Now I have the data in a queue. How can I write it to the file using FileChannel's transferFrom function? transferFrom requires a ReadableByteChannel. Any ideas on how to get a ReadableByteChannel or an output stream for a queue? – Mohsin Niazi Jul 24 '14 at 11:02
  • @user3804236 - You can wrap the `byte[]` in a `ByteArrayInputStream` and then use `Channels.newChannel(InputStream)`. – OldCurmudgeon Jul 24 '14 at 11:13
  • @oldcurmudgeon - That will give me a channel with data in it, but how can I get the offset for writing the data to the file? I am really confused here. If only you could share a code snippet for making FileChannel work. Thanks :) – Mohsin Niazi Jul 24 '14 at 13:25
  • @user3804236 - The offset to write from should be in the `offset` field of the `WriteBlock`. The downloading thread should fill that in for the write thread. – OldCurmudgeon Jul 24 '14 at 13:35
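
The suggestion in these comments can be sketched as follows: wrap the block's `byte[]` in a `ByteArrayInputStream`, turn it into a channel with `Channels.newChannel(InputStream)`, and pass the block's offset as the position argument of `transferFrom`. The names `TransferBlockDemo`, `transferBlock`, and `demo` are hypothetical. The sketch pre-sizes the file with `RandomAccessFile.setLength`, an assumption added here because `transferFrom` transfers no bytes when the position is greater than the file's current size.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class TransferBlockDemo {
    /** Copies data into the file at offset via transferFrom. */
    static void transferBlock(FileChannel channel, long offset, byte[] data) throws IOException {
        try (ReadableByteChannel source = Channels.newChannel(new ByteArrayInputStream(data))) {
            long done = 0;
            while (done < data.length) {
                long n = channel.transferFrom(source, offset + done, data.length - done);
                if (n <= 0) {
                    // Happens, for example, when the position lies beyond the end of the file.
                    throw new IOException("transferFrom made no progress at " + (offset + done));
                }
                done += n;
            }
        }
    }

    static String demo() throws IOException {
        Path path = Files.createTempFile("transfer", ".bin");
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
            file.setLength(6); // pre-size so every block offset lies within the file
            FileChannel channel = file.getChannel();
            // Blocks arrive out of order on purpose.
            transferBlock(channel, 3, "def".getBytes(StandardCharsets.UTF_8));
            transferBlock(channel, 0, "abc".getBytes(StandardCharsets.UTF_8));
        }
        String content = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        Files.delete(path);
        return content;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

If the data is already in a `byte[]`, calling `channel.write(ByteBuffer.wrap(data), offset)` in a loop is a simpler alternative that also grows the file as needed.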