I need to speed up downloading a large file and I'm using multiple connections for that. I'm using a single goroutine with access to disk and it receives data from multiple goroutines using a channel, as I was advised here.

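    file, _ := os.Create(filename) // error ignored here for brevity
    down.destination = file
    // One goroutine owns the file; downloader goroutines send chunks on down.copyInfo.
    for info := range down.copyInfo {
        // Move to the chunk's offset, then copy exactly info.length bytes there.
        down.destination.Seek(info.start, io.SeekStart)
        io.CopyN(down.destination, info.from, info.length)
    }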

The problem is that repeated seeking on a large file seems to slow the operation down. When info.length is larger, it has to seek fewer times and seems to finish faster. But I need to make info.length smaller. Is there a way to make seeking faster? Or should I just download each part to a separate temp file and concatenate them at the end?

K1DV5
  • Is your connection speed higher than sequential write disk speed? – zerkms Nov 15 '20 at 09:13
  • You seem to be answering your own question. The best way to determine if something makes your program slower is to isolate it and test it yourself (which you have seemingly done). – Hymns For Disco Nov 15 '20 at 09:14
  • This answer seems relevant to your question: https://stackoverflow.com/a/34667877/4466350 –  Nov 15 '20 at 09:19
  • @zerkms No. But the writing is not sequential. One write may be for a chunk at the start of the file, the next may be at the end, or anywhere in between. – K1DV5 Nov 15 '20 at 09:57
  • @mh-cbon Thank you. That is the sort of information I needed. – K1DV5 Nov 15 '20 at 09:58

1 Answer

A seek itself does not do any I/O; it just sets the position in the file for the next read or write. The number of seeks by itself thus likely doesn't matter. This can easily be tested by adding dummy seeks without any following read or write operation.

The problem is likely not the number of seeks but the number of write operations. With many small fragments, more I/O operations are needed to write the data than with a few large fragments, and each of these I/O operations has significant overhead. There is the overhead of the system call itself. There might be additional overhead if the fragment is not aligned to the block boundaries of the underlying storage. And with a rotating disk, there might be overhead for positioning to the actual sector.
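To make the effect of fragment size on the number of write operations concrete, here is a minimal sketch. Everything in it is an assumption for illustration (`countingWriter`, `writeInChunks`, `countWrites`, and the sizes); it does not touch a real file but counts how many `Write` calls reach a sink, first when 1 MiB is written directly in 1 KiB pieces and then when the same pieces go through a `bufio.Writer` with a 256 KiB buffer.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// countingWriter (hypothetical) counts the Write calls that reach w.
type countingWriter struct {
	w     io.Writer
	calls int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	return c.w.Write(p)
}

// writeInChunks writes data to w in pieces of at most chunk bytes.
func writeInChunks(w io.Writer, data []byte, chunk int) error {
	for len(data) > 0 {
		n := chunk
		if n > len(data) {
			n = len(data)
		}
		if _, err := w.Write(data[:n]); err != nil {
			return err
		}
		data = data[n:]
	}
	return nil
}

// countWrites writes total bytes in chunk-sized pieces and returns how many
// Write calls reached the sink. With bufSize > 0 the pieces pass through a
// bufio.Writer of that size first; with bufSize <= 0 they go to the sink directly.
func countWrites(total, chunk, bufSize int) (int, error) {
	sink := &countingWriter{w: io.Discard}
	data := make([]byte, total)
	if bufSize <= 0 {
		err := writeInChunks(sink, data, chunk)
		return sink.calls, err
	}
	bw := bufio.NewWriterSize(sink, bufSize)
	if err := writeInChunks(bw, data, chunk); err != nil {
		return sink.calls, err
	}
	err := bw.Flush() // push out whatever is still buffered
	return sink.calls, err
}

func main() {
	direct, err := countWrites(1<<20, 1<<10, 0)
	if err != nil {
		panic(err)
	}
	buffered, err := countWrites(1<<20, 1<<10, 256<<10)
	if err != nil {
		panic(err)
	}
	fmt.Println("direct writes:  ", direct)
	fmt.Println("buffered writes:", buffered)
}
```

With an `*os.File` as the sink, each of those `Write` calls would typically be a system call, so collecting a fragment's data in memory before writing it at its offset should reduce the per-operation overhead described above. How much this helps in practice depends on the storage and the operating system's caching.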

Steffen Ullrich