I am trying to download a large file (>1GB) from one server to another using HTTP. To do this I am making HTTP range requests in parallel, so that each segment of the file downloads concurrently.
When saving to disk, I take each response stream, open the same file as a file stream, seek to the start of the range I want, and write from there.
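Each request sets its byte range before its response is read, roughly like this (simplified; "Some URL" and the offsets in the ranges list are placeholders, not the real values):

HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("Some URL");
// Adds a "Range: bytes=start-end" header so the server returns only this slice of the file.
webRequest.AddRange(ranges[index].Item1, ranges[index].Item2);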
However, I find that all but one of my response streams time out. It looks like the disk I/O cannot keep up with the network I/O. Yet if I do the same thing but have each thread write to a separate file, it works fine.
For reference, here is my code writing to the same file:
int numberOfStreams = 4;
List<Tuple<int, int>> ranges = new List<Tuple<int, int>>();
string fileName = @"C:\MyCoolFile.txt";
Exception exception = null;
//List populated here
Parallel.For(0, numberOfStreams, (index, state) =>
{
    try
    {
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("Some URL");
        using (Stream responseStream = webRequest.GetResponse().GetResponseStream())
        {
            // Every thread opens the same file; FileShare.Write lets the other writers open it too.
            using (FileStream fileStream = File.Open(fileName, FileMode.OpenOrCreate, FileAccess.Write, FileShare.Write))
            {
                // Position this stream at the start of the range this thread is responsible for.
                fileStream.Seek(ranges[index].Item1, SeekOrigin.Begin);
                byte[] buffer = new byte[64 * 1024];
                int bytesRead;
                while ((bytesRead = responseStream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    if (state.IsStopped)
                    {
                        return;
                    }
                    fileStream.Write(buffer, 0, bytesRead);
                }
            }
        }
    }
    catch (Exception e)
    {
        exception = e;
        state.Stop();
    }
});
And here is the code writing to multiple files:
int numberOfStreams = 4;
List<Tuple<int, int>> ranges = new List<Tuple<int, int>>();
string fileName = @"C:\MyCoolFile.txt";
Exception exception = null;
//List populated here
Parallel.For(0, numberOfStreams, (index, state) =>
{
    try
    {
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("Some URL");
        using (Stream responseStream = webRequest.GetResponse().GetResponseStream())
        {
            // Each thread now writes to its own temp file instead of the shared file.
            using (FileStream fileStream = File.Open(fileName + "." + index + ".tmp", FileMode.OpenOrCreate, FileAccess.Write, FileShare.Write))
            {
                fileStream.Seek(ranges[index].Item1, SeekOrigin.Begin);
                byte[] buffer = new byte[64 * 1024];
                int bytesRead;
                while ((bytesRead = responseStream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    if (state.IsStopped)
                    {
                        return;
                    }
                    fileStream.Write(buffer, 0, bytesRead);
                }
            }
        }
    }
    catch (Exception e)
    {
        exception = e;
        state.Stop();
    }
});
My question is this: are there additional checks or actions that C#/Windows performs when writing to a single file from multiple threads that would make the file I/O slower than writing to multiple files? All disk operations should be bound by the disk speed, right? Can anyone explain this behavior?
Thanks in advance!
UPDATE: Here is the error the source server is throwing:
"Unable to write data to the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond." [System.IO.IOException]: "Unable to write data to the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond." InnerException: "A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond" Message: "Unable to write data to the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond." StackTrace: " at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)\r\n at System.Net.Security._SslStream.StartWriting(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)\r\n at System.Net.Security._SslStream.ProcessWrite(Byte[] buffer, Int32 offset, Int32 count, AsyncProtocolRequest asyncRequest)\r\n at System.Net.Security.SslStream.Write(Byte[] buffer, Int32 offset, Int32 count)\r\n