10

I need to programmatically download a large file before processing it. What's the best way to do that? Because the file is large, I want to specify a maximum time to wait so that I can forcefully exit if the download takes too long.

I know of WebClient.DownloadFile(), but there does not seem to be a way to specify a maximum time to wait before forcefully exiting.

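// WebClient implements IDisposable, so release it deterministically.
using (WebClient client = new WebClient())
{
    Uri uri = new Uri(inputFileUrl);
    // Blocks until the whole file is written; DownloadFile takes no timeout argument.
    client.DownloadFile(uri, outputFile);
}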

Another way is to launch a command-line utility (wget) via ProcessStartInfo and use Process.WaitForExit(int milliseconds) to enforce a timeout, killing the process if it has not exited by then.

ProcessStartInfo startInfo = new ProcessStartInfo();
//set startInfo object

try
{
    using (Process exeProcess = Process.Start(startInfo))
    {
        //wait for time specified
        exeProcess.WaitForExit(1000 * 60 * 60); // 1000 ms * 60 * 60 = wait up to 1 hour

        //check if process has exited
        if (!exeProcess.HasExited)
        {
            //kill process and throw ex
            exeProcess.Kill();
            throw new ApplicationException("Downloading timed out");
        }
    }
}
catch (Exception ex)
{
    throw;
}

Is there a better way? Please help. Thanks.

hIpPy

4 Answers

25

Use a WebRequest and get the response stream. Then read blocks of bytes from the response stream and write each block to the destination file. Because you regain control between chunks, you can check a clock and stop if the download has taken too long:

DateTime startTime = DateTime.UtcNow;
WebRequest request = WebRequest.Create("http://www.example.com/largefile");
// Dispose the response as well as the streams so the connection is released.
using (WebResponse response = request.GetResponse())
using (Stream responseStream = response.GetResponseStream())
// File.Create truncates an existing file; File.OpenWrite would leave stale trailing bytes.
using (Stream fileStream = File.Create(@"c:\temp\largefile")) {
    byte[] buffer = new byte[4096];
    int bytesRead = responseStream.Read(buffer, 0, buffer.Length);
    while (bytesRead > 0) {
        fileStream.Write(buffer, 0, bytesRead);
        // Check the clock between chunks and give up after five minutes.
        if ((DateTime.UtcNow - startTime).TotalMinutes > 5) {
            throw new ApplicationException("Download timed out");
        }
        bytesRead = responseStream.Read(buffer, 0, buffer.Length);
    }
}

Qubei
Remus Rusanu
  • @orip, how is it complicated? – juan Feb 16 '10 at 00:07
  • @Juan, For one it is synchronous. The asynchronous version of this example would look very different. But also it throws out the very user-friendly WebClient facade that hides the stream management stuff that is largely irrelevant 90% of the time. – Josh Feb 16 '10 at 00:15
  • 1
    orip, your code is much simpler. One advantage of using Remus' code is that I can see how much of the file has been downloaded. – hIpPy Feb 16 '10 at 19:59
  • @hlpPy: if you prefer WebClient.DownloadFileAsync/CancelAsync, you can use the WebClient.DownloadProgressChanged event to track progress. – Remus Rusanu Feb 16 '10 at 20:20
8

How about using DownloadFileAsync in the WebClient class? The nice thing about this route is that you can cancel the operation by calling CancelAsync if it takes too long. Basically, call DownloadFileAsync, and if a specified amount of time elapses before it completes, call CancelAsync.
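
As a rough illustration of this approach, the sketch below starts DownloadFileAsync, waits on an event for a limited time, and calls CancelAsync if the wait expires. The class name TimedDownload and the method DownloadWithTimeout are hypothetical names invented for this example. It assumes a caller without a UI synchronization context (for example a console app or worker thread), because blocking a UI thread this way could prevent the completion event from being delivered. It does not try to clean up a partially written file.

```csharp
using System;
using System.ComponentModel;
using System.Net;
using System.Threading;

static class TimedDownload
{
    // Hypothetical helper: downloads uri to path, cancelling after the given timeout.
    public static void DownloadWithTimeout(Uri uri, string path, TimeSpan timeout)
    {
        using (var client = new WebClient())
        using (var done = new ManualResetEvent(false))
        {
            AsyncCompletedEventArgs result = null;

            // Raised on success, failure, and cancellation alike.
            client.DownloadFileCompleted += (sender, e) =>
            {
                result = e;
                done.Set();
            };

            client.DownloadFileAsync(uri, path);

            if (!done.WaitOne(timeout))
            {
                // Took too long: request cancellation, then wait for the
                // completion event so the event handle is not disposed early.
                client.CancelAsync();
                done.WaitOne();
                throw new TimeoutException("Download timed out");
            }

            if (result.Error != null)
            {
                throw new InvalidOperationException("Download failed", result.Error);
            }
        }
    }
}
```

A call might look like TimedDownload.DownloadWithTimeout(new Uri(inputFileUrl), outputFile, TimeSpan.FromMinutes(5)), reusing the variable names from the question.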

BFree
  • If a part of the file has been downloaded does CancelAsync retain that part or delete it? – aks Jun 19 '21 at 12:13
3

Asked here: C#: Downloading a URL with timeout

Simplest solution:

public string GetRequest(Uri uri, int timeoutMilliseconds)
{
    var request = System.Net.WebRequest.Create(uri);
    request.Timeout = timeoutMilliseconds;
    using (var response = request.GetResponse())
    using (var stream = response.GetResponseStream())
    using (var reader = new System.IO.StreamReader(stream))
    {
        return reader.ReadToEnd();
    }
}

A better (more flexible) solution is this answer to the same question, which takes the form of a WebClientWithTimeout helper class.
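
The linked answer is not reproduced here, so the following is only a guess at what such a helper could look like: a WebClient subclass that overrides GetWebRequest and sets a timeout on each request it creates. The property name TimeoutMilliseconds is an assumption made for this sketch. Note that WebRequest.Timeout applies to synchronous calls such as DownloadFile, and for HTTP it only covers the wait for the response; ReadWriteTimeout limits each individual read or write on the stream, not the download as a whole.

```csharp
using System;
using System.Net;

// Assumed shape of a WebClientWithTimeout helper; the linked answer may differ.
public class WebClientWithTimeout : WebClient
{
    // Hypothetical property; defaults to 100 seconds.
    public int TimeoutMilliseconds { get; set; }

    public WebClientWithTimeout()
    {
        TimeoutMilliseconds = 100000;
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request != null)
        {
            // Limits how long synchronous calls wait for the response.
            request.Timeout = TimeoutMilliseconds;

            // For HTTP, also limit each individual stream read/write.
            var http = request as HttpWebRequest;
            if (http != null)
            {
                http.ReadWriteTimeout = TimeoutMilliseconds;
            }
        }
        return request;
    }
}
```

It would then be used like a normal WebClient, for example new WebClientWithTimeout { TimeoutMilliseconds = 60000 }.DownloadFile(uri, outputFile).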

Community
orip
  • 4
    WebRequest.Timeout only measures the time until the HTTP response headers are received, not the total time until the response body is downloaded. I.e., it only affects how long it takes for GetResponse to return. – Remus Rusanu Feb 16 '10 at 00:08
2

You can use DownloadFileAsync as @BFree said and then handle WebClient's DownloadProgressChanged and DownloadFileCompleted events, which are raised by the following protected methods:

protected virtual void OnDownloadProgressChanged(DownloadProgressChangedEventArgs e);
protected virtual void OnDownloadFileCompleted(AsyncCompletedEventArgs e);

In the DownloadProgressChanged handler you can then read the progress percentage:

e.ProgressPercentage
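
To make the event wiring concrete, here is a minimal sketch that subscribes to both events and prints progress to the console. The class name ProgressDemo and the method Download are hypothetical, and the sketch assumes a console-style caller that can safely block until the download finishes.

```csharp
using System;
using System.Net;
using System.Threading;

static class ProgressDemo
{
    // Hypothetical helper: downloads uri to path and reports progress on the console.
    public static void Download(Uri uri, string path)
    {
        using (var client = new WebClient())
        using (var done = new ManualResetEvent(false))
        {
            // Raised repeatedly while data arrives.
            client.DownloadProgressChanged += (sender, e) =>
                Console.WriteLine("{0}% ({1} of {2} bytes)",
                    e.ProgressPercentage, e.BytesReceived, e.TotalBytesToReceive);

            // Raised once, whether the download succeeded, failed, or was cancelled.
            client.DownloadFileCompleted += (sender, e) =>
            {
                if (e.Cancelled) Console.WriteLine("Cancelled");
                else if (e.Error != null) Console.WriteLine("Failed: " + e.Error.Message);
                else Console.WriteLine("Done");
                done.Set();
            };

            client.DownloadFileAsync(uri, path);
            done.WaitOne(); // block until the completion event has fired
        }
    }
}
```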

Hope this helps