What's the simplest way in Java to download a file over HTTPS while preserving its timestamp and using the Content-Disposition header for the filename? Is there a Java library that works at a higher level than apache-httpclient?

Currently I've got:

    HttpClient httpClient = new DefaultHttpClient();
    HttpResponse httpResponse = httpClient.execute(new HttpGet(parser.sourceUrl));
    Header cd = httpResponse.getLastHeader("Content-Disposition");
    String filename = cd.getValue().split(";")[1].split("=")[1]; // TODO(jayen): unhack
    HttpEntity httpEntity = httpResponse.getEntity();
    System.out.println("Saving " + filename);
    httpEntity.writeTo(new FileOutputStream(folder.getCanonicalPath() + File.separator + filename));
    if (httpResponse.containsHeader("Last-Modified")) {
        System.err.println("Please implement timestamping");
    } else {
        System.out.println("No timestamp available!");
    }
Jayen

2 Answers


Neither the HTTP protocol nor the multipart content-type specs provide any way of encoding arbitrary source file metadata. The HTTP spec defines a Last-Modified header that the server can set, but the server is not required to send it. More importantly, typical browsers do not preserve this timestamp when they save a file. (Some command-line tools do, but that's a different matter.) The Content-Disposition header is not part of the HTTP 1.1 spec, but a lot of servers support it anyway.

Options:

  • If you are using a Java library to do the fetching, then you should be able to get the "modified" timestamp and content disposition from the Response headers. Refer to the relevant client library's tutorial information.

  • If you can use wget or curl or equivalent, they should be able to preserve the timestamp.

  • You may be able to install a (trusted) browser plugin that preserves timestamps, e.g. this one for Firefox: https://addons.mozilla.org/en-us/firefox/addon/preserve-download-modification/. However, I think it is unlikely that you can do this in untrusted JavaScript, for security reasons.

  • You could change the server to package the file in an archive format that can represent the metadata you want to include, and download the file as an archive. Then use the relevant archive extractor to unpack the file. This also lets you transfer other metadata, such as the original owner, access control details and so on.

Alternatively, you could use the Linux / Unix "scp" or "rsync" commands to do the file transfer.
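For the first option, here is a minimal sketch of the header handling using only the JDK. The class name `DownloadMetadata` and its method names are hypothetical, and it assumes the server sends a plain `filename=` parameter (the extended `filename*=` form is ignored) and an RFC 1123 style `Last-Modified` date.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative and not part of any library.
final class DownloadMetadata {

    // Matches filename="quoted value" or filename=token (assumption: no filename*= support).
    private static final Pattern FILENAME = Pattern.compile(
            "filename\\s*=\\s*(?:\"([^\"]*)\"|([^;\\s]+))", Pattern.CASE_INSENSITIVE);

    // Extracts the filename parameter from a Content-Disposition header value, if present.
    static Optional<String> filenameFrom(String contentDisposition) {
        if (contentDisposition == null) {
            return Optional.empty();
        }
        Matcher m = FILENAME.matcher(contentDisposition);
        if (!m.find()) {
            return Optional.empty();
        }
        String name = m.group(1) != null ? m.group(1) : m.group(2);
        // Drop any directory part so a hostile header cannot escape the target folder.
        int slash = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        name = name.substring(slash + 1);
        return name.isEmpty() ? Optional.empty() : Optional.of(name);
    }

    // Parses a Last-Modified value such as "Wed, 21 Oct 2015 07:28:00 GMT".
    // Throws DateTimeParseException if the value is not in that format.
    static FileTime lastModifiedFrom(String lastModified) {
        return FileTime.from(
                ZonedDateTime.parse(lastModified, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant());
    }

    // Sets the file's modification time from the Last-Modified header value.
    static void applyTimestamp(Path file, String lastModified) throws IOException {
        Files.setLastModifiedTime(file, lastModifiedFrom(lastModified));
    }
}
```

With the code in the question, this would mean passing `cd.getValue()` to `filenameFrom`, and calling `applyTimestamp` with the value of the `Last-Modified` header once `httpEntity.writeTo(...)` has finished and the stream is closed.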

Stephen C
  • I'm currently doing your first option. I'm writing a java program and would prefer not to launch shell commands. – Jayen Aug 13 '13 at 07:25

See this answer on downloading a file in Java 11 using HttpClient.

The only metadata that answer preserves is the filename, which is taken from the Content-Disposition response header.
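As a rough sketch of how the timestamp could be added on top of that approach: `BodyHandlers.ofFileDownload` picks the filename from Content-Disposition, and the `Last-Modified` response header can then be applied to the saved file by hand. The class `TimestampedDownload` and its method names are hypothetical, and the sketch assumes the server sends both headers.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpResponse.BodyHandlers;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.nio.file.attribute.FileTime;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Optional;

// Hypothetical helper class; the names are illustrative and not part of the JDK.
final class TimestampedDownload {

    // Downloads uri into an existing directory, naming the file from Content-Disposition,
    // then copies Last-Modified (if the server sent one) onto the saved file.
    static Path download(HttpClient client, URI uri, Path directory)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        // ofFileDownload fails if the response carries no usable Content-Disposition header.
        HttpResponse<Path> response = client.send(request,
                BodyHandlers.ofFileDownload(directory,
                        StandardOpenOption.CREATE, StandardOpenOption.WRITE));
        Path saved = response.body();
        applyLastModified(saved, response.headers().firstValue("Last-Modified"));
        return saved;
    }

    // Returns true if a timestamp was applied, false if the header was absent.
    // Throws DateTimeParseException if the header is not an RFC 1123 style date.
    static boolean applyLastModified(Path file, Optional<String> lastModified)
            throws IOException {
        if (lastModified.isEmpty()) {
            return false;
        }
        FileTime time = FileTime.from(
                ZonedDateTime.parse(lastModified.get(), DateTimeFormatter.RFC_1123_DATE_TIME)
                        .toInstant());
        Files.setLastModifiedTime(file, time);
        return true;
    }
}
```

A call would look something like `TimestampedDownload.download(HttpClient.newHttpClient(), URI.create(sourceUrl), folder)`, where `sourceUrl` and `folder` stand in for your own values. The sketch does not check the status code, so error responses still need handling.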

Sameer Jadhav
  • how about timestamps, to prevent unnecessary re-downloading? https://docs.oracle.com/en/java/javase/11/docs/api/java.net.http/java/net/http/HttpResponse.BodyHandlers.html#ofFileDownload(java.nio.file.Path,java.nio.file.OpenOption...) doesn't mention timestamps, only filename – Jayen May 12 '21 at 00:55