I am using Webharvest to download a file from a website and save it under its original file name.
The Java code that I am working with is:
import org.apache.commons.httpclient.Header;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.HttpStatus;
import org.apache.commons.httpclient.methods.GetMethod;
HttpClient client = new HttpClient();
BufferedReader br = null;
StringBuffer result = new StringBuffer();
String attachName;
GetMethod method = new GetMethod(attachmentLink.toString());
int returnCode;
returnCode = client.executeMethod(method);
Header[] headers = method.getResponseHeaders("Content-Disposition");
attachName = headers[0].getValue();
attachName = new String(attachName.getBytes());
The result in webharvest is:
attachment; filename="Resoluci�n sobre Mesas de Contrataci�n.pdf"
I can't get it to keep the letter ó.
After reading the value of the Content-Disposition header into the variable attachName, I also tried to re-decode it, but with no luck:
String attachNamef = URLEncoder.encode(attachName, "ISO-8859-1");
attachNamef = URLEncoder.decode(attachNamef, "UTF-8");
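For reference, here is a minimal sketch of why a round trip like the one above may not be able to help. It assumes (this is not confirmed) that the header bytes were already decoded with a charset that cannot represent ó, such as US-ASCII, before the value reached attachName. The class and method names are hypothetical and only for illustration; the sketch uses plain JDK calls, not HttpClient.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Webharvest or HttpClient.
class HeaderCharsetDemo {

    // Decode raw header bytes with the given charset.
    static String decode(byte[] wire, Charset charset) {
        return new String(wire, charset);
    }

    // Try to "repair" an already-decoded string by re-encoding it as
    // ISO-8859-1 and decoding those bytes as ISO-8859-1 again.
    static String roundTrip(String decoded) {
        byte[] bytes = decoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Assumption: the server sends the file name as ISO-8859-1 bytes,
        // so the letter o-acute (U+00F3) is the single byte 0xF3.
        byte[] wire = "Resoluci\u00F3n".getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with US-ASCII cannot map 0xF3, so it is replaced by U+FFFD.
        String lossy = decode(wire, StandardCharsets.US_ASCII);

        // Decoding with ISO-8859-1 maps every byte to a character, so nothing is lost.
        String intact = decode(wire, StandardCharsets.ISO_8859_1);

        System.out.println(lossy);            // contains the replacement character
        System.out.println(intact);           // contains the original letter
        System.out.println(roundTrip(lossy)); // U+FFFD is unmappable and becomes '?'
    }
}
```

If that assumption holds, the original byte is gone once U+FFFD appears in the string, so no later encode/decode step on attachName can bring the ó back; the charset would have to be set before the header is decoded.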
I was able to determine that the response charset is ISO-8859-1 by calling:
method.getResponseCharSet()
P.S. When I inspect the headers in Firefox with Firebug, the Content-Disposition value is correct:
attachment; filename="Resolución sobre Mesas de Contratación.pdf"