java.net.URLDecoder.decode: decoding does not work

Question

I am trying to grab all the links in a web page with the script below, and then I want to use these links again. However, decoding does not always work, which results in an improper link, and I receive a 404 error.

Document doc = Jsoup.connect(doi_con).ignoreContentType(true).get();

Elements links = doc.select("a[href]");

for (Element link : links) {
    String url = link.absUrl("href");

    //byte[] decodeds1= DatatypeConverter.parseBase64Binary(url);
    //dec_url = DatatypeConverter.printBase64Binary(decodeds1);

    // Decode percent-escapes once; dec_url was previously undeclared
    String dec_url = java.net.URLDecoder.decode(url, "UTF-8");
}

In this code, the decoding part seems to work for only some URLs. Sample results are below:

http://link.springer.com/signup-login?previousUrl=/article/10.1007%2Fs10899-005-5558-2
http://link.springer.com/article/10.1007/s10899-005-5558-2#kb-nav--main

As you can see, decoding did not work for the first link, while it worked for the latter.

What am I missing? I also tried parseBase64Binary and printBase64Binary, as seen in the commented-out lines above, but that did not work either.

Thanks in advance!

@boraldomaster String url = link.absUrl("href"); The source strings are the URLs retrieved from the related web page... — mlee_jordan, Oct 14 '14 at 12:43
I ask you to provide those urls. If I had those urls I could run this code. — Boris, Oct 14 '14 at 12:54
You can check this URL: http://link.springer.com/article/10.1007%2Fs10899-005-5558-2 — mlee_jordan, Oct 14 '14 at 14:10
%25 is decoded to %. Everything is correct. What do you expect to receive? — Boris, Oct 14 '14 at 14:59
For the first one I get article/10.1007%2Fs10899-005-5558-2, and for the second article/10.1007/s10899-005-5558-2. The slash was not converted for the first one. It stays as %2F. — mlee_jordan, Oct 14 '14 at 15:14
I suppose we don't understand each other. Let's forget about the second link and talk just about the first one. It is **...link.springer.com/signup-login?previousUrl=%2Farticle%2F10.1007%252Fs10899-005-5558-2**, right? It is converted to **...link.springer.com/signup-login?previousUrl=/article/10.1007%2Fs10899-005-5558-2**, right? What do you want to receive as the conversion result for **...link.springer.com/signup-login?previousUrl=%2Farticle%2F10.1007%252Fs10899-005-5558-2**? — Boris, Oct 15 '14 at 06:12
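
The behaviour discussed in the comments can be reproduced with a small sketch. It assumes the raw href is the double-encoded string quoted in the last comment, where %252F is an encoded "%2F". A single call to URLDecoder.decode turns %25 into %, leaving %2F behind, and only a second pass yields the slash. The class and helper names (DoubleDecodeSketch, decodeOnce, decodeFully) are hypothetical and used only for illustration; this is not presented as the fix for the 404.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical illustration class; not part of the original question.
class DoubleDecodeSketch {

    // One decoding pass; wraps the checked exception since UTF-8 is always supported.
    static String decodeOnce(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Repeats decoding until the string stops changing or maxPasses is reached.
    static String decodeFully(String s, int maxPasses) {
        String current = s;
        for (int i = 0; i < maxPasses; i++) {
            String next = decodeOnce(current);
            if (next.equals(current)) {
                break; // nothing left to decode
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        // Double-encoded href as quoted in the comment thread.
        String raw = "http://link.springer.com/signup-login"
                + "?previousUrl=%2Farticle%2F10.1007%252Fs10899-005-5558-2";

        // First pass: %2F -> "/" and %252F -> "%2F".
        System.out.println(decodeOnce(raw));
        // Repeated passes: the remaining %2F -> "/".
        System.out.println(decodeFully(raw, 5));
    }
}
```

Decoding repeatedly is only safe when the input is known to be encoded more than once. Applied blindly, it can change the meaning of a URL that legitimately contains an escaped % or /, and URLDecoder also turns "+" into a space.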

