I am trying to first grab all the links in a web page using the code below, and then request those links again. However, decoding does not always work, which leaves me with a malformed link, and I receive a 404 error.
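import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// doi_con is the URL of the page to fetch; dec_url is a String declared elsewhere.
Document doc = Jsoup.connect(doi_con).ignoreContentType(true).get();
Elements links = doc.select("a[href]");
for (Element link : links) {
    // Resolve the href against the page's base URI.
    String url = link.absUrl("href");
    //byte[] decodeds1 = DatatypeConverter.parseBase64Binary(url);
    //dec_url = DatatypeConverter.printBase64Binary(decodeds1);
    dec_url = java.net.URLDecoder.decode(url, "UTF-8");
}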
In this code, the decoding step seems to work for only some URLs. Here are two sample results:
http://link.springer.com/signup-login?previousUrl=/article/10.1007%2Fs10899-005-5558-2
http://link.springer.com/article/10.1007/s10899-005-5558-2#kb-nav--main
As you can see, decoding did not work for the first link (it still contains %2F), while it worked for the second.
What am I missing? I also tried parseBase64Binary and printBase64Binary, as shown in the commented-out lines above, but that did not work either.
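For anyone trying to reproduce this, here is a minimal standalone sketch of how URLDecoder behaves on strings like the ones above. It rests on an assumption that the post does not confirm: that the href in the page was percent-encoded twice (%252F), so a single decode pass leaves %2F behind. The class name DecodeCheck and the helper decodeFully are hypothetical names made up for this illustration. Base64 is a different encoding from percent-encoding, so the Base64 calls are not expected to help here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DecodeCheck {

    // Hypothetical helper: decodes repeatedly until the string stops changing.
    // maxPasses bounds the loop so the number of decode passes is always limited.
    static String decodeFully(String url, int maxPasses) throws UnsupportedEncodingException {
        String current = url;
        for (int i = 0; i < maxPasses; i++) {
            String next = URLDecoder.decode(current, "UTF-8");
            if (next.equals(current)) {
                break; // nothing left to decode
            }
            current = next;
        }
        return current;
    }

    public static void main(String[] args) throws Exception {
        // Assumed input: the same link as in the question, but encoded twice.
        String doubleEncoded =
            "http://link.springer.com/signup-login?previousUrl=/article/10.1007%252Fs10899-005-5558-2";

        // One pass turns %25 into '%', so %252F becomes %2F.
        System.out.println(URLDecoder.decode(doubleEncoded, "UTF-8"));

        // Repeated passes continue until no escape sequences remain.
        System.out.println(decodeFully(doubleEncoded, 5));
    }
}
```

Two caveats about this sketch: URLDecoder follows the HTML form encoding rules, so it also turns '+' into a space, and decoding a whole URL (rather than a single query value) can change its meaning if the encoded characters were meant to stay escaped.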
Thanks in advance!