I am doing some web crawling and ran into the question of when to add a trailing slash to a URL. I know that some sites have it at the end and some don't, and that entering the wrong form in the browser will just redirect you to the right one. Normalization would add the slash at the end, but it's going to cause a problem when converting relative URLs to absolute ones.
For example if a user selects an absolute URL http://stack.com/more
but the actual (redirect) URL is http://stack.com/more/
and a relative URL is index.html
Then doing URL newurl = new URL(url, relativeURL);
yields http://stack.com/index.html
(nonexistent page)
when it should actually be http://stack.com/more/index.html
(real page)
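Here is a minimal sketch that reproduces the behaviour. It uses java.net.URI.resolve instead of the URL constructor above, on the assumption that both apply the same relative-reference resolution for this simple case. The class name ResolveDemo and the helper resolve are just illustrative names.

```java
import java.net.URI;

class ResolveDemo {
    // Resolve a relative reference against a base URL and return the result as a string.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        // Without the trailing slash, "more" is treated as a file name and is replaced.
        System.out.println(resolve("http://stack.com/more", "index.html"));
        // prints http://stack.com/index.html

        // With the trailing slash, "more/" is treated as a directory and is kept.
        System.out.println(resolve("http://stack.com/more/", "index.html"));
        // prints http://stack.com/more/index.html
    }
}
```

So the only difference between the wrong and the right result is whether the base URL ends with a slash.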
Does anyone know a good way to correctly append the slash at the end?