I have a cron job on Ubuntu that runs a runnable JAR on a schedule.

On Ubuntu I get an exception for one of my URL strings, although the same code works fine on my Windows development machine. The URL string does not seem to be resolved correctly; see below:

java.io.FileNotFoundException: http://www.carlolotti.com/enoteca-vini-valdostani/chardonnay-elevé-en-fut-de-chene-anselmet-2007-6s0jwecq.asp

Here is the source code:

        HttpURLConnection conn;
        URL obj = new URL(url);
        conn = (HttpURLConnection) obj.openConnection();

        // default is GET
        conn.setRequestMethod("GET");
        conn.setUseCaches(true);

        // act like a browser
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setRequestProperty("Accept",
            "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");

        int responseCode = conn.getResponseCode();

        if (debug) {
            System.out.println("\nSending 'GET' request to URL : " + url);
            System.out.println("Response Code : " + responseCode);
        }
        try{
            BufferedReader in = 
                    new BufferedReader(new InputStreamReader(conn.getInputStream() , "UTF-8"));
            String inputLine;
            StringBuffer response = new StringBuffer();
            while ((inputLine = in.readLine()) != null) {
                //inputLine=StringEscapeUtils.escapeHtml3(inputLine);
                //if(inputLine.contains("Albari"))
                //  t=1;
                response.append(inputLine);
                if(csv)
                    response.append(lineRet);
            }
            in.close();

            return response;
        }catch(Exception e){
            e.printStackTrace();
        }

        return null;

I suspect the locale on my Ubuntu machine, which is set to LANG=en_GB.UTF-8. Do I need to change it to en_US.UTF-8? I'm not really sure, but this is my first line of investigation.
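
Before changing the locale, it may help to check which charset the JVM actually uses when it is started from cron, since cron typically runs jobs with a reduced environment in which LANG may not be set the way it is in an interactive shell. The following is only a diagnostic sketch; the class name `CharsetCheck` and the method `describe` are hypothetical and not part of the original program:

```java
import java.nio.charset.Charset;

public class CharsetCheck {
    /** Returns a one-line summary of the encoding-related settings this JVM sees. */
    static String describe() {
        return "LANG=" + System.getenv("LANG")                       // null if unset
                + ", file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running this once from a terminal and once from the crontab entry shows whether the two environments differ. If they do, starting the JAR with `-Dfile.encoding=UTF-8` is a commonly used option worth trying, although whether it helps here depends on where the string is actually being decoded.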

Tiborsio_
  • It looks like something is interpreted as ISO-8859-1 when it should have been interpreted as UTF-8. But without the source code and an explanation about where the string is coming from, it will be hard to help you. – RealSkeptic Nov 27 '15 at 10:02
  • You are correct. I added my source code. – Tiborsio_ Nov 27 '15 at 10:08
  • And where in this source does the error occur? And what is the source of the URL string? – RealSkeptic Nov 27 '15 at 10:11
  • Where is `url` coming from? As @RealSkeptic mentioned, it is a conversion problem. In the URL you posted, the part `-elevÃ©-` should actually be `-elevé-`. Decoding the UTF-8 encoded `é` as ISO-8859-1 yields `Ã©`. – SubOptimal Nov 27 '15 at 10:12
  • It comes from parsing websites with jsoup. Strangely, it works fine on my Windows machine. It's a String. – Tiborsio_ Nov 27 '15 at 10:22
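
The conversion problem described in the comments can be reproduced in a few lines. This is a minimal sketch of the mechanism only, not the asker's code; the class name `MojibakeDemo` and the method `garble` are hypothetical:

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    /** Encodes s as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1. */
    static String garble(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);    // 'é' becomes the two bytes C3 A9
        return new String(utf8, StandardCharsets.ISO_8859_1); // each byte becomes one char
    }

    public static void main(String[] args) {
        // \u00e9 is 'é'; the escape keeps this file independent of its own encoding
        String garbled = garble("elev\u00e9");
        // One 'é' has turned into two characters: U+00C3 ('Ã') and U+00A9 ('©')
        System.out.println(garbled.length());
    }
}
```

The length grows from 5 to 6 because the single `é` is replaced by the two-character sequence `Ã©`, which is the pattern a server would then fail to match against the real page name.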

1 Answer

You have to replace the `Ã©` in the URL with `é`.
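
For a URL that is built dynamically, the replacement would have to happen in code rather than by hand. Below is a minimal sketch, assuming the string is UTF-8 text that was decoded as ISO-8859-1 somewhere upstream (for text decoded as Windows-1252 the mapping differs slightly). The class name `MojibakeRepair` and the method `repair` are hypothetical:

```java
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    /**
     * Reverses a "UTF-8 read as ISO-8859-1" mix-up, e.g. turns "Ã©" back into "é".
     * Returns the input unchanged when it does not look like that kind of garbling.
     */
    static String repair(String s) {
        // Recover the original bytes; chars outside ISO-8859-1 would become '?'
        byte[] raw = s.getBytes(StandardCharsets.ISO_8859_1);
        boolean lossless = new String(raw, StandardCharsets.ISO_8859_1).equals(s);

        // Decode those bytes as UTF-8; malformed sequences decode to U+FFFD
        String fixed = new String(raw, StandardCharsets.UTF_8);
        boolean validUtf8 = fixed.indexOf('\uFFFD') < 0;

        return (lossless && validUtf8) ? fixed : s;
    }

    public static void main(String[] args) {
        String garbled = "chardonnay-elev\u00c3\u00a9-en-fut";
        System.out.println(repair(garbled).equals("chardonnay-elev\u00e9-en-fut"));
    }
}
```

In the question's code this could be applied as `new URL(MojibakeRepair.repair(url))`. It only treats the symptom, though; the more durable fix is probably to make sure the page is decoded with the right charset at the point where its bytes first become a String.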

Christine
  • I can't edit it manually because the string is generated dynamically. Maybe I need to convert the string somehow. I tried a conversion, but it's not working. – Tiborsio_ Dec 01 '15 at 03:05