
I have a problem parsing this feed with the Java SAX parser:

http://feeds.escapeartists.net/PodCastle_Main

Most of the time I get the exception "Content is not allowed in prolog".

I viewed the file in Notepad++ and the prolog looks fine, at least as far as I can tell.

Many other podcast feeds work, e.g. http://feeds.feedburner.com/newz-of-the-world

Interestingly, the PodCastle feed does parse successfully about 10% of the time.

Any suggestions?

Best regards, Jürgen

EDIT: Interesting: I downloaded the file manually and uploaded it to my own webspace. From there, everything parses fine. Strange.

EDIT 2: code

        URL url = new URL(this.urlString);
        _setProxy(); // Set the proxy if needed

        spf = SAXParserFactory.newInstance();
        sp = spf.newSAXParser();

        // Debug: print the first two lines of the raw response.
        // This uses its own connection, separate from the one the parser opens below.
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));
        System.out.println("<<<<" + this.urlString + ">>>> :" + in.readLine());
        System.out.println("<<<<" + this.urlString + ">>>> :" + in.readLine());
        in.close();

        try {
            // The parser opens the URL itself from the system ID.
            sp.parse(url.toURI().toString(), this);
        } catch (SAXParseException e) {
            System.err.println(e.getMessage());
        }

Output:

<<<< ttp://feeds.escapeartists.net/PodCastle_Main>>>> : ( not printable chars )
<<<< ttp://feeds.escapeartists.net/PodCastle_Main>>>> : ( not printable chars )

com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid Byte 1 of 1-Byte-UTF-8-sequence. ...

The SAX parser throws a similar exception.

So the problem is not SAX but the data transmission. It works about half the time. All other .xml files I tested work.

Has anyone seen this behavior before?

2 Answers

0

I have now implemented the following workaround. I suspect something is wrong with the web server that hosts the feed.

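        final int MAX_TRIES = 10;
        for (int tries = 1; tries <= MAX_TRIES; tries++) {
            // Open a fresh connection for every attempt; try-with-resources closes it again.
            try (InputStream is = url.openStream()) {
                sp.parse(is, this);
                break; // parsed successfully, stop retrying
            } catch (SAXParseException e) {
                System.err.println(e.getMessage());
            } catch (MalformedByteSequenceException ex) {
                // Internal Xerces class: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException
                System.out.println("Connection to " + url + " failed " + tries + " times, trying again... (maximum tries = " + MAX_TRIES + ")");
                Thread.sleep(250); // short pause before the next attempt
            }
        }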

After 2 or 3 attempts, the stream works.
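To see what the failing attempts actually receive, it may help to dump the first few bytes of the response in hex instead of printing them as text. The following is only a sketch: the class and method names (`ResponseSniffer`, `readHead`, `hexPrefix`, `classify`) are made up for illustration, and the categories it checks for (gzip magic bytes, a UTF-8 byte order mark, a plain `<`) are guesses at common causes, not something confirmed for this feed. In real use the bytes would come from `url.openStream()` rather than the in-memory sample.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ResponseSniffer {

    /** Reads up to max bytes from the start of the stream. */
    static byte[] readHead(InputStream in, int max) throws IOException {
        byte[] buf = new byte[max];
        int filled = 0;
        while (filled < max) {
            int n = in.read(buf, filled, max - filled);
            if (n < 0) break; // end of stream
            filled += n;
        }
        return Arrays.copyOf(buf, filled);
    }

    /** Formats up to max leading bytes as space-separated hex pairs. */
    static String hexPrefix(byte[] data, int max) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(max, data.length); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02x", data[i] & 0xff));
        }
        return sb.toString();
    }

    /** Rough guess at what the response starts with (hypothetical categories). */
    static String classify(byte[] head) {
        if (startsWith(head, 0x1f, 0x8b)) return "gzip";           // gzip magic number
        if (startsWith(head, 0xef, 0xbb, 0xbf)) return "utf8-bom"; // UTF-8 byte order mark
        if (startsWith(head, '<')) return "xml";                   // looks like markup
        return "unknown";
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xff) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample; replace with url.openStream() to inspect a real response.
        InputStream sample = new ByteArrayInputStream(
                "<?xml version=\"1.0\"?><rss/>".getBytes(StandardCharsets.UTF_8));
        byte[] head = readHead(sample, 8);
        System.out.println(hexPrefix(head, 8) + " -> " + classify(head));
    }
}
```

Logging this on every failed attempt in the retry loop would show whether the bad responses always start with the same bytes, which narrows down whether the server, a proxy, or the client side is at fault.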

0

Probably the start of your document is not clean - perhaps an extra character or two accidentally got in. You should be able to extract the line and column it is objecting to from the SAXParseException (via getLineNumber() and getColumnNumber()).
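A minimal sketch of reading that position, assuming a plain `DefaultHandler` and an in-memory sample document with junk before the XML declaration. The class name `ParseErrorLocation` and the method `describe` are made up for this example; `getLineNumber()` and `getColumnNumber()` are the standard `SAXParseException` accessors.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ParseErrorLocation {

    /** Parses the stream and returns "ok" or a description of where parsing failed. */
    static String describe(InputStream in)
            throws IOException, ParserConfigurationException, SAXException {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, new DefaultHandler());
            return "ok";
        } catch (SAXParseException e) {
            // Line and column of the position the parser objected to.
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Sample input with stray characters in front of the XML declaration.
        byte[] dirty = "junk<?xml version=\"1.0\"?><rss/>".getBytes(StandardCharsets.UTF_8);
        System.out.println(describe(new ByteArrayInputStream(dirty)));
    }
}
```

If the reported position is the very start of the document, the stray bytes come before the XML declaration, which fits the "Content is not allowed in prolog" message.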

codemaniac143