I'm currently parsing the HTML content of http://www.reddit.com/r/funny/new/ in Java.
To fetch the page I tried several HTTP GET libraries:
- Java Standard Library (i.e. HttpURLConnection)
- Jsoup
- Apache HttpClient library
They all worked, but the output wasn't always up to date. For example, my Java application ran the HTTP GET request in a loop and printed the title of the first post. Meanwhile I opened the site in my browser and refreshed it from time to time (by hitting F5). The browser showed new entries that the Java program only picked up about 20-50 seconds later.
Where does that delay come from, and how can I prevent it?
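
For reference, here is a minimal sketch of the kind of polling loop described above, using HttpURLConnection. It is a simplified illustration, not the exact code: the class name PollSketch, the User-Agent value, the pause between requests, and the regex (which assumes post titles are links whose class attribute starts with "title") are all assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PollSketch {
    // Assumption: a post title is an <a> element whose class attribute starts with "title".
    private static final Pattern TITLE =
            Pattern.compile("<a[^>]*class=\"title[^\"]*\"[^>]*>([^<]*)</a>");

    // Returns the text of the first title link in the page, or null if none is found.
    static String firstTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    // Performs a plain HTTP GET and returns the response body as a string.
    static String fetch(String address) throws IOException {
        HttpURLConnection con =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        con.setRequestMethod("GET");
        con.setRequestProperty("User-Agent", "poll-sketch/0.1"); // hypothetical value
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            con.disconnect();
        }
        return body.toString();
    }

    // Fetches the page `rounds` times, printing the first title after each request.
    static void poll(String address, int rounds, long pauseMillis)
            throws IOException, InterruptedException {
        for (int i = 0; i < rounds; i++) {
            System.out.println(firstTitle(fetch(address)));
            Thread.sleep(pauseMillis);
        }
    }
}
```

Calling PollSketch.poll("http://www.reddit.com/r/funny/new/", 10, 5000) would print the first title ten times, five seconds apart. With a loop like this, the printed title lags behind what the browser shows after F5.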