
I have a curious problem, or rather a weird effect, when using my self-written Android app.

My app reads the HTML source of a website and parses it for the information I want. And it works... well, not really consistently.

Scenario 1: I use my WLAN at home and run my app -> Everything works fine. All the desired items show up in my ListView.

Scenario 2: I use mobile Internet, like EDGE or HSDPA -> My ListView shows only 1 item. All the others have vanished...

I don't know why. Could there be a timeout that prevents the app from reading the whole HTML page? But all the other items directly follow on the next lines of the HTML source...

I have no idea how to fix it. On Google I didn't find anyone else with the same problem.

Regards, Julian

Here is some code

    // With this I get the HTML source code
    URL url = new URL("http://www.area4.de");
    URLConnection conn = url.openConnection();
    DataInputStream dataIn = new DataInputStream(conn.getInputStream());
    BufferedReader reader = new BufferedReader(new InputStreamReader(dataIn, "UTF-8"));
    String line;

    // Then I parse the code with
    while ((line = reader.readLine()) != null)
    {
        if (line.contains(searchPattern))
            al.add(line); // al is an ArrayList
    }

That is all my app does so far (besides showing the ArrayList in a ListView). You can see the site's source code in your browser (Ctrl + U). I search for lines like these:

<a href="/de/bands/thirty-seconds-to-mars/" class="Schrift_22">THIRTY SECONDS TO MARS //</a>
<a href="/de/bands/dropkick-murphys/" class="Schrift_20_dunkel">DROPKICK MURPHYS //</a>

With 3G I only get thirty-seconds-to-mars...

Julian
  • At which point is the HTML source code breaking off exactly? – Pekka Apr 04 '11 at 08:55
  • Impossible to say without the HTML and your parsing code. And even then it's probably difficult. Are you sure the site returns the same HTML in both cases? Your best option is to step through the code in debug mode and find the place where it fails. BTW, do you have permission from the site owner to display his data in your app? – RoToRa Apr 04 '11 at 09:07
  • Yeah, I have the permissions. In Germany it's not explicitly forbidden to use the contents of public domain websites. And by the way, it is only for my private use, to learn developing Android apps... Hm, my emulator, which emulates 3G, shows the right result. I guess it is the same HTML in both cases. – Julian Apr 04 '11 at 09:12
  • I don't think there is such a thing as a "public domain website" in Germany (legally speaking) unless the owner specifically puts it under a licence such as Creative Commons. And in software development, be sure or don't be sure; there is no guessing. – RoToRa Apr 05 '11 at 13:18

2 Answers


Ah, I solved it. As shown above, I searched with this code snippet:

while ((line=reader.readLine()) != null)
{
   if (line.contains(searchPattern))
       al.add(line); //al is an ArrayList
}

With WLAN (and my emulator) I really get a new line for each band, e.g.:

line1
line2
line3
....

But with EDGE or HSDPA, all the lines I get over WLAN arrive as one single line:

line1line2line3....

And with my regex I delete everything before and after the match when I find a desired target, so only the first band was left. I hope you understand; it's difficult to explain in a foreign language.

Replacing the if with a simple

while (line.contains(searchPattern))

fixed it.
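A minimal sketch of the same idea, which does not depend on where the line breaks fall: scan the text with a regular expression and collect every match, so several bands on one line are all found. The class name BandLinkExtractor, the method extractBands and the pattern itself are assumptions for illustration; the pattern is only guessed from the two sample anchors in the question and may need adjusting for the real page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not the code from the app above.
final class BandLinkExtractor {

    // Assumed pattern, derived from the two sample anchors in the question:
    // the href starts with /de/bands/ and the class starts with Schrift_.
    private static final Pattern BAND_LINK = Pattern.compile(
            "<a href=\"/de/bands/[^\"]*\" class=\"Schrift_[^\"]*\">([^<]*)</a>");

    // Returns the link text of every band anchor found in the given text,
    // no matter whether the anchors are on separate lines or on one line.
    static List<String> extractBands(String html) {
        List<String> bands = new ArrayList<>();
        Matcher matcher = BAND_LINK.matcher(html);
        // find() moves on to the next match each time, so the loop ends
        // once the text contains no further band anchors.
        while (matcher.find()) {
            // Drop the trailing "//" separator and surrounding whitespace.
            bands.add(matcher.group(1).replace("//", "").trim());
        }
        return bands;
    }
}
```

Calling extractBands(line) inside the readLine() loop and adding the result with al.addAll(...) should then behave the same over WLAN and over a mobile connection. Note that a plain while (line.contains(searchPattern)) loop only terminates if its body removes the match it just handled from line.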

Julian

You can always try reading the whole HTTP response before passing it on for parsing. That way you can check that the whole document was loaded properly.
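A rough sketch of what that could look like, using only java.io. The class name ResponseReader and the method readAll are made up for illustration, and the 4096-character buffer size is an arbitrary choice.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

// Hypothetical helper: reads a stream completely into one String.
final class ResponseReader {

    // Reads everything from the stream, decoding it with the given charset.
    // Line breaks are kept exactly as the server (or a proxy) sent them.
    static String readAll(InputStream in, String charsetName) throws IOException {
        Reader reader = new BufferedReader(new InputStreamReader(in, charsetName));
        try {
            StringBuilder html = new StringBuilder();
            char[] buffer = new char[4096];
            int count;
            // read() returns -1 once the end of the stream is reached.
            while ((count = reader.read(buffer)) != -1) {
                html.append(buffer, 0, count);
            }
            return html.toString();
        } finally {
            reader.close();
        }
    }
}
```

With the code from the question this would be called as readAll(conn.getInputStream(), "UTF-8"). Logging the length of the returned string, or its last few characters, on both WLAN and the mobile connection shows whether the document really arrives completely and whether the line breaks differ.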

harism