Golang http get request breaks on some but not all urls

Question

Right now I'm fetching URLs from Indiegogo as part of a side project, using the basic GET request template found [here][1]. I then convert the byte data into a string using

responseText, err := ioutil.ReadAll(response.Body)
trueText := string(responseText)

with appropriate error handling where needed

It works fine for repeated attempts at getting and some other URLs of varying length (at least as large as the previous URL, and some longer than the next).

Strangely, when I attempt to get it breaks and throws a runtime error of

panic: runtime error: index out of range

and exits with a status of 2. I'm curious as to what the issue could be.

I know it isn't Indiegogo getting angry about my once-a-minute requests and cutting my connection, because I can request continuously for 20 minutes at with no issue. Give it a bit of downtime and it still completely breaks on

Thanks for the assistance

EDIT: It appears that the cause was a malformed bit of HTML in some of the pages. It broke a loop I was running over the page content, which is why the runtime panic showed up on only some URLs. Thanks for the help.
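
The original loop is not shown, so the following is only a sketch of the general failure mode under stated assumptions: code that slices a page by marker positions panics with "index out of range" (or "slice bounds out of range") when a marker is missing. The helper name `extractBetween` and the `<title>` markers are hypothetical and chosen purely for illustration. Checking the result of `strings.Index` before slicing turns the panic into an ordinary "not found" case.

```go
package main

import (
	"fmt"
	"strings"
)

// extractBetween is a hypothetical helper (not from the original post).
// It returns the text between the first openTag and the next closeTag.
// The boolean is false when either marker is missing, which is what
// malformed HTML tends to produce.
func extractBetween(s, openTag, closeTag string) (string, bool) {
	start := strings.Index(s, openTag)
	if start < 0 {
		return "", false // opening marker absent: do not slice
	}
	start += len(openTag)

	end := strings.Index(s[start:], closeTag)
	if end < 0 {
		return "", false // closing marker absent: do not slice
	}
	return s[start : start+end], true
}

func main() {
	good := "<title>My project</title>"
	bad := "<title>My project" // closing tag missing, as on a malformed page

	for _, page := range []string{good, bad} {
		if title, ok := extractBetween(page, "<title>", "</title>"); ok {
			fmt.Println("title:", title)
		} else {
			fmt.Println("title not found; skipping page")
		}
	}
}
```

For real pages, an HTML parser is likely a better fit than string searching; the point here is only that every index derived from page content needs a check before it is used.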

[1]:

Can you make sure "responseText" isn't a nil slice? Try printing its contents. — matthewbauer, Jul 23 '13 at 22:48
OK, on printing responseText I get a byte array, as expected, and it appears as though the conversion to string is what is breaking it. — Everlag, Jul 23 '13 at 22:53
There was a similar question a while back. http://stackoverflow.com/questions/14230145/what-is-the-best-way-to-convert-byte-array-to-string — matthewbauer, Jul 23 '13 at 22:57
I suspect that kickstarter page has a null character that is throwing the conversion off. — matthewbauer, Jul 23 '13 at 22:59
What version of go are you using? I am on go1.1.1 and it works just fine: http://play.golang.org/p/YNZtuWMEI- — creack, Jul 24 '13 at 01:02
It appears as though the specific page is the issue, because of the characters it contains. I'm running go 1.1.1. Thanks for the help guys, I'll preparse the byte output of the request for null bytes. — Everlag, Jul 24 '13 at 01:07
I'm certain you don't want `ioutil.ReadAll` and if you do, you very likely do not want to convert that large contiguous byte slice to a string -- you're making two complete copies of whatever content happens to be returned from the upstream server and hoping it's a valid string. In almost every case I've ever seen, you're going to be doing some kind of incremental parsing with that result, so you're best off skipping all that middle part that leads to massive memory bloat/exploitability. — Dustin, Jul 24 '13 at 04:26

score 0 · Accepted Answer · answered Jul 24 '13 at 01:29

There is no error when getting from the url and converting the body to the Go string type. For example,

package main

import (
    "fmt"
    "io/ioutil"
    "log"
    "net/http"
)

func main() {
    url := "http://www.indiegogo.com/projects/culcharge-smallest-usb-charge-and-data-cable-for-iphone-and-android"
    res, err := http.Get(url)
    if err != nil {
        log.Fatal(err)
    }
    body, err := ioutil.ReadAll(res.Body)
    res.Body.Close()
    if err != nil {
        log.Fatal(err)
    }
    text := string(body)
    fmt.Println(len(body), len(text))
}

Output:

66363 66363

You didn't provide us with a small fragment of code which compiles, runs, and fails in the manner you describe. That leaves us all guessing.
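
As a supporting sketch for the claim that the conversion itself cannot be the culprit: in Go, converting a byte slice to a string copies the bytes as they are, so NUL bytes and invalid UTF-8 do not cause a panic. The helper name `toText` and the sample bytes are made up for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// toText is a hypothetical helper. It converts raw bytes to a string and
// reports whether the result is valid UTF-8. The conversion never panics,
// whatever the bytes are.
func toText(raw []byte) (string, bool) {
	text := string(raw)
	return text, utf8.ValidString(text)
}

func main() {
	// Sample input with a NUL byte and two bytes that are not valid UTF-8.
	raw := []byte{'o', 'k', 0x00, 0xff, 0xfe, '!'}

	text, valid := toText(raw)
	fmt.Println(len(raw), len(text)) // lengths match: the bytes are copied as-is
	fmt.Println("valid UTF-8:", valid)
}
```

If a page contains odd bytes, the string simply carries them along. A panic with "index out of range" has to come from later code that indexes into the data.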
