I'm trying to build a crawler in Go. I'm using the net/http library to download the HTML from a URL, and I want to save the http.Response and its http.Header to a file.

How can I convert these two values into strings so that they can be written to a text file?

I also saw an earlier question about parsing a stored HTTP response file: "Parse HTTP requests and responses from text file in Go". Is there any way to save the URL response in that format?

L.fole
  • see also [`http.Response.Write`](https://golang.org/pkg/net/http/#Response.Write) method, which is used by [`httputil.DumpResponse`](https://golang.org/pkg/net/http/httputil/#DumpResponse) – JimB Jan 25 '16 at 15:22

3 Answers

5

Go's net/http/httputil package provides DumpResponse: https://golang.org/pkg/net/http/httputil/#DumpResponse. Its second argument is a bool that controls whether the body is included, so if you want to save just the status line and headers to a file, set it to false.

An example function that would dump the response to a file could be:

import (
    "io/ioutil"
    "net/http"
    "net/http/httputil"
)

func dumpResponse(resp *http.Response, filename string) error {
    dump, err := httputil.DumpResponse(resp, true)
    if err != nil {
        return err
    }

    return ioutil.WriteFile(filename, dump, 0644)
}
slcjordan
4

Edit: Thanks to @JimB for pointing out the http.Response.Write method, which makes this a lot easier than what I originally proposed:

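resp, err := http.Get("http://google.com/")
if err != nil {
    log.Panic(err)
}
defer resp.Body.Close()

f, err := os.Create("output.txt")
if err != nil {
    log.Panic(err)
}
defer f.Close()

// Write serializes the status line, headers, and body in HTTP/1.x wire format.
if err := resp.Write(f); err != nil {
    log.Panic(err)
}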
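
Since the question also asks about parsing a stored response, here is a minimal sketch of reading such data back with `http.ReadResponse`, which parses the same wire format that `resp.Write` produces. The helper name `parseSaved` and the hard-coded sample bytes are assumptions for illustration; with a real file you would wrap the opened file in a `bufio.Reader` instead.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// parseSaved (hypothetical helper) parses bytes in HTTP/1.x wire format
// and returns the parsed response together with its fully read body.
func parseSaved(raw []byte) (*http.Response, []byte, error) {
	// A nil request means the response is treated as the reply to a GET.
	resp, err := http.ReadResponse(bufio.NewReader(bytes.NewReader(raw)), nil)
	if err != nil {
		return nil, nil, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, nil, err
	}
	return resp, body, nil
}

func main() {
	// Sample data standing in for the contents of output.txt (assumption).
	raw := []byte("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nContent-Length: 5\r\n\r\nhello")

	resp, body, err := parseSaved(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.StatusCode, resp.Header.Get("Content-Type"), string(body))
}
```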

This was my first answer:

You could do something like this:

resp, err := http.Get("http://google.com/")
if err != nil {
    panic(err)
}
defer resp.Body.Close()

body, err := ioutil.ReadAll(resp.Body)
if err != nil {
    panic(err)
}

// write the whole body
err = ioutil.WriteFile("body.txt", body, 0644)
if err != nil {
    panic(err)
}

This was the edit to my first answer:

Thanks to @Hector Correa, who added the header part. Here is a more comprehensive snippet targeting your whole question. It writes the headers followed by the body of the response to output.txt:

// get the response
resp, err := http.Get("http://google.com/")
if err != nil {
    panic(err)
}
defer resp.Body.Close()

// body
body, err := ioutil.ReadAll(resp.Body)
if err != nil {
    panic(err)
}

// header: one line per header value
var header string
for name, values := range resp.Header {
    for _, value := range values {
        header += fmt.Sprintf("%s %s \n", name, value)
    }
}

// append everything to one slice
var write []byte
write = append(write, []byte(header)...)
write = append(write, body...)

// write it to a file
err = ioutil.WriteFile("output.txt", write, 0644)
if err != nil {
    panic(err)
}
Riscie
    Or you could get all the correct information using the [`http.Response.Write`](https://golang.org/pkg/net/http/#Response.Write) method, which is used by [`httputil.DumpResponse`](https://golang.org/pkg/net/http/httputil/#DumpResponse) – JimB Jan 25 '16 at 15:21
  • thanks @JimB, didn't know about this! Edited the snippet. – Riscie Jan 25 '16 at 15:40
2

Following up on the answer by @Riscie, you could also read the headers from the response with something like this:

for header, values := range resp.Header {
    for _, value := range values {
        log.Printf("\t\t %s %s", header, value)
    }
}
Hector Correa