39

I'm new to Go, and can't figure out how to use the compress/gzip package to my advantage. Basically, I just want to write something to a file, gzip it and read it directly from the zipped format through another script. I would really appreciate if someone could give me an example on how to do this.

Dave C
pymd

6 Answers

61

All the compress packages follow the same pattern: a writer that wraps an io.Writer to compress, and a reader that wraps an io.Reader to decompress. You would use something like this to compress:

var b bytes.Buffer
w := gzip.NewWriter(&b)
w.Write([]byte("hello, world\n"))
w.Close()

And this to unpack:

r, err := gzip.NewReader(&b)
if err != nil {
    log.Fatal(err) // not a valid gzip stream
}
io.Copy(os.Stdout, r)
r.Close()
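To illustrate the shared pattern, here is a minimal sketch that runs the same round trip through compress/gzip and compress/zlib. The `codec` type, its field names, and `roundTrip` are hypothetical helpers made up for this example, not part of the standard library; `io.ReadAll` assumes Go 1.16 or later.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"compress/zlib"
	"fmt"
	"io"
	"log"
)

// codec pairs a compressing writer with the matching decompressing reader.
// The type and its field names are hypothetical, for illustration only.
type codec struct {
	name      string
	newWriter func(io.Writer) io.WriteCloser
	newReader func(io.Reader) (io.ReadCloser, error)
}

var codecs = []codec{
	{
		name:      "gzip",
		newWriter: func(w io.Writer) io.WriteCloser { return gzip.NewWriter(w) },
		newReader: func(r io.Reader) (io.ReadCloser, error) { return gzip.NewReader(r) },
	},
	{
		name:      "zlib",
		newWriter: func(w io.Writer) io.WriteCloser { return zlib.NewWriter(w) },
		newReader: zlib.NewReader,
	},
}

// roundTrip compresses data with c and decompresses it again.
func roundTrip(c codec, data []byte) ([]byte, error) {
	var b bytes.Buffer
	w := c.newWriter(&b)
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	// Close flushes pending data; it must happen before b is read.
	if err := w.Close(); err != nil {
		return nil, err
	}
	r, err := c.newReader(&b)
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

func main() {
	for _, c := range codecs {
		out, err := roundTrip(c, []byte("hello, world\n"))
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("%s: %q\n", c.name, out)
	}
}
```

Only the constructor calls differ between the two packages; the Write, Close, and Read steps are identical.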
laurent
11

Pretty much the same answer as Laurent's, but with file I/O:

import (
  "bytes"
  "compress/gzip"
  "io/ioutil"
  "log"
)
// ...
var b bytes.Buffer
w := gzip.NewWriter(&b)
w.Write([]byte("hello, world\n"))
w.Close() // You must close this first to flush the bytes to the buffer.
// ioutil.WriteFile still works; since Go 1.16, os.WriteFile is the preferred equivalent.
if err := ioutil.WriteFile("hello_world.txt.gz", b.Bytes(), 0666); err != nil {
  log.Fatal(err)
}
Kevin Cantwell
9

For the read part, an equivalent of the handy ioutil.ReadFile for .gz files could look like this:

func ReadGzFile(filename string) ([]byte, error) {
    fi, err := os.Open(filename)
    if err != nil {
        return nil, err
    }
    defer fi.Close()

    fz, err := gzip.NewReader(fi)
    if err != nil {
        return nil, err
    }
    defer fz.Close()

    s, err := ioutil.ReadAll(fz)
    if err != nil {
        return nil, err
    }
    return s, nil   
}
sereizam
7

Here is a function that unpacks a gzip file to a destination file:

func UnpackGzipFile(gzFilePath, dstFilePath string) (int64, error) {
    gzFile, err := os.Open(gzFilePath)
    if err != nil {
        return 0, fmt.Errorf("open file %q to unpack: %w", gzFilePath, err)
    }
    dstFile, err := os.OpenFile(dstFilePath, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0660)
    if err != nil {
        gzFile.Close() // do not leak the source file handle on this early return
        return 0, fmt.Errorf("create destination file %q to unpack: %w", dstFilePath, err)
    }
    defer dstFile.Close()

    ioReader, ioWriter := io.Pipe()
    defer ioReader.Close()

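    go func() { // goroutine leak is possible here
        defer gzFile.Close()
        gzReader, err := gzip.NewReader(gzFile)
        if err != nil {
            // pass the error to the reading end instead of using a nil reader
            ioWriter.CloseWithError(err)
            return
        }
        defer gzReader.Close()
        // it is important to close the writer, or reading from the other end of
        // the pipe (the io.Copy below) will never finish
        _, err = io.Copy(ioWriter, gzReader)
        ioWriter.CloseWithError(err) // behaves like Close when err is nil
    }()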

    written, err := io.Copy(dstFile, ioReader)
    if err != nil {
        return 0, err // goroutine leak is possible here
    }

    return written, nil
}
Oleg Neumyvakin
  • I am trying to understand the solution, could you detail why we would need io.Pipe here? – Sairam Nov 21 '16 at 09:34
  • @Sairam io.Pipe lets the code read the source, run its bytes through gzip, and write to the destination on the fly, without allocating extra memory for the source and encoded bytes. – Oleg Neumyvakin Nov 21 '16 at 09:52
2

I decided to combine ideas from the other answers and provide a full example program. There are many different ways to do the same thing; this is just one of them:

package main

import (
    "compress/gzip"
    "fmt"
    "io/ioutil"
    "os"
)

var zipFile = "zipfile.gz"

func main() {
    writeZip()
    readZip()
}

func writeZip() {
    handle, err := openFile(zipFile)
    if err != nil {
        fmt.Println("[ERROR] Opening file:", err)
        return
    }
    defer closeFile(handle)

    zipWriter, err := gzip.NewWriterLevel(handle, gzip.BestCompression)
    if err != nil {
        fmt.Println("[ERROR] New gzip writer:", err)
        return
    }
    numberOfBytesWritten, err := zipWriter.Write([]byte("Hello, World!\n"))
    if err != nil {
        fmt.Println("[ERROR] Writing:", err)
        return
    }
    // Close flushes the remaining data and writes the gzip footer.
    err = zipWriter.Close()
    if err != nil {
        fmt.Println("[ERROR] Closing zip writer:", err)
        return
    }
    fmt.Println("[INFO] Number of bytes written:", numberOfBytesWritten)
}

func readZip() {
    handle, err := openFile(zipFile)
    if err != nil {
        fmt.Println("[ERROR] Opening file:", err)
        return
    }
    defer closeFile(handle)

    zipReader, err := gzip.NewReader(handle)
    if err != nil {
        fmt.Println("[ERROR] New gzip reader:", err)
        return
    }
    defer zipReader.Close()

    fileContents, err := ioutil.ReadAll(zipReader)
    if err != nil {
        fmt.Println("[ERROR] ReadAll:", err)
        return
    }

    fmt.Printf("[INFO] Uncompressed contents: %s\n", fileContents)

    // ** Another way of reading the file **
    // Caveat: fileInfo.Size() is the compressed size, and a single Read may
    // return fewer bytes than requested, so ReadAll above is the safer choice.
    //
    // fileInfo, _ := handle.Stat()
    // fileContents := make([]byte, fileInfo.Size())
    // bytesRead, err := zipReader.Read(fileContents)
    // if err != nil {
    //     fmt.Println("[ERROR] Reading gzip file:", err)
    // }
    // fmt.Println("[INFO] Number of bytes read from the file:", bytesRead)
}

func openFile(fileToOpen string) (*os.File, error) {
    return os.OpenFile(fileToOpen, openFileOptions, openFilePermissions)
}

func closeFile(handle *os.File) {
    if handle == nil {
        return
    }

    err := handle.Close()
    if err != nil {
        fmt.Println("[ERROR] Closing file:", err)
    }
}

const openFileOptions int = os.O_CREATE | os.O_RDWR
const openFilePermissions os.FileMode = 0660

Having a full example like this should be helpful for future reference.

SunSparc
  • I don't think this is a good example; `openFile` should return the `error`; there is no point doing the `Stat` (also on what might be a nil value!); any errors from the `Write` and the `Close` are ignored; etc, etc. – Dave C Aug 14 '15 at 20:30
  • Yes, the error handling could be firmed up a bit, but it does seem to be a valid answer to the question being asked. @DaveC – Michael Whatcott Aug 14 '15 at 22:07
  • @DaveC, after reviewing the code and taking your suggestions into consideration, I did a bit of rewriting. It is just an example, but hopefully the edit has improved best practices. Thank you for the feedback. – SunSparc Aug 14 '15 at 23:41
1

To compress any Go value passed in as interface{} (it is serialized to JSON first):

func compress(obj interface{}) ([]byte, error) {
    var b bytes.Buffer
    objBytes, err := json.Marshal(obj)
    if err != nil {
        return nil, err
    }
    gz := gzip.NewWriter(&b)
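    // DON'T DEFER gz.Close() here: b.Bytes() below is evaluated before a
    // deferred Close would run, so the result would be truncated.
    if _, err := gz.Write(objBytes); err != nil {
        return nil, err
    }
    // Close explicitly to flush the data and write the gzip footer before reading b.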
    if err := gz.Close(); err != nil {
        return nil, err
    }
    return b.Bytes(), nil
}

To decompress the same,

func decompress(obj []byte) ([]byte, error) {
    r, err := gzip.NewReader(bytes.NewReader(obj))
    if err != nil {
        return nil, err
    }
    defer r.Close()
    res, err := ioutil.ReadAll(r)
    if err != nil {
        return nil, err
    }
    return res, nil
}

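Note: if the Writer is not closed before b.Bytes() is read, the output is truncated and decompression fails with io.EOF or io.ErrUnexpectedEOF (typically from ioutil.ReadAll(r)). I assumed a deferred Close() would take care of this, but it won't: the deferred call runs only after the return value b.Bytes() has been evaluated, so the flushed data and the gzip footer never make it into the returned slice. Don't defer Close on writer objects; call it explicitly and check its error.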
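The difference can be checked with a small sketch. `compressDeferred`, `compressExplicit`, and `tryDecompress` are hypothetical helpers written for this demonstration; `io.Discard` assumes Go 1.16 or later.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"log"
)

// compressDeferred shows the pitfall: b.Bytes() is evaluated before the
// deferred Close runs, so the result lacks the flushed data and the footer.
func compressDeferred(data []byte) []byte {
	var b bytes.Buffer
	gz := gzip.NewWriter(&b)
	defer gz.Close()
	gz.Write(data) // error ignored on purpose to keep the pitfall visible
	return b.Bytes()
}

// compressExplicit closes the writer before the buffer is read.
func compressExplicit(data []byte) ([]byte, error) {
	var b bytes.Buffer
	gz := gzip.NewWriter(&b)
	if _, err := gz.Write(data); err != nil {
		return nil, err
	}
	if err := gz.Close(); err != nil {
		return nil, err
	}
	return b.Bytes(), nil
}

// tryDecompress returns nil only if packed is a complete gzip stream.
func tryDecompress(packed []byte) error {
	r, err := gzip.NewReader(bytes.NewReader(packed))
	if err != nil {
		return err
	}
	defer r.Close()
	_, err = io.Copy(io.Discard, r)
	return err
}

func main() {
	data := []byte("hello, world\n")

	// Expected to report an error, because the stream is incomplete.
	fmt.Println("deferred close:", tryDecompress(compressDeferred(data)))

	good, err := compressExplicit(data)
	if err != nil {
		log.Fatal(err)
	}
	// Expected to report <nil>, because the stream is complete.
	fmt.Println("explicit close:", tryDecompress(good))
}
```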

Riya John
  • Can you explain what you mean by "DON'T DEFER WRITER OBJECTS"? I was receiving an "unexpected end of data" message from my outfiles while using `defer gz.Close()`. Moving that to the end fixed my problem. But I don't know why. – rideron89 Oct 12 '21 at 20:29
  • If you only defer Close on a writer object, you ignore its return value, which may be an error. For more, see https://www.joeshaw.org/dont-defer-close-on-writable-files/ – Riya John Oct 14 '21 at 07:51
  • My problem ended up being a bit of a novice's error, which I had answered [here](https://stackoverflow.com/questions/69556217/defer-gzip-close-doesnt-write-the-footer). But I'm glad you shared that article, thanks! – rideron89 Oct 14 '21 at 14:08