4

I have an io.Reader in Golang and I want to double-check that the size of its data is below a predetermined maximum before or while running io.Copy() to save it to disk using io.Writer.

Since the file data in io.Reader could theoretically be quite large, I want to minimize memory usage and processing here if avoidable.
I don't think there's a function that's like io.CopyLessThanOrEqualToThisManyBytesOrReturnError(), but I did notice that io.ReadFull() can do the opposite to return an error if not enough bytes are there to fill the provided buffer.
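For reference, here is a minimal sketch of the io.ReadFull behavior mentioned above. The helper name readFullStatus and the sample inputs are made up for illustration; the error values are the ones documented for io.ReadFull.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readFullStatus is a hypothetical helper: it tries to fill a buffer of
// bufSize bytes from input and reports how many bytes were read, plus a
// short label for the error io.ReadFull returned.
func readFullStatus(input string, bufSize int) (int, string) {
	buf := make([]byte, bufSize)
	n, err := io.ReadFull(strings.NewReader(input), buf)
	switch err {
	case nil:
		return n, "nil"
	case io.EOF:
		return n, "EOF"
	case io.ErrUnexpectedEOF:
		return n, "unexpected EOF"
	default:
		return n, err.Error()
	}
}

func main() {
	// Buffer filled completely: no error, even though more input remains.
	fmt.Println(readFullStatus("123456789", 4))
	// Input ends before the buffer is full: io.ErrUnexpectedEOF.
	fmt.Println(readFullStatus("12", 4))
	// No bytes at all: io.EOF.
	fmt.Println(readFullStatus("", 4))
}
```

So a buffer one byte larger than the limit can tell "fits" (short read) apart from "too big" (buffer filled), at the cost of holding the data in memory.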

Does anyone have a solution to this?


EDIT:
To clarify, copying a fraction of the data is not OK. It either needs to fail if it's over the threshold, or work if it's under.

Ben Guild

2 Answers

3

Since the io.Reader interface knows nothing about the size or length of the underlying data, the only way to find out is to read it:

Use a buffer of size nMax+1, where nMax is the predetermined maximum. Inside your CopyLessThanOrEqualToThisManyBytesOrReturnError function, read the input into this buffer with io.ReadFull and check how many bytes arrived: if the count is less than or equal to nMax, write them out with w.Write; otherwise, return an error:

const nMax = 5 // your predetermined maximum

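func CopyLessThanOrEqualToThisManyBytesOrReturnError(r io.Reader, w io.Writer) error {
    // One extra byte lets us detect input that exceeds nMax.
    buf := make([]byte, nMax+1)
    nRead, err := io.ReadFull(r, buf)
    if nRead > nMax {
        return fmt.Errorf("there is more data")
    }
    // io.ErrUnexpectedEOF means the input ended before the buffer filled, so it fits.
    if err == io.ErrUnexpectedEOF {
        _, err = w.Write(buf[:nRead])
    }
    return err // io.EOF for empty input, or a real read/write error
}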

Working sample code, using a string:

package main

import (
    "fmt"
    "io"
    "os"
    "strings"
)

const nMax = 5 // your predetermined maximum

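func CopyLessThanOrEqualToThisManyBytesOrReturnError(r io.Reader, w io.Writer) error {
    // One extra byte lets us detect input that exceeds nMax.
    buf := make([]byte, nMax+1)
    nRead, err := io.ReadFull(r, buf)
    if nRead > nMax {
        return fmt.Errorf("there is more data")
    }
    // io.ErrUnexpectedEOF means the input ended before the buffer filled, so it fits.
    if err == io.ErrUnexpectedEOF {
        _, err = w.Write(buf[:nRead])
    }
    return err // io.EOF for empty input, or a real read/write error
}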

func main() {
    r := strings.NewReader("123456789")
    err := CopyLessThanOrEqualToThisManyBytesOrReturnError(r, os.Stdout)
    if err != nil {
        fmt.Println(err) // there is more data
    }

    r = strings.NewReader("123\n")
    err = CopyLessThanOrEqualToThisManyBytesOrReturnError(r, os.Stdout) // 123
    if err != nil {
        fmt.Println(err)
    }

    r = strings.NewReader("")
    err = CopyLessThanOrEqualToThisManyBytesOrReturnError(r, os.Stdout)
    if err != nil {
        fmt.Println(err) // EOF
    }
}

output:

there is more data
123
EOF

Working sample code, using files:

package main

import (
    "fmt"
    "io"
    "os"
)

const nMax = 5 // your predetermined maximum

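func CopyLessThanOrEqualToThisManyBytesOrReturnError(r io.Reader, w io.Writer) error {
    // One extra byte lets us detect input that exceeds nMax.
    buf := make([]byte, nMax+1)
    nRead, err := io.ReadFull(r, buf)
    if nRead > nMax {
        return fmt.Errorf("there is more data")
    }
    // io.ErrUnexpectedEOF means the input ended before the buffer filled, so it fits.
    if err == io.ErrUnexpectedEOF {
        _, err = w.Write(buf[:nRead])
    }
    return err // io.EOF for empty input, or a real read/write error
}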

func main() {
    r, err := os.Open("input.bin")
    if err != nil {
        panic(err)
    }
    defer r.Close()

    w, err := os.Create("output.bin")
    if err != nil {
        panic(err)
    }
    defer w.Close()

    err = CopyLessThanOrEqualToThisManyBytesOrReturnError(r, w)
    if err != nil {
        fmt.Println(err)
    }
    fmt.Println("Done.")
}

Working sample code, using []byte:

package main

import (
    "bytes"
    "fmt"
    "io"
)

const nMax = 5 // your predetermined maximum

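func CopyLessThanOrEqualToThisManyBytesOrReturnError(r io.Reader, w io.Writer) error {
    // One extra byte lets us detect input that exceeds nMax.
    buf := make([]byte, nMax+1)
    nRead, err := io.ReadFull(r, buf)
    if nRead > nMax {
        return fmt.Errorf("there is more data")
    }
    // io.ErrUnexpectedEOF means the input ended before the buffer filled, so it fits.
    if err == io.ErrUnexpectedEOF {
        _, err = w.Write(buf[:nRead])
    }
    return err // io.EOF for empty input, or a real read/write error
}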

func main() {
    bs := []byte{1, 2, 3, 4, 5}
    r := bytes.NewReader(bs)

    w := &bytes.Buffer{}

    err := CopyLessThanOrEqualToThisManyBytesOrReturnError(r, w)
    if err != nil {
        fmt.Println(err)
    }

    fmt.Println("Done.")
    fmt.Println(w.Bytes())
}

output:

Done.
[1 2 3 4 5]
  • So to use this with an integer byte length, you'd do `r := strings.NewReader(make([] byte, 10240));` ...? – Ben Guild Aug 20 '16 at 06:17
  • `m, e := r.Read(buf)` is not guaranteed to fill the buffer. There could be more data available that still fits within the maximum limit. Using `io.ReadFull` would be more suitable to fill the buffer as much as possible. – 1lann Aug 20 '16 at 06:26
  • @Amd Honestly, it'd be nice if you named your variables. :) ... The code's hard to understand with just letters `m` and `n` for example. I had to trace them all back to see what they do. – Ben Guild Aug 20 '16 at 07:44
  • Your examples are great, but considering it's basically the same code as @1lann in the end, I have to choose his solution given the named variables. Thanks for your help/efforts though! – Ben Guild Aug 20 '16 at 07:54
  • Actually, yeah you're right. Now that I look at it the other does two separate operations. I'll choose this solution. – Ben Guild Aug 20 '16 at 07:57
2

You can use io.CopyN (https://golang.org/pkg/io/#CopyN), which returns a nil error if it successfully copies exactly N bytes, or io.EOF (or possibly another error) if fewer than N bytes are available.

For example:

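func copyMax(dst io.Writer, src io.Reader, n int64) error {
    _, err := io.CopyN(dst, src, n)
    if err == io.EOF {
        // Don't care if there's less available
        return nil
    }
    if err != nil {
        // A real read or write error
        return err
    }

    // Exactly n bytes were copied; check whether anything is left over
    nextByte := make([]byte, 1)
    nRead, _ := io.ReadFull(src, nextByte)
    if nRead > 0 {
        // There's too much data (the first n bytes are already in dst)
        return errors.New("too much data")
    }

    return nil
}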
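If the destination is a file and partial output is not acceptable, one possible variation is to stream through io.LimitReader into a temporary file and rename it into place only once the size check passes, so memory use stays at io.Copy's internal buffer. This is a sketch under assumptions rather than a definitive implementation: the names saveAtMost, errTooLarge and demo are hypothetical, and os.CreateTemp, os.MkdirTemp and os.ReadFile require Go 1.16 or later.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// errTooLarge is a hypothetical sentinel error for oversized input.
var errTooLarge = errors.New("input exceeds maximum size")

// saveAtMost streams src into path, but only if src holds at most limit bytes.
// Data goes to a temporary file first, so path never sees partial content.
func saveAtMost(path string, src io.Reader, limit int64) (err error) {
	tmp, err := os.CreateTemp(filepath.Dir(path), "upload-*")
	if err != nil {
		return err
	}
	// Remove the temporary file on any failure.
	defer func() {
		if err != nil {
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()

	// Allow one byte past the limit so oversized input is detectable.
	n, err := io.Copy(tmp, io.LimitReader(src, limit+1))
	if err != nil {
		return err
	}
	if n > limit {
		return errTooLarge
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// demo is a hypothetical helper: it saves input under a fresh temporary
// directory and returns what ended up in the destination file.
func demo(input string, limit int64) (string, error) {
	dir, err := os.MkdirTemp("", "saveatmost")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "output.bin")
	if err := saveAtMost(path, strings.NewReader(input), limit); err != nil {
		return "", err
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	fmt.Println(demo("123", 5))       // fits: the file holds "123"
	fmt.Println(demo("123456789", 5)) // too large: errTooLarge, destination never created
}
```

The trade-off is that up to limit+1 bytes are written to disk before an oversized input is rejected, in exchange for not holding them in memory.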
1lann
  • Uhh, that does the opposite I think. The goal here is to read **up to** a predetermined maximum. It's OK if it's less, but not OK if it's over. It should abort and return an error in that case. – Ben Guild Aug 20 '16 at 05:09
  • Yes, it only reads up to N bytes. The function will return when it copies exactly N bytes. If there are less than N bytes available, it will copy as much as it can and return an error. – 1lann Aug 20 '16 at 05:11
  • OK, I fixed the question title because clearly people aren't reading it. Do you see the fact that I'm asking how to avoid reading **over** a certain number of bytes, not under? – Ben Guild Aug 20 '16 at 05:14
  • Thanks for your solution. It's a bummer to have to load the entire desired length + 1 into memory just for this, but at least it's temporary. – Ben Guild Aug 20 '16 at 07:55