
I have a massive JSON array stored in a file ("file.json"). I need to iterate through the array and perform some operation on each element.

err = json.Unmarshal(dat, &all_data)

This causes an out-of-memory error. I'm guessing that is because it loads everything into memory first.

Is there a way to stream the JSON element by element?

blackgreen
K2xL
  • The std lib does not provide anything like this yet, but it's coming soon: see https://go-review.googlesource.com/#/c/9073/. You could take a look at the implementation to get an idea of how to parse your special JSON yourself – Adam Vincze Aug 03 '15 at 18:39
  • @AdamVincze: coming soon, as in go1.5, which is due any time now ;) (for the current development docs, you can always use "tip.golang.org": http://tip.golang.org/pkg/encoding/json/) – JimB Aug 03 '15 at 21:32
  • That was quick - https://github.com/golang/go/issues/12001 – Sridhar Aug 04 '15 at 09:15

2 Answers


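There is an example of this sort of thing in the encoding/json documentation. It reads the opening bracket with Token, decodes one array element at a time while More reports that elements remain, and then reads the closing bracket: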

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "strings"
)

func main() {
    const jsonStream = `
                [
                    {"Name": "Ed", "Text": "Knock knock."},
                    {"Name": "Sam", "Text": "Who's there?"},
                    {"Name": "Ed", "Text": "Go fmt."},
                    {"Name": "Sam", "Text": "Go fmt who?"},
                    {"Name": "Ed", "Text": "Go fmt yourself!"}
                ]
            `
    type Message struct {
        Name, Text string
    }
    dec := json.NewDecoder(strings.NewReader(jsonStream))

    // read open bracket
    t, err := dec.Token()
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%T: %v\n", t, t)

    // while the array contains values
    for dec.More() {
        var m Message
        // decode an array value (Message)
        err := dec.Decode(&m)
        if err != nil {
            log.Fatal(err)
        }

        fmt.Printf("%v: %v\n", m.Name, m.Text)
    }

    // read closing bracket
    t, err = dec.Token()
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%T: %v\n", t, t)

}
blackgreen
ijt

So, as commenters suggested, you could use the streaming API of "encoding/json" for reading one string at a time:

r := ... // get some io.Reader (e.g. open the big array file)
d := json.NewDecoder(r)
// read "["
d.Token()
// read strings one by one
for d.More() {
    s, _ := d.Token()
    // do something with s which is the newly read string
    fmt.Printf("read %q\n", s)
}
// (optionally) read "]"
d.Token()

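Note that for simplicity I've left out error handling, which still needs to be implemented. Also note that Token returns a json.Token value, so s needs a type assertion before it can be used as a string.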

Pablo Lalloni