Mistakes in Go: JSON (un)marshalling

Published on — Filed under golang, protip

Disclaimer: The do's and dont's in this article strike me as obvious advice but, from time to time, one does run into these...

Suppose you want to decode two JSON documents and do something with them. Here's how it'd look (without error checking for the sake of brevity but you should always check your errors):

var alice Person
var bob Person

json.Unmarshal(aliceBytes, &alice)
doSomethingWithPerson(alice)

json.Unmarshal(bobBytes, &bob)
doSomethingWithPerson(bob)

Clean. One struct address per unmarshal, as the universal law mandates.

A not so unreasonable behavior

Now suppose, for whatever reason — cosmic rays, Mercury retrograde, the prohibitive cost of memory these days — you decided to write this instead:

var p Person

json.Unmarshal(aliceBytes, &p)
doSomethingWithPerson(p)

json.Unmarshal(bobBytes, &p)
doSomethingWithPerson(p)

Still seems correct, right? The answer lies somewhere between why the hell would you do that to maybe it'll do what you want.

Any information read into p will be overwritten on the second call to json.Unmarshal. The problem is the information not read into p by the second call.

If Person is this struct:

type Person struct {
  Name string `json:"name"`
  Age  int    `json:"age"`
}

... then, decoding from two JSON documents that look like this:

{"name": "alice", "age": 30}
{"name": "bob"}

... while reusing the same struct address &p, the age in the first document will "leak" to the second — age is not present in bob's JSON, so it won't be written to the age property.

A more familiar face

The famous myth that declaring variables outside of the loop is more performant than per-iteration allocation can cause someone to write this:

var p Person
people := []Person{}
for _, r := range jsonArray {
  if err := json.Unmarshal(r, &p); err == nil {
    people = append(people, p)
  }
}

Golang playground sample

Using the two documents above, Bob is now also 30 years old:

// fmt.Println(people)
[Person{Name: "alice", Age: 30} Person{Name: "bob", Age: 30}]

The lesson is simple: narrow down the scopes of your variables as much as possible and don't try to outsmart the compiler.

But it gets worse...

The insidious nature of pointers

Suppose now we have three JSON records, exactly like this, in this order:

{"name": "alice", "age": 30}
{"age": 10, "name": 3} /* Can't be converted to Person */
{"name": "bob"}

Let's go back to that reading loop:

jsonArray := [][]byte{
  []byte(aliceJSON),
  []byte(brokenJSON),
  []byte(bobJSON),
}
// ...
for _, r := range jsonArray {
  if err := json.Unmarshal(r, &p); err == nil {
    people = append(people, p)
  }
}
fmt.Println(people)

Golang playground sample

Care to take a wild guess at the output? How old is Bob going to be?

[Person{Name: "alice", Age: 30} Person{Name: "bob", Age: 10}]

That's right. Even though the second record is broken JSON and produces an error during the unmarshal, even though we handle the error and skip it, we still get the side effects of its partial unmarshal "leaking" onto the next run of that loop.

Note

The quotes around leak are intentional; the code is doing exactly what it should, this is a case of PBKAC.

Long-story-short