Wednesday, February 21, 2018

JSON Serialization With Golang

Overview

JSON is one of the most popular serialization formats. It is human readable, reasonably concise, and can be parsed easily by any web application using JavaScript. Go as a modern programming language has first-class support for JSON serialization in its standard library. 

But there are some nooks and crannies. In this tutorial you'll learn how to effectively serialize and deserialize arbitrary as well as structured data to/from JSON. You will also learn how to deal with advanced scenarios such as serializing enums.

The json Package

Go supports several serialization formats in the encoding package of its standard library. One of these is the popular JSON format. You serialize Golang values using the Marshal() function into a slice of bytes. You deserialize a slice of bytes into a Golang value using the Unmarshal() function. It's that simple. The following terms are equivalent in the context of this article:

  • Serialization/Encoding/Marshalling
  • Deserialization/Decoding/Unmarshalling

I prefer serialization because it reflects the fact that you convert a potentially hierarchical data structure to/from a stream of bytes.

Marshal

The Marshal() function accepts a value of any type, which in Go means the empty interface, and returns a slice of bytes and an error. Here is the signature:

func Marshal(v interface{}) ([]byte, error)

If Marshal() fails to serialize the input value, it will return a non-nil error. Marshal() has some strict limitations (we'll see later how to overcome them with custom marshallers):

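  • Map keys must be strings, integer types, or types that implement encoding.TextMarshaler (integer keys have been accepted since Go 1.7 and are written as JSON strings).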
  • Map values must be types serializable by the json package.
  • The following types are not supported: Channel, complex, and function.
  • Cyclic data structures are not supported.
  • Pointers will be encoded (and later decoded) as the values they point to (or 'null' if the pointer is nil).
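A few of these rules are easy to check directly. The following is a minimal sketch; the encode() helper is a hypothetical name introduced only for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encode is a hypothetical helper: it marshals v and returns the JSON text,
// or the error text prefixed with "error: " if v cannot be encoded.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	n := 42
	var missing *int

	fmt.Println(encode(&n))                             // 42 (the pointed-to value)
	fmt.Println(encode(missing))                        // null (nil pointer)
	fmt.Println(encode(map[string]int{"b": 2, "a": 1})) // {"a":1,"b":2} (keys are sorted)
	fmt.Println(encode(make(chan int)))                 // an error: channels are unsupported
	fmt.Println(encode(complex(1, 2)))                  // an error: complex numbers are unsupported
}
```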

Unmarshal

The Unmarshal() function takes a byte slice that hopefully represents valid JSON and a destination, which is typically a pointer to a struct or basic type. It deserializes the JSON into the destination in a generic way. If deserialization fails, it returns an error. Here is the signature:

func Unmarshal(data []byte, v interface{}) error
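As a rough illustration of that signature in use, the sketch below decodes into a pointer to a struct and shows that malformed input produces an error. The point type and the decodePoint() helper are hypothetical names made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// point is a hypothetical destination type used only for this illustration.
type point struct {
	X int
	Y int
}

// decodePoint parses data into a point. Passing &p gives Unmarshal
// something it can write into.
func decodePoint(data []byte) (point, error) {
	var p point
	err := json.Unmarshal(data, &p)
	return p, err
}

func main() {
	p, err := decodePoint([]byte(`{"X": 1, "Y": 2}`))
	fmt.Println(p, err) // {1 2} <nil>

	_, err = decodePoint([]byte(`{"X": 1, "Y"`)) // truncated input
	fmt.Println(err != nil)                      // true
}
```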

Serializing Simple Types

You can easily serialize simple types like int using the json package. The result will not be a full-fledged JSON object, but a bare JSON value. For example, the int 5 is serialized to the byte slice [53], which corresponds to the string "5".
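A minimal sketch of this: mustEncode() is a hypothetical helper that panics on failure only to keep the example short.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mustEncode is a hypothetical helper that marshals v and panics on error.
func mustEncode(v interface{}) []byte {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return b
}

func main() {
	b := mustEncode(5)
	fmt.Println(b)         // [53]
	fmt.Println(string(b)) // 5

	// A Go string becomes a quoted JSON string.
	fmt.Println(string(mustEncode("five"))) // "five"
}
```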

If you try to serialize unsupported types like a function, you'll get an error:
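For instance, the sketch below passes a function to json.Marshal() and inspects the error. The isUnsupported() helper is a hypothetical name; json.UnsupportedTypeError is the error type the package returns for such values.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// isUnsupported reports whether marshalling v fails because its type
// cannot be represented in JSON.
func isUnsupported(v interface{}) bool {
	_, err := json.Marshal(v)
	var unsupported *json.UnsupportedTypeError
	return errors.As(err, &unsupported)
}

func main() {
	_, err := json.Marshal(func() {})
	fmt.Println(err) // json: unsupported type: func()

	fmt.Println(isUnsupported(func() {})) // true
	fmt.Println(isUnsupported(5))         // false
}
```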

Serializing Arbitrary Data With Maps

The power of JSON is that it can represent arbitrary hierarchical data very well. The json package supports this by using the generic empty interface (interface{}) to represent any JSON hierarchy. Here is an example of deserializing and later serializing a binary tree where each node has an int value and two branches, left and right, which may contain another node or be null.
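The round trip could look roughly like this. The treeJSON document and the roundTrip() helper are assumptions made for this sketch, not the original listing.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// treeJSON is a hypothetical sample: a root node with two leaf children.
const treeJSON = `{"value":3,"left":{"value":1,"left":null,"right":null},"right":{"value":7,"left":null,"right":null}}`

// roundTrip decodes data into a generic value and encodes it again.
func roundTrip(data []byte) (interface{}, []byte, error) {
	var tree interface{}
	if err := json.Unmarshal(data, &tree); err != nil {
		return nil, nil, err
	}
	out, err := json.Marshal(tree)
	return tree, out, err
}

func main() {
	tree, out, err := roundTrip([]byte(treeJSON))
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Printf("%T\n", tree) // map[string]interface {}

	// Map keys are written in sorted order, so the key order differs
	// from the input even though the content is the same.
	fmt.Println(string(out))
	// {"left":{"left":null,"right":null,"value":1},"right":{"left":null,"right":null,"value":7},"value":3}
}
```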

The JSON null is equivalent to the Go nil. The json.Unmarshal() function converts the JSON blob to a Go data structure consisting of nested map[string]interface{} values. Note that JSON numbers decoded into an interface{} arrive as float64, not int. The json.Marshal() function then serializes the resulting nested object back to an equivalent JSON representation, with the keys of each map written in sorted order.

To traverse the generic maps of interfaces, you'll need to use type assertions. For example:

Serializing Structured Data

Working with structured data is often the better choice. Go provides excellent support for serializing structs to/from JSON, including struct tags for fine-tuning the mapping. Let's create a struct that corresponds to our JSON tree and a smarter Dump() function that prints it:

This is great and much cleaner than the arbitrary JSON approach. But does it work? Not really. There is no error, but our tree object is not getting populated by the JSON.

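The problem is that the Tree fields are unexported (lowercase). The json package only reads and writes exported fields, so the fix is to capitalize the field names. When decoding, keys are matched to field names case-insensitively, so the lowercase keys "value", "left", and "right" still land in Value, Left, and Right. When encoding, however, the keys are written using the capitalized field names unless a struct tag says otherwise.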

The json package will silently ignore unmapped fields in the JSON as well as unexported fields in your struct. But sometimes you may want to map specific keys in the JSON to a field with a different name in your struct. You can use struct tags for that. For example, suppose we add another field called "label" to the JSON, but we need to map it to a field called "Tag" in our struct.

Here is the new JSON with the root node of the tree labeled as "root", deserialized properly into the Tag field and printed in the output:
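A minimal sketch of the tagged struct follows. The sample document and the parseTree() helper are hypothetical; the `json:"label"` tag is the part that matters.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// labeledJSON is a hypothetical sample whose root node carries a "label" key.
const labeledJSON = `{"label":"root","value":3,"left":{"value":1},"right":{"value":7}}`

// Tree maps the JSON key "label" to the Tag field via a struct tag.
type Tree struct {
	Value int
	Tag   string `json:"label"`
	Left  *Tree
	Right *Tree
}

// parseTree decodes data into a Tree.
func parseTree(data []byte) (*Tree, error) {
	var t Tree
	if err := json.Unmarshal(data, &t); err != nil {
		return nil, err
	}
	return &t, nil
}

func main() {
	tree, err := parseTree([]byte(labeledJSON))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(tree.Tag, tree.Value, tree.Left.Value) // root 3 1
	fmt.Println(tree.Left.Tag == "")                   // true: the children have no label
}
```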

Writing a Custom Marshaller

You will often want to serialize objects that don't conform to the strict requirements of the Marshal() function, or control exactly how a type is encoded. For example, you may want to decide for yourself how a map with int keys is written. In these cases, you can write a custom marshaller/unmarshaller by implementing the Marshaler and Unmarshaler interfaces.

A note about spelling: In Go, the convention is to name an interface with a single method by appending the "er" suffix to the method name. So, even though the more common spelling is "Marshaller" (with double L), the interface name is just "Marshaler" (single L).

Here are the Marshaler and Unmarshaler interfaces:

You must create a type when doing custom serialization, even if you want to serialize a built-in type or composition of built-in types like map[int]string. Here I define a type called IntStringMap and implement the Marshaler and Unmarshaler interfaces for this type.

The MarshalJSON() method creates a map[string]string, converts each of its own int keys to a string, and serializes the map with string keys using the standard json.Marshal() function.

The UnmarshalJSON() method does the exact opposite. It deserializes the data byte slice into a map[string]string, converts each string key to an int, and populates itself.

Here is how to use it in a program:

Serializing Enums

Go enums can be pretty vexing to serialize. The idea to write an article about Go json serialization came out of a question a colleague asked me about how to serialize enums. Go has no dedicated enum keyword; an enum is usually a named integer type with a set of constants. Here is a Go enum. The constants Zero and One are equal to the ints 0 and 1.

While you may think it's an int, and in many respects it is, serializing it directly writes the bare number rather than the constant's name. To get readable names, you must write a custom marshaler/unmarshaler. That's not a problem after the last section. The following MarshalJSON() and UnmarshalJSON() will serialize/deserialize the constants Zero and One to/from the corresponding strings "Zero" and "One".

Let's try to embed this EnumType in a struct and serialize it. The main function creates an EnumContainer and initializes it with a name of "Uno" and a value of our enum constant One, which is equal to the int 1.
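The sketch below is one way to arrive at a surprising result with such a container. It assumes that MarshalJSON() is declared on a pointer receiver, which is an assumption about the setup rather than something the text states; roundTrip() is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type EnumType int

const (
	Zero EnumType = iota
	One
)

// MarshalJSON is deliberately declared on a pointer receiver here
// (an assumption of this sketch) to show the pitfall.
func (e *EnumType) MarshalJSON() ([]byte, error) {
	switch *e {
	case Zero:
		return json.Marshal("Zero")
	case One:
		return json.Marshal("One")
	}
	return nil, fmt.Errorf("unknown EnumType value %d", int(*e))
}

// UnmarshalJSON expects the constant's name as a JSON string.
func (e *EnumType) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return err
	}
	switch s {
	case "Zero":
		*e = Zero
	case "One":
		*e = One
	default:
		return fmt.Errorf("unknown EnumType name %q", s)
	}
	return nil
}

// EnumContainer holds the enum by value.
type EnumContainer struct {
	Name  string
	Value EnumType
}

// roundTrip marshals ec and unmarshals the result into a fresh container.
func roundTrip(ec EnumContainer) (EnumContainer, []byte, error) {
	data, err := json.Marshal(ec)
	if err != nil {
		return EnumContainer{}, nil, err
	}
	var out EnumContainer
	err = json.Unmarshal(data, &out)
	return out, data, err
}

func main() {
	ec2, data, err := roundTrip(EnumContainer{Name: "Uno", Value: One})

	// The field is not addressable, so the pointer-receiver MarshalJSON
	// is skipped and the enum is written as a number.
	fmt.Println(string(data)) // {"Name":"Uno","Value":1}

	// UnmarshalJSON then rejects the number, leaving Value at zero.
	fmt.Println(err != nil)               // true
	fmt.Println(ec2.Name, ":", ec2.Value) // Uno : 0
}
```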

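The expected output is "Uno : 1", but instead it's "Uno : 0". What happened? This is what you get when MarshalJSON() is declared on a pointer receiver (*EnumType). When the container is passed to json.Marshal() by value, its enum field is not addressable, so the custom marshaler is skipped and the enum is written as the plain number 1. The custom UnmarshalJSON() then fails to parse that number as a name, and if the error is ignored the field keeps its zero value. One fix is to store a pointer to the enum in the struct; declaring MarshalJSON() on a value receiver also works. Here is a modified version where that works as expected: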
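A sketch of the pointer-based variant is shown below. It keeps the pointer-receiver MarshalJSON() (still an assumption about the setup) and changes only the container's field to *EnumType.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type EnumType int

const (
	Zero EnumType = iota
	One
)

// MarshalJSON is declared on a pointer receiver (an assumption of this sketch).
func (e *EnumType) MarshalJSON() ([]byte, error) {
	switch *e {
	case Zero:
		return json.Marshal("Zero")
	case One:
		return json.Marshal("One")
	}
	return nil, fmt.Errorf("unknown EnumType value %d", int(*e))
}

// UnmarshalJSON expects the constant's name as a JSON string.
func (e *EnumType) UnmarshalJSON(data []byte) error {
	var s string
	if err := json.Unmarshal(data, &s); err != nil {
		return err
	}
	switch s {
	case "Zero":
		*e = Zero
	case "One":
		*e = One
	default:
		return fmt.Errorf("unknown EnumType name %q", s)
	}
	return nil
}

// EnumContainer now holds a pointer to the enum, so the
// pointer-receiver MarshalJSON is always found.
type EnumContainer struct {
	Name  string
	Value *EnumType
}

// roundTrip marshals ec and unmarshals the result into a fresh container.
func roundTrip(ec EnumContainer) (EnumContainer, []byte, error) {
	data, err := json.Marshal(ec)
	if err != nil {
		return EnumContainer{}, nil, err
	}
	var out EnumContainer
	err = json.Unmarshal(data, &out)
	return out, data, err
}

func main() {
	x := One
	ec2, data, err := roundTrip(EnumContainer{Name: "Uno", Value: &x})
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(string(data))              // {"Name":"Uno","Value":"One"}
	fmt.Println(ec2.Name, ":", *ec2.Value) // Uno : 1
}
```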

Conclusion

Go provides many options for serializing and deserializing JSON. It's important to understand the ins and outs of the encoding/json package to take advantage of its power.

This tutorial put all the power in your hands, including how to serialize the elusive Go enums.

Go serialize some objects!

