Friday, July 7, 2017

Python's Pickles

Pickles in Python are tasty in the sense that they represent a Python object as a string of bytes. Many things can actually be done with those bytes. For instance, you can store them in a file or database, or transfer them over a network. 

The pickled representation of a Python object is called a pickle, and a file that holds it is a pickle file. A pickle file can serve different purposes, such as storing results for another Python program to use or keeping backups. To get the original Python object back, you simply unpickle that string of bytes.

To pickle in Python, we will be using the pickle module. As stated in the documentation:

The pickle module implements binary protocols for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy. Pickling (and unpickling) is alternatively known as “serialization”, “marshalling,” or “flattening”; however, to avoid confusion, the terms used here are “pickling” and “unpickling”.

The pickle module allows us to store almost any Python object directly to a file or a string of bytes without performing any conversions ourselves. What the pickle module performs is called object serialization, that is, converting objects to and from strings of bytes. The object to be pickled is serialized into a stream of bytes that can be written to a file, for instance, and restored at a later point.

Installing pickle

The pickle module is part of Python's standard library, so it comes bundled with your Python installation. In order to get a list of the available modules, you can type the following command in the Python prompt: help('modules').

So all you need to do to work with the pickle module is to import pickle!

Creating a Pickle File

From this section onwards, we'll take a look at some examples of pickling to understand the concept better. Let's start by creating a pickle file from an object. Our object here will be the todo list we made in the Python lists tutorial.

In order to pickle our list object (todo), we can do the following:
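A minimal sketch of what such a script might look like is shown here. The items in todo are hypothetical placeholders rather than the list from the earlier tutorial, and the sketch assumes Python 3, where pickle files are opened in binary mode.

```python
import pickle

# Hypothetical todo items, used only for illustration.
todo = ['write blog post', 'reply to email', 'read in a book']

# A pickle is a stream of bytes, so the file is opened in binary write mode.
with open('todo.pickle', 'wb') as pickle_file:
    # First argument: the object to pickle. Second: the open file object.
    pickle.dump(todo, pickle_file)
```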

Notice that we import pickle in order to use the pickle module. We also open a file to store the pickled object in, namely todo.pickle. The dump function writes a pickled representation of todo to the open file object pickle_file. In other words, the dump function here takes two arguments: the object to pickle, which is the todo list, and the file object to write the pickle to, which is pickle_file, the open handle on todo.pickle.

Unpickling (Restoring) the Pickled Data

Say that we would like to unpickle (restore) the pickled data; in our case, this is the todo list. To do that, we can write the following script:
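A minimal sketch of such a script, again assuming Python 3. So that it runs on its own, it first recreates todo.pickle with hypothetical items; in practice the file would already exist from the pickling step.

```python
import pickle

# Recreate todo.pickle so that this sketch is self-contained
# (the items are hypothetical placeholders).
with open('todo.pickle', 'wb') as pickle_file:
    pickle.dump(['write blog post', 'reply to email', 'read in a book'], pickle_file)

# Open the file in binary read mode and rebuild the original list.
with open('todo.pickle', 'rb') as pickle_file:
    todo = pickle.load(pickle_file)

print(todo)
```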

The above script will output the items of the todo list.

As mentioned in the documentation, the load(file) function does the following:

Read a string from the open file object file and interpret it as a pickle data stream, reconstructing and returning the original object hierarchy. This is equivalent to Unpickler(file).load().

Pickles as Strings

In the above section, we saw how we can write pickles to a file and load them from it. Files are not required, however: we can also keep the pickle in memory as a string of bytes (a bytes object in Python 3). We can thus do the following:
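A minimal sketch, assuming Python 3 and a hypothetical todo list. The variable name pickled_data is an illustrative choice.

```python
import pickle

# Hypothetical todo items, used only for illustration.
todo = ['write blog post', 'reply to email', 'read in a book']

# dumps returns the pickle itself instead of writing it to a file.
pickled_data = pickle.dumps(todo)

print(type(pickled_data))  # <class 'bytes'> on Python 3
```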

Notice that we have used the dumps (with an "s" at the end) function, which, according to the documentation:

Returns the pickled representation of the object as a string, instead of writing it to a file.

In order to restore the pickled data above, we can use the loads(string) function, as follows:
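A minimal sketch of the round trip, assuming Python 3. The names pickled_data and restored_todo are illustrative, and the list items are hypothetical.

```python
import pickle

# Hypothetical todo items, used only for illustration.
todo = ['write blog post', 'reply to email', 'read in a book']
pickled_data = pickle.dumps(todo)

# loads rebuilds a new object from the pickled bytes.
restored_todo = pickle.loads(pickled_data)

print(restored_todo == todo)  # True: same contents
print(restored_todo is todo)  # False: a separate list object
```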

According to the documentation, the loads function:

Reads a pickled object hierarchy from a string. Characters in the string past the pickled object’s representation are ignored.

Pickling More Than One Object

In the above examples, we have dealt with pickling and restoring (loading) only one object at a time. In this section, I'm going to show you how we can do that for more than one object. Say that we have the following objects:
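As an illustration, four objects of different types could be defined like this. All names and values here are hypothetical stand-ins.

```python
# Four illustrative objects; every value is a hypothetical placeholder.
name = 'Alice'                                                          # string
website = 'https://example.com'                                         # string
english_french = {'paper': 'papier', 'pen': 'stylo', 'car': 'voiture'}  # dictionary
tup = (31, 'alice', 4.0)                                                # tuple
```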

If you would like to learn more about Python dictionaries and tuples, check the tutorials dedicated to them.

We can simply pickle the above objects by running a series of dump functions, as follows:
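A minimal sketch, assuming Python 3 and four hypothetical objects. Each dump call appends one pickle to the same open file.

```python
import pickle

# Hypothetical objects, used only for illustration.
name = 'Alice'
website = 'https://example.com'
english_french = {'paper': 'papier', 'pen': 'stylo', 'car': 'voiture'}
tup = (31, 'alice', 4.0)

with open('pickled_file.pickle', 'wb') as pickled_file:
    # The pickles are written one after another, in this order.
    pickle.dump(name, pickled_file)
    pickle.dump(website, pickled_file)
    pickle.dump(english_french, pickled_file)
    pickle.dump(tup, pickled_file)
```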

This will pickle all four objects into the pickle file pickled_file.pickle.

There is another wonderful way to write the above script using the Pickler class in the pickle module, as follows:
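A minimal sketch of the Pickler variant, assuming Python 3 and the same hypothetical objects. The Pickler is bound to the file once, and its dump method then takes only the object to pickle.

```python
import pickle

# Hypothetical objects, used only for illustration.
name = 'Alice'
website = 'https://example.com'
english_french = {'paper': 'papier', 'pen': 'stylo', 'car': 'voiture'}
tup = (31, 'alice', 4.0)

with open('pickled_file.pickle', 'wb') as pickled_file:
    # Bind a Pickler to the open file, then dump each object through it.
    pickler = pickle.Pickler(pickled_file)
    pickler.dump(name)
    pickler.dump(website)
    pickler.dump(english_french)
    pickler.dump(tup)
```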

To restore (load) the original data, we can simply use more than one load function, as follows:
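A minimal sketch, assuming Python 3. To stay self-contained, it first writes four hypothetical objects to pickled_file.pickle and then reads them back with one load call per object.

```python
import pickle

# Write four hypothetical objects first so that this sketch runs on its own.
with open('pickled_file.pickle', 'wb') as pickled_file:
    for obj in ('Alice', 'https://example.com',
                {'paper': 'papier', 'pen': 'stylo'}, (31, 'alice', 4.0)):
        pickle.dump(obj, pickled_file)

# Each load call reads the next pickle from the file.
with open('pickled_file.pickle', 'rb') as pickled_file:
    name = pickle.load(pickled_file)
    website = pickle.load(pickled_file)
    english_french = pickle.load(pickled_file)
    tup = pickle.load(pickled_file)

print(name)
print(website)
print(english_french)
print(tup)
```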

The output of the above script is the four objects, printed in the order in which they were pickled.

As with the Pickler class, we can rewrite the above script using the Unpickler class in the pickle module, as follows:
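A minimal sketch of the Unpickler variant, assuming Python 3. As before, the file is first filled with hypothetical objects so that the example runs on its own.

```python
import pickle

# Write four hypothetical objects first so that this sketch runs on its own.
with open('pickled_file.pickle', 'wb') as pickled_file:
    for obj in ('Alice', 'https://example.com',
                {'paper': 'papier', 'pen': 'stylo'}, (31, 'alice', 4.0)):
        pickle.dump(obj, pickled_file)

with open('pickled_file.pickle', 'rb') as pickled_file:
    # Bind an Unpickler to the open file; each load call returns the next object.
    unpickler = pickle.Unpickler(pickled_file)
    name = unpickler.load()
    website = unpickler.load()
    english_french = unpickler.load()
    tup = unpickler.load()

print(name, website, english_french, tup)
```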

Note that the objects have to be read in the same order in which they were written to get the desired output. To avoid any issues here, we can use a dictionary to manage the data, as follows:
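A minimal sketch, assuming Python 3. The objects and the dictionary keys are hypothetical; the point is that a single dump call stores everything under named keys.

```python
import pickle

# Hypothetical objects, used only for illustration.
name = 'Alice'
website = 'https://example.com'
english_french = {'paper': 'papier', 'pen': 'stylo', 'car': 'voiture'}
tup = (31, 'alice', 4.0)

# Collect the objects under descriptive keys (the key names are illustrative).
data = {
    'name': name,
    'website': website,
    'english_french_dictionary': english_french,
    'tuple': tup,
}

# One dump call pickles the whole dictionary.
with open('pickled_file.pickle', 'wb') as pickled_file:
    pickle.dump(data, pickled_file)
```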

To restore (load) the data pickled in the above script, we can do the following:
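A minimal sketch, assuming Python 3 and hypothetical key names. The dictionary is written first so that the example is self-contained; after loading, each object is looked up by key, so the order no longer matters.

```python
import pickle

# Write a hypothetical dictionary first so that this sketch runs on its own.
with open('pickled_file.pickle', 'wb') as pickled_file:
    pickle.dump({'name': 'Alice',
                 'website': 'https://example.com',
                 'tuple': (31, 'alice', 4.0)}, pickled_file)

# A single load call restores the whole dictionary.
with open('pickled_file.pickle', 'rb') as pickled_file:
    data = pickle.load(pickled_file)

# Objects are retrieved by key rather than by position in the file.
name = data['name']
website = data['website']
tup = data['tuple']

print(name)
print(website)
print(tup)
```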

Pickles and Pandas

Well, this seems an interesting combination. If you are wondering what pandas is, you can learn more about it from the Introducing Pandas tutorial. The basic data structure of pandas is the DataFrame, a tabular data structure composed of ordered columns and rows.

Let's take an example of DataFrame from the Pandas tutorial:
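A minimal sketch of a small DataFrame, assuming pandas is installed. The column names and values are hypothetical sample data, not necessarily those of the referenced tutorial.

```python
import pandas as pd

# Hypothetical sample data: two columns of equal length.
name_age = {'Name': ['Ali', 'Bill', 'David'], 'Age': [32, 55, 20]}

# Build a DataFrame from the dictionary of columns.
data_frame = pd.DataFrame(name_age)

print(data_frame)
```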

In order to pickle our DataFrame, we can use the to_pickle() function, as follows:
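A minimal sketch, assuming pandas is installed. The file name my_panda.pickle and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
data_frame = pd.DataFrame({'Name': ['Ali', 'Bill', 'David'], 'Age': [32, 55, 20]})

# to_pickle is a DataFrame method that writes the pickled frame to the given path.
data_frame.to_pickle('my_panda.pickle')
```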

To restore (load) the pickled DataFrame, we can use the read_pickle() function, as follows:
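A minimal sketch, assuming pandas is installed. It writes a small hypothetical frame to my_panda.pickle (an illustrative file name) only so that there is something to read back.

```python
import pandas as pd

# Create the pickle file first so that this sketch runs on its own.
pd.DataFrame({'Name': ['Ali', 'Bill'], 'Age': [32, 55]}).to_pickle('my_panda.pickle')

# read_pickle is a top-level pandas function that returns the unpickled object.
restored_frame = pd.read_pickle('my_panda.pickle')

print(restored_frame)
```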

Putting what we have mentioned in this section all together, this is what the script that pickles and loads a pandas object looks like:
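A minimal sketch of the complete round trip, assuming pandas is installed. The sample data and the file name my_panda.pickle are hypothetical, and the final equals check is an addition for verifying the result.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
name_age = {'Name': ['Ali', 'Bill', 'David'], 'Age': [32, 55, 20]}
data_frame = pd.DataFrame(name_age)

# Pickle the DataFrame to a file, then load it back.
data_frame.to_pickle('my_panda.pickle')
restored_frame = pd.read_pickle('my_panda.pickle')

print(restored_frame)

# equals compares both frames element by element, including column types.
print(restored_frame.equals(data_frame))
```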

Conclusion

In this tutorial, I have covered an interesting module called pickle. We have seen how easily this module enables us to store Python objects for different purposes, such as using the object with another Python program, transferring the object across a network, saving the object for later use, etc. We can simply pickle the Python object, and unpickle (load) it when we want to restore the original object.

