Wednesday, August 30, 2017

Polymorphism With Protocols in Elixir

Polymorphism is an important concept in programming, and novice programmers usually learn about it during the first months of studying. Polymorphism basically means that you can apply a similar operation to entities of different types. For instance, the Enum.count/1 function can be applied both to a range and to a list:
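As a minimal sketch of that idea (the variable names are only illustrative), the same call can be made on both types:

```elixir
# Both a range and a list implement the Enumerable protocol,
# so the same Enum.count/1 call works for each of them.
range_size = Enum.count(1..3)
list_size = Enum.count([1, 2, 3])

IO.puts("range: #{range_size}, list: #{list_size}")
```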

How is that possible? In Elixir, polymorphism is achieved by using an interesting feature called a protocol, which acts like a contract. For each data type you wish to support, this protocol must be implemented.

All in all, this approach is not revolutionary, as it is found in other languages (like Ruby, for example). Still, protocols are really convenient, so in this article we will discuss how to define, implement and work with them while exploring some examples. Let's get started!

Brief Introduction to Protocols

As mentioned above, a protocol declares a generic interface and relies on each specific data type to implement the logic. This is reasonable, because different data types may require different implementations. Calling code can then dispatch on the protocol without worrying about the internals of the data type.

Elixir has a bunch of built-in protocols, including Enumerable, Collectable, Inspect, List.Chars, and String.Chars. Some of them will be discussed later in this article. You may implement any of these protocols in your custom module and get a bunch of functions for free. For instance, having implemented Enumerable, you'll get access to all the functions defined in the Enum module, which is quite cool.

If you have come from the wondrous Ruby world full of objects, classes, fairies and dragons, you'll have met a very similar concept of mixins. For example, if you ever need to make your objects comparable, simply mix the Comparable module into the class. Then just implement the spaceship method <=> and all instances of the class will get methods like > and < for free. This mechanism is somewhat similar to protocols in Elixir. Even if you have never met this concept before, believe me, it is not that complex.

Okay, so first things first: the protocol must be defined, so let's see how it can be done in the next section.

Defining a Protocol

Defining a protocol does not involve any black magic—in fact, it is very similar to defining modules. Use defprotocol/2 to do it:

Inside the protocol's definition you place functions, just like with modules. The only difference is that these functions have no body. It means that the protocol only defines an interface, a blueprint that should be implemented by all the data types that wish to dispatch on this protocol:
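A minimal sketch of such a definition might look as follows. The names MyProtocol and my_func come from the text; the @doc string is only an illustration of what the function could mean.

```elixir
defprotocol MyProtocol do
  @doc "Describes the given value (the exact meaning is up to each implementation)."
  def my_func(arg)
end
```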

In this example, a programmer needs to implement the my_func/1 function to successfully utilize MyProtocol.

If the protocol is not implemented, an error will be raised. Let's return to the example with the count/1 function defined inside the Enum module. Running the following code will end up with an error:
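The failure can be reproduced with a small sketch like the one below. It rescues the exception only so that the message can be printed. The exact wording of the message differs between Elixir versions, so it is not reproduced here.

```elixir
# Integers do not implement Enumerable, so Enum.count/1 raises
# Protocol.UndefinedError. The error is rescued here just to print its message.
result =
  try do
    Enum.count(1)
  rescue
    error in Protocol.UndefinedError -> Exception.message(error)
  end

IO.puts(result)
```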

It means that Integer does not implement the Enumerable protocol (what a surprise) and, therefore, we cannot count integers. But the protocol actually can be implemented, and this is easy to achieve.

Implementing a Protocol

Protocols are implemented using the defimpl/3 macro. You specify which protocol to implement and for which type:
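As a sketch, an implementation of the demo protocol for integers could look like this. The protocol is repeated so the snippet runs on its own, and the returned string is an arbitrary choice made for illustration.

```elixir
defprotocol MyProtocol do
  def my_func(arg)
end

# The first argument names the protocol, the `for:` option names the data type.
defimpl MyProtocol, for: Integer do
  def my_func(arg), do: "Got the integer #{arg}"
end

IO.puts(MyProtocol.my_func(5))
```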

Now you can make your integers countable by partly implementing the Enumerable protocol:
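One possible partial implementation is sketched below. Treating every integer as a collection with exactly one element is an assumption made for this example, not something the protocol prescribes. Because the other Enumerable functions are missing, the compiler is expected to print warnings, and only Enum.count/1 will work.

```elixir
# Partial implementation: only count/1 is defined.
defimpl Enumerable, for: Integer do
  # Assumption for this sketch: an integer "contains" exactly one element.
  def count(_integer), do: {:ok, 1}
end

IO.puts(Enum.count(42))
```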

We will discuss the Enumerable protocol in more detail later in the article and implement its other function as well.

As for the type (passed to the for), you may specify any built-in type, your own alias or a list of aliases:
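For instance, a list of aliases lets two built-in types share one implementation. This is only a sketch; the protocol is repeated to keep it self-contained, and the returned string is illustrative.

```elixir
defprotocol MyProtocol do
  def my_func(arg)
end

# One implementation body shared by two built-in types.
defimpl MyProtocol, for: [Integer, Float] do
  def my_func(number), do: "Got the number #{number}"
end

IO.puts(MyProtocol.my_func(5))
IO.puts(MyProtocol.my_func(2.5))
```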

On top of that, you may say Any:

This will act like a fallback implementation, and an error will not be raised if the protocol is not implemented for some type. In order for this to work, set the @fallback_to_any attribute to true inside your protocol (otherwise the error will still be raised):

You can now utilize the protocol for any supported type:
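Putting the fallback pieces together, a sketch could look like this. The returned strings are arbitrary, and the Integer implementation is included only to show that a dedicated implementation still takes precedence over Any.

```elixir
defprotocol MyProtocol do
  # Without this attribute, unsupported types still raise Protocol.UndefinedError.
  @fallback_to_any true
  def my_func(arg)
end

defimpl MyProtocol, for: Integer do
  def my_func(arg), do: "Got the integer #{arg}"
end

# Used for every type that has no dedicated implementation.
defimpl MyProtocol, for: Any do
  def my_func(_arg), do: "Got something else"
end

IO.puts(MyProtocol.my_func(5))
IO.puts(MyProtocol.my_func("a string"))
```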

A Note About Structs

The implementation for a protocol can be nested inside a module. If this module defines a struct, you don't even need to specify for when calling defimpl:
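A sketch of such a nested implementation is shown below. The Product fields (title and price) follow the description in the text, while the default values, the output format and the Main module are assumptions made for illustration.

```elixir
defprotocol MyProtocol do
  def my_func(arg)
end

defmodule Product do
  defstruct title: "", price: 0

  # No `for:` option here: inside a module that defines a struct,
  # defimpl targets that struct by default.
  defimpl MyProtocol do
    def my_func(%Product{title: title, price: price}) do
      "Title #{title}, price #{price}"
    end
  end
end

defmodule Main do
  # The struct literal lives in a separate module, after Product is defined.
  def run do
    MyProtocol.my_func(%Product{title: "Test", price: 5})
  end
end

IO.puts(Main.run())
```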

In this example, we define a new struct called Product and implement our demo protocol. Inside, simply pattern-match the title and price and then output a string.

Remember, however, that an implementation does not have to be nested inside a module. This means that you can easily extend any module without accessing its source code.

Example: String.Chars Protocol

Okay, enough with abstract theory: let's have a look at some examples. I am sure you have employed the IO.puts/2 function quite extensively to output debugging info to the console when playing around with Elixir. Surely, we can output various built-in types easily:

But what happens if we try to output our Product struct created in the previous section? I will place the corresponding code inside the Main module because otherwise you'll get an error saying that the struct is not defined or accessed in the same scope:

Having run this code, you'll get an error:
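The failing call can be reproduced with a sketch along these lines. The rescue clause is an addition for illustration, so that the message can be printed instead of crashing the program. The exact message text depends on the Elixir version and is therefore not quoted.

```elixir
defmodule Product do
  defstruct title: "", price: 0
end

defmodule Main do
  def run do
    # Product does not implement String.Chars yet, so this call raises.
    IO.puts(%Product{title: "Test", price: 5})
  rescue
    error in Protocol.UndefinedError ->
      # Return the message to show which protocol is missing.
      Exception.message(error)
  end
end

IO.puts(Main.run())
```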

Aha! It means that the puts function relies on the built-in String.Chars protocol. As long as the protocol is not implemented for our Product, the error will be raised.

String.Chars is responsible for converting various structures to binaries, and the only function that you need to implement is to_string/1, as stated by the documentation. Why don't we implement it now?
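A possible implementation is sketched below. The output format is an arbitrary choice for this example, and the Main module exists only to exercise the code.

```elixir
defmodule Product do
  defstruct title: "", price: 0
end

defimpl String.Chars, for: Product do
  # The only function String.Chars requires; it must return a binary.
  def to_string(%Product{title: title, price: price}) do
    "#{title}, price: #{price}"
  end
end

defmodule Main do
  def run do
    product = %Product{title: "Test", price: 5}
    # IO.puts/1 now works because it converts its argument via String.Chars.
    IO.puts(product)
    Kernel.to_string(product)
  end
end

Main.run()
```

With this format, the program prints "Test, price: 5".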

Having this code in place, the program will output the following string:

Which means that everything is working just fine!

Example: Inspect Protocol

Another very common function is IO.inspect/2 to get information about a construct. There is also an inspect/2 function defined inside the Kernel module—it performs inspection according to the Inspect built-in protocol.

Our Product struct can be inspected right away, and you'll get some brief information about it:

It will return %Product{price: 5, title: "Test"}. But, once again, we can easily implement the Inspect protocol that requires only the inspect/2 function to be coded:
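A custom implementation might look like the sketch below. The wording of the returned string is made up for illustration. Returning a plain string is enough here, because a string is a valid document for the inspection machinery.

```elixir
defmodule Product do
  defstruct title: "", price: 0
end

defimpl Inspect, for: Product do
  # The second argument carries the inspection options; this sketch ignores it.
  def inspect(%Product{title: title, price: price}, _opts) do
    "A Product struct with a title of #{title} and a price of #{price}"
  end
end

defmodule Main do
  def run do
    # Kernel.inspect/1 dispatches to the implementation above.
    inspect(%Product{title: "Test", price: 5})
  end
end

IO.puts(Main.run())
```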

The second argument passed to this function is a struct with inspection options (Inspect.Opts), but we are not interested in it.

Example: Enumerable Protocol

Now let's see a slightly more complex example while talking about the Enumerable protocol. This protocol is employed by the Enum module, which presents us with such convenient functions as each/2 and count/1 (without it, you would have to stick with plain old recursion).

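Enumerable defines three functions that you have to flesh out in order to implement the protocol (Elixir 1.6 and later add a fourth one, slice/1, which may simply return {:error, __MODULE__} to fall back to a default based on reduce/3):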

  • count/1 returns the enumerable's size.
  • member?/2 checks whether the enumerable contains an element.
  • reduce/3 applies a function to each element of the enumerable.

Having all those functions in place, you'll get access to all the goodies provided by the Enum module, which is a really good deal.

As an example, let's create a new struct called Zoo. It will have a title and a list of animals:

Each animal will also be represented by a struct:

Now let's instantiate a new zoo:
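The two structs and a sample zoo could be sketched like this. The Animal fields (species, name, age), the animal names and ages, and the Main module are assumptions for illustration; the text only fixes the zoo's title, its list of animals, and the three species.

```elixir
defmodule Animal do
  defstruct species: "", name: "", age: 0
end

defmodule Zoo do
  defstruct title: "", animals: []
end

defmodule Main do
  # Builds the demo zoo inside a function, after both structs are defined.
  def my_zoo do
    %Zoo{
      title: "Demo Zoo",
      animals: [
        %Animal{species: "tiger", name: "Tigga", age: 5},
        %Animal{species: "horse", name: "Amazing", age: 3},
        %Animal{species: "deer", name: "Bambi", age: 2}
      ]
    }
  end
end

IO.inspect(Main.my_zoo())
```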

So we have a "Demo Zoo" with three animals: a tiger, a horse, and a deer. What I'd like to do now is add support for the count/1 function, which will be used like this:

Let's implement this functionality now!

Implementing the Count Function

What do we mean when saying "count my zoo"? It sounds a bit strange, but probably it means counting all the animals that live there, so the implementation of the underlying function will be quite simple:
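A sketch of count/1 in isolation is given below. In a real file, all Enumerable functions belong in a single defimpl block. With only count/1 defined, the compiler is expected to warn about the missing functions. The structs and sample data are repeated (with hypothetical names and ages) so the snippet runs on its own.

```elixir
defmodule Animal do
  defstruct species: "", name: "", age: 0
end

defmodule Zoo do
  defstruct title: "", animals: []
end

defimpl Enumerable, for: Zoo do
  # Counting a zoo means counting its animals; note the {:ok, result} tuple.
  def count(%Zoo{animals: animals}) do
    {:ok, Enum.count(animals)}
  end
end

defmodule Main do
  def run do
    my_zoo = %Zoo{
      title: "Demo Zoo",
      animals: [
        %Animal{species: "tiger", name: "Tigga", age: 5},
        %Animal{species: "horse", name: "Amazing", age: 3},
        %Animal{species: "deer", name: "Bambi", age: 2}
      ]
    }

    Enum.count(my_zoo)
  end
end

IO.puts(Main.run())
```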

All we do here is rely on the Enum.count/1 function while passing a list of animals to it (because this function supports lists out of the box). A very important thing to mention is that the protocol's count/1 function must return its result in the form of a tuple {:ok, result} as dictated by the docs. If you return only a number, a ** (CaseClauseError) no case clause matching error will be raised.

That's pretty much it. You can now say Enum.count(my_zoo) inside the Main.run, and it should return 3 as a result. Good job!

Implementing the Member? Function

The next function the protocol defines is member?/2. It should return a tuple {:ok, boolean} as a result that says whether an enumerable (passed as the first argument) contains an element (the second argument).

I want this new function to say whether a particular animal lives in the zoo or not. Therefore, the implementation is pretty simple as well:
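A sketch of member?/2 in isolation follows. As before, in a real file it would sit in the same defimpl block as the other Enumerable functions, and the compiler is expected to warn about the ones missing here. The sample animals are hypothetical.

```elixir
defmodule Animal do
  defstruct species: "", name: "", age: 0
end

defmodule Zoo do
  defstruct title: "", animals: []
end

defimpl Enumerable, for: Zoo do
  # Delegates the search to Enum.member?/2 on the underlying list of animals.
  def member?(%Zoo{animals: animals}, animal) do
    {:ok, Enum.member?(animals, animal)}
  end
end

defmodule Main do
  def run do
    tiger = %Animal{species: "tiger", name: "Tigga", age: 5}
    horse = %Animal{species: "horse", name: "Amazing", age: 3}
    my_zoo = %Zoo{title: "Demo Zoo", animals: [tiger, horse]}

    Enum.member?(my_zoo, %Animal{species: "tiger", name: "Tigga", age: 5})
  end
end

IO.puts(Main.run())
```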

Once again, note that the function accepts two arguments: an enumerable and an element. Inside we simply rely on the member?/2 function to search for an animal in the list of all animals.

So now we run:

And this should return true as we indeed have such an animal in the list!

Implementing the Reduce Function

Things get a bit more complex with the reduce/3 function. It accepts the following arguments:

  • an enumerable to apply the function to
  • an accumulator to store the result
  • the actual reducer function to apply

What's interesting is that the accumulator actually contains a tuple with two values: a verb and a value: {verb, value}. The verb is an atom and may have one of the following three values:

  • :cont (continue)
  • :halt (terminate)
  • :suspend (temporarily suspend)

The resulting value returned by the reduce/3 function is also a tuple containing the state and a result. The state is also an atom and may have the following values: 

  • :done (processing is done, that's the final result)
  • :halted (processing was stopped because the accumulator contained the :halt verb)
  • :suspended (processing was suspended)

If the processing was suspended, we should return a tuple {:suspended, acc, continuation}, where the continuation is a function representing the current state of the processing.

All these requirements are nicely demonstrated by the implementation of the reduce/3 function for the lists (taken from the docs):
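The four clauses below follow the list example given in the Enumerable documentation. They are wrapped in a module with the hypothetical name ListReducer so that the sketch can be run without touching the built-in implementation.

```elixir
defmodule ListReducer do
  # 1. The caller asked to stop: report the halted state.
  def reduce(_list, {:halt, acc}, _fun), do: {:halted, acc}
  # 2. The caller asked to pause: return a continuation to resume later.
  def reduce(list, {:suspend, acc}, fun), do: {:suspended, acc, &reduce(list, &1, fun)}
  # 3. Nothing left to process: report the final result.
  def reduce([], {:cont, acc}, _fun), do: {:done, acc}
  # 4. Apply the reducer to the head, then carry on with the tail.
  def reduce([head | tail], {:cont, acc}, fun), do: reduce(tail, fun.(head, acc), fun)
end

sum = ListReducer.reduce([1, 2, 3], {:cont, 0}, fn x, acc -> {:cont, x + acc} end)
IO.inspect(sum)
```

Note that the reducer function itself returns a {verb, value} tuple, which becomes the accumulator of the next step.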

We can use this code as an example and code our own implementation for the Zoo struct:
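A complete implementation for Zoo might look like the sketch below. The struct fields, animal names and ages are assumptions for illustration. The slice/1 clause is an addition for Elixir 1.6 and later, where the protocol expects it; returning {:error, __MODULE__} asks Enum to fall back to its default based on reduce/3.

```elixir
defmodule Animal do
  defstruct species: "", name: "", age: 0
end

defmodule Zoo do
  defstruct title: "", animals: []
end

defimpl Enumerable, for: Zoo do
  def count(%Zoo{animals: animals}), do: {:ok, Enum.count(animals)}

  def member?(%Zoo{animals: animals}, animal), do: {:ok, Enum.member?(animals, animal)}

  # Assumption: Elixir 1.6+; fall back to the default slicing algorithm.
  def slice(_zoo), do: {:error, __MODULE__}

  # Processing was halted by the caller.
  def reduce(_zoo, {:halt, acc}, _fun), do: {:halted, acc}
  # Processing was suspended: return a continuation.
  def reduce(zoo, {:suspend, acc}, fun), do: {:suspended, acc, &reduce(zoo, &1, fun)}
  # No animals left: return the final result.
  def reduce(%Zoo{animals: []}, {:cont, acc}, _fun), do: {:done, acc}
  # Apply the function to the head and continue with the tail.
  def reduce(%Zoo{animals: [head | tail]}, {:cont, acc}, fun) do
    reduce(%Zoo{animals: tail}, fun.(head, acc), fun)
  end
end

defmodule Main do
  def my_zoo do
    %Zoo{
      title: "Demo Zoo",
      animals: [
        %Animal{species: "tiger", name: "Tigga", age: 5},
        %Animal{species: "horse", name: "Amazing", age: 3},
        %Animal{species: "deer", name: "Bambi", age: 2}
      ]
    }
  end

  # Sums the ages of all animals through the new reduce/3 implementation.
  def total_age do
    Enum.reduce(my_zoo(), 0, fn animal, total -> animal.age + total end)
  end
end

IO.puts(Main.total_age())
```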

In the last function clause, we take the head of the list containing all animals, apply the function to it, and then perform reduce against the tail. When there are no more animals left (the third clause), we return a tuple with the state of :done and the final result. The first clause returns a result if the processing was halted. The second clause returns a function if the :suspend verb was passed.

Now, for example, we can calculate the total age of all our animals easily:

Basically, now we have access to all the functions provided by the Enum module. Let's try to utilize join/2:

However, you'll get an error saying that the String.Chars protocol is not implemented for the Animal struct. This is happening because join tries to convert each element to a string, but cannot do it for the Animal. Therefore, let's also implement the String.Chars protocol now:
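The sketch below combines both protocols. To stay short, the Enumerable implementation delegates reduce/3 to Enumerable.List.reduce/3 (the built-in implementation for lists) instead of spelling out the clauses by hand. The string format, animal data and Main module are assumptions for illustration, and slice/1 is included on the assumption of Elixir 1.6 or later.

```elixir
defmodule Animal do
  defstruct species: "", name: "", age: 0
end

defmodule Zoo do
  defstruct title: "", animals: []
end

defimpl Enumerable, for: Zoo do
  def count(%Zoo{animals: animals}), do: {:ok, length(animals)}
  def member?(%Zoo{animals: animals}, animal), do: {:ok, animal in animals}
  def slice(_zoo), do: {:error, __MODULE__}
  # Shortcut for this sketch: reuse the list implementation.
  def reduce(%Zoo{animals: animals}, acc, fun), do: Enumerable.List.reduce(animals, acc, fun)
end

defimpl String.Chars, for: Animal do
  def to_string(%Animal{species: species, name: name, age: age}) do
    "#{name} (#{species}), aged #{age}"
  end
end

defmodule Main do
  def my_zoo do
    %Zoo{
      title: "Demo Zoo",
      animals: [
        %Animal{species: "tiger", name: "Tigga", age: 5},
        %Animal{species: "horse", name: "Amazing", age: 3}
      ]
    }
  end

  # Enum.join/2 converts each animal to a string via String.Chars.
  def joined, do: Enum.join(my_zoo(), "; ")

  # Enum.each/2 prints every animal on its own line.
  def print_all, do: Enum.each(my_zoo(), &IO.puts/1)
end

IO.puts(Main.joined())
Main.print_all()
```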

Now everything should work just fine. Also, you may try to run each/2 and display individual animals:

Once again, this works because we have implemented two protocols: Enumerable (for the Zoo) and String.Chars (for the Animal).

Conclusion

In this article, we have discussed how polymorphism is implemented in Elixir using protocols. You have learned how to define and implement protocols, as well as utilize built-in protocols: Enumerable, Inspect, and String.Chars.

As an exercise, you can try to empower our Zoo module with the Collectable protocol so that the Enum.into/2 function can be properly utilized. This protocol requires the implementation of only one function: into/1, which returns the initial accumulator together with a collector function. The collector receives the accumulator and one of the commands {:cont, element}, :done or :halt. Share your solution in the comments!

I hope you have enjoyed reading this article. If you have any questions left, don't hesitate to contact me. Thank you for the patience, and see you soon!

