Friday, January 13, 2017

Erlang and Elixir, Part 2: Data Types

Elixir has a wealth of data types available. The usual basic types (integer, float, boolean, and string) are here, but so are atoms (also known as symbols), lists, tuples, and anonymous functions. You'll learn all about them in this tutorial.

Before we get started: We will be running these examples in Elixir's interactive mode. Type iex in your terminal to enter interactive mode. For more information, read Part 1 of this guide. (Windows users run iex.bat --werl.)

Preface

Elixir is built with meta-programming in mind, and much of the language itself is built upon macros. If you are not familiar with macros, a macro is code that runs at compile time: it receives the representation of the code it is given and returns new code to be compiled in its place. That may sound abstract, so here is a real-life example:

if is a macro for the standard conditional if ... structure we all know. Elixir expands the macro at compile time and compiles the resulting code for you internally.

Meta-Programming?

Fundamental to Elixir is the manipulation of quoted expressions.

Meta-programming essentially means you can create code from code: programs have the ability to treat their own code as data.

An Elixir program can be represented as its own data structure. 

For example, the building blocks of Elixir are represented by a tuple with three elements (more on tuples later). The function call sum(1, 2, 3) is represented as the tuple {:sum, [], [1, 2, 3]}.

You can retrieve the representation of any expression with the quote macro, for example by typing quote do: sum(1, 2, 3).

Note: In addition to the quote macro, there is the unquote macro. You can read more on the topic in the Elixir documentation.

So we see the first element is the function name, :sum, the second is a keyword list (more on this later) which contains metadata (empty in this example), and the third is the arguments list.
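As a minimal sketch of the representation described above, the following quotes the call and takes the resulting tuple apart. The variable names ast, name, and args are illustrative, and sum/3 is a hypothetical function that never has to exist, because quote only builds the representation and does not run the call.

```elixir
# Build the representation of a call to a hypothetical sum/3 function.
ast = quote do: sum(1, 2, 3)

# The representation is a three-element tuple: {name, metadata, arguments}.
{name, _metadata, args} = ast
IO.inspect(name)   # prints :sum
IO.inspect(args)   # prints [1, 2, 3]

# Macro.to_string/1 turns the representation back into source text.
IO.puts(Macro.to_string(ast))   # prints sum(1, 2, 3)
```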

All macros are built using these data structures, which means our code can inspect, generate, and transform other code before it is compiled. In principle, an application could write parts of its own code, check code for faults and fix them, or even scan a plugin being added, check the code inside, and modify it on the fly.

This may have interesting implications for fields such as artificial intelligence, but first, in this guide we must continue with the basics and get a strong footing in the various data types used by Elixir before we begin delving deeper.

Basic Types 

Now that you are in the interactive console, let's look at Booleans.

The standard behaviours of true and false are supported. To check whether a value is a Boolean, we use the is_boolean function provided by Elixir.

Each type also has one of these predicate functions. For example, is_integer/1, is_float/1, and is_number/1 check whether an argument is an integer, a float, or either of the two, respectively.

Elixir always refers to functions this way, with a slash followed by a number to signify the number of arguments the function takes. So is_boolean/1 requires 1 argument, e.g. is_boolean(1).
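A short sketch of the predicate functions named above. The sample values and variable names are arbitrary and chosen only for illustration.

```elixir
# Sample values to test; bound to variables purely for readability.
flag = true
whole = 5
fraction = 5.0

IO.inspect(is_boolean(flag))      # prints true
IO.inspect(is_boolean(whole))     # prints false
IO.inspect(is_integer(whole))     # prints true
IO.inspect(is_float(whole))       # prints false
IO.inspect(is_number(whole))      # prints true
IO.inspect(is_number(fraction))   # prints true
```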

Note: If you want to access help at any time in the interactive shell, just type h and you will be able to access information on how to use the Elixir shell. Also, you can find information on any of Elixir's operators or functions by using h is_integer/1 to receive documentation on is_integer/1.

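We can also perform basic arithmetic with the usual operators, such as +, -, and *.

In Elixir, the / operator always returns a float, even when both operands are integers, so 10 / 2 returns 5.0.

If we want integer division or the remainder of a division, we can use the functions div/2 and rem/2. For example, div(10, 2) returns 5 and rem(10, 3) returns 1.

Note: Parentheses are optional when calling a named function with arguments, so div 10, 2 works as well.

You can also enter numbers in binary, octal, or hexadecimal notation directly into iex by using the prefixes 0b, 0o, and 0x.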
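The arithmetic rules above can be sketched as follows. The numbers are arbitrary examples.

```elixir
IO.inspect(1 + 2)        # prints 3
IO.inspect(5 * 5)        # prints 25

# The / operator always returns a float.
IO.inspect(10 / 2)       # prints 5.0

# Integer division and remainder.
IO.inspect(div(10, 2))   # prints 5
IO.inspect(rem(10, 3))   # prints 1

# The same call without parentheses.
quotient = div 10, 2
IO.inspect(quotient)     # prints 5

# Binary, octal, and hexadecimal literals are all plain integers.
IO.inspect(0b1010)       # prints 10
IO.inspect(0o777)        # prints 511
IO.inspect(0x1F)         # prints 31
```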

Atoms (Symbols)

Atoms are constants whose name is their own value. Some languages refer to them as symbols. The Booleans true and false are also atoms.
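A small sketch of atoms in use. The atom names :hello and :world are arbitrary.

```elixir
greeting = :hello

IO.inspect(greeting)             # prints :hello
IO.inspect(is_atom(greeting))    # prints true

# Two atoms are equal only if they have the same name.
IO.inspect(greeting == :hello)   # prints true
IO.inspect(greeting == :world)   # prints false

# The Booleans are atoms too.
IO.inspect(is_atom(true))        # prints true
IO.inspect(true == :true)        # prints true
```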

Tuples 

Similar to lists and defined by curly braces, they can contain any data like so:
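A minimal sketch of creating and reading a tuple. The contents are arbitrary; elem/2, tuple_size/1, and put_elem/3 are standard Kernel functions.

```elixir
# A tuple can mix values of any type.
tuple = {:ok, "hello", 42}

# Elements are accessed by zero-based index.
IO.inspect(elem(tuple, 1))      # prints "hello"
IO.inspect(tuple_size(tuple))   # prints 3

# put_elem/3 returns a new tuple; the original is left unchanged.
updated = put_elem(tuple, 1, "world")
IO.inspect(updated)             # prints {:ok, "world", 42}
IO.inspect(tuple)               # prints {:ok, "hello", 42}
```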

Lists

Similar to tuples, you define a list with square brackets like so:

Two lists can also be concatenated with ++ and subtracted with --.

To return the first element of a list or the rest of the list after it, we use the hd and tl functions (short for head and tail).
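The list operations above can be sketched like this. The list contents are arbitrary.

```elixir
list = [1, 2, true, 3]
IO.inspect(length(list))   # prints 4

# Concatenation with ++.
joined = [1, 2, 3] ++ [4, 5, 6]
IO.inspect(joined)         # prints [1, 2, 3, 4, 5, 6]

# Subtraction with -- removes the first occurrence of each element on the right.
remaining = [1, true, 2, false, 3, true] -- [true, false]
IO.inspect(remaining)      # prints [1, 2, 3, true]

# Head is the first element; tail is everything after it.
IO.inspect(hd(list))       # prints 1
IO.inspect(tl(list))       # prints [2, true, 3]
```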

What's the Difference Between Lists and Tuples?

Lists are stored in memory as linked lists: each element holds its value and a pointer to the next element, so finding the length or accessing an element is a linear operation. Updating is fast as long as we are prepending to the list, but appending to the end or modifying inside the list is slower, because the elements before that point must be traversed and copied.

Tuples, on the other hand, are stored contiguously in memory. This means that getting the total size of a tuple or accessing a single element is fast. In comparison to lists, however, updating or adding elements to an existing tuple is slow, because it requires copying the whole tuple in memory internally.
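A brief sketch contrasting the two structures. The comments about cost restate the explanation above rather than measuring anything, and the variable names are illustrative.

```elixir
list = [1, 2, 3]

# Prepending builds one new cell that points at the existing list.
prepended = [0 | list]
IO.inspect(prepended)   # prints [0, 1, 2, 3]

# Appending has to copy the left-hand list first.
appended = list ++ [4]
IO.inspect(appended)    # prints [1, 2, 3, 4]

tuple = {:a, :b, :c}

# Size and element access do not need to walk the structure.
IO.inspect(tuple_size(tuple))   # prints 3
IO.inspect(elem(tuple, 2))      # prints :c

# Adding an element creates a whole new tuple.
longer = Tuple.insert_at(tuple, 3, :d)
IO.inspect(longer)              # prints {:a, :b, :c, :d}
```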

Anonymous Functions

We can define an anonymous function by using the fn and end keywords, for example by binding a two-argument function to a variable named myFunc.

Elixir's manual refers to anonymous functions as “first class citizens”. This means that you can pass them as arguments to other functions, the same way as integers or strings.

So, if we bind an anonymous function to the variable myFunc and pass it to the is_function(myFunc) checking function, it correctly returns true. This means we have successfully created our function.

We can also check that myFunc takes a given number of arguments, here two, by calling is_function(myFunc, 2).

Note: Using a dot (.) between the variable and parenthesis is required to call an anonymous function.
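A minimal sketch of the myFunc walkthrough above. The function body (adding two numbers) is an assumption made for illustration, since the text only says that myFunc takes two arguments.

```elixir
# Assumed body: add two numbers.
myFunc = fn a, b -> a + b end

IO.inspect(is_function(myFunc))      # prints true
IO.inspect(is_function(myFunc, 2))   # prints true
IO.inspect(is_function(myFunc, 1))   # prints false

# The dot is required when calling an anonymous function.
IO.inspect(myFunc.(1, 2))            # prints 3

# Anonymous functions can be passed to other functions like any value.
doubled = Enum.map([1, 2, 3], fn x -> x * 2 end)
IO.inspect(doubled)                  # prints [2, 4, 6]
```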

Strings

We define strings in Elixir between double quotes, for example "hello".

Interpolation is also supported, using the #{} syntax inside a string.

For concatenation, use <>.

To get the length of a string, use the String.length function.

To split a string based on a pattern, you can use the String.split function.
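The string operations above, sketched with arbitrary sample text:

```elixir
name = "world"

# Interpolation with #{}.
greeting = "hello #{name}"
IO.puts(greeting)                   # prints hello world

# Concatenation with <>.
joined = "foo" <> "bar"
IO.puts(joined)                     # prints foobar

# Length and splitting.
IO.inspect(String.length("hello"))  # prints 5
parts = String.split("1,2,3", ",")
IO.inspect(parts)                   # prints ["1", "2", "3"]
```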

Keyword Lists

A keyword list is a list of two-item tuples in which the first item of each tuple, the key, is an atom. It is a common way to represent key-value pairs. For example, [{:a, 1}, {:b, 2}] is a keyword list.

Elixir has a shorthand syntax for defining such lists: [key: value]. Because keyword lists are ordinary lists, we can use any of the list operators available in Elixir, such as ++ to add to the back or front of the list.

Keyword lists have three important points to remember:

  1. Keys must be atoms.
  2. Keys are ordered, as specified by the developer.
  3. Keys can be given more than once.
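A short sketch of the three points above. The keys and values are arbitrary; Keyword.get_values/2 is part of the standard Keyword module.

```elixir
# The tuple form and the shorthand form describe the same list.
list = [{:a, 1}, {:b, 2}]
IO.inspect(list == [a: 1, b: 2])   # prints true

# ++ works on keyword lists, at the back or the front.
IO.inspect(list ++ [c: 3])         # prints [a: 1, b: 2, c: 3]
new_list = [a: 0] ++ list
IO.inspect(new_list)               # prints [a: 0, a: 1, b: 2]

# With a repeated key, lookup returns the first match in order.
IO.inspect(new_list[:a])                       # prints 0
IO.inspect(Keyword.get_values(new_list, :a))   # prints [0, 1]
```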

For database queries, the Ecto library makes use of this. A query built with its from macro can, for instance, be given several where: options followed by a select: option.

Elixir's if macro also incorporates this into its design: its do: and else: blocks are passed to it as a keyword list.

In general, when a keyword list is the last argument to a function, the square brackets are optional.
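As a sketch of this point, the three calls to if below are equivalent. The condition and the atoms :this and :that are placeholders.

```elixir
condition = 1 < 2

# Keyword list as the last argument, square brackets omitted.
a = if condition, do: :this, else: :that

# The same call with the keyword list written out in full.
b = if(condition, [do: :this, else: :that])

# The familiar block form.
c =
  if condition do
    :this
  else
    :that
  end

IO.inspect({a, b, c})   # prints {:this, :this, :this}
```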

Maps

Similar to Keyword lists, maps are there to help your key-value pair needs. They're created via the %{} syntax as so:

They're similar to keyword lists, but not identical—there are two differences:

  1. Maps allow any value as a key.
  2. Maps’ keys do not follow any ordering.

Note: When all the keys in a map are atoms, the keyword syntax can be very convenient, for example %{a: 1, b: 2}.
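A minimal sketch of creating and reading maps. The keys and values are arbitrary.

```elixir
# Any value can be a key.
map = %{:a => 1, 2 => :b}

IO.inspect(map[:a])   # prints 1
IO.inspect(map[2])    # prints :b
IO.inspect(map[:c])   # prints nil, since the key is missing

# Keyword syntax when every key is an atom.
atom_map = %{a: 1, b: 2}
IO.inspect(atom_map == %{:a => 1, :b => 2})   # prints true

# Atom keys can also be read with dot syntax.
IO.inspect(atom_map.a)   # prints 1
```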

Matching

Maps are very good for pattern matching, unlike keyword lists. When a map is used in a pattern, it will always match on a subset of a given value.

A pattern will always match as long as its keys exist in the provided map. It follows that an empty map pattern matches every map.
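The matching rules above can be sketched as follows. The map contents are arbitrary, and match?/2 is used for the failing case so that the script does not raise a MatchError.

```elixir
map = %{:a => 1, 2 => :b}

# An empty map pattern matches any map.
%{} = map

# A pattern only needs a subset of the keys.
%{:a => a} = map
IO.inspect(a)   # prints 1

# A key that is not in the map makes the match fail.
has_c = match?(%{:c => _}, map)
IO.inspect(has_c)   # prints false
```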

To take maps further, the Map module provides a powerful API to manipulate maps, including functions such as Map.get, Map.put, and Map.delete.
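A few of the Map module's functions, sketched on an arbitrary map. Each call returns a new map and leaves the original untouched.

```elixir
map = %{:a => 1, 2 => :b}

IO.inspect(Map.get(map, :a))        # prints 1
IO.inspect(Map.has_key?(map, :c))   # prints false

# Add a key.
with_c = Map.put(map, :c, 3)
IO.inspect(Map.get(with_c, :c))     # prints 3

# Remove a key.
without_a = Map.delete(map, :a)
IO.inspect(without_a)               # prints %{2 => :b}

# The update syntax changes the value of an existing key.
changed = %{map | :a => 2}
IO.inspect(changed[:a])             # prints 2
```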

Nested Data Structures

Data often requires hierarchies and structuring. Elixir supports maps inside maps, keyword lists inside maps, and so on. Manipulating such structures is made convenient by the put_in and update_in macros.

Let's say we have a keyword list of users in which each entry holds a map containing the name, age, and some other information. If we want to access John's age, we can write users[:john].age.

It happens that we can also use this same syntax for updating the value, by passing it to put_in or update_in.
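A sketch of the walkthrough above. The users data is made up for illustration: only the name and age fields are mentioned in the text, and the languages field stands in for the "other information".

```elixir
# Illustrative data: a keyword list whose values are maps.
users = [
  john: %{name: "John", age: 27, languages: ["Erlang", "Ruby", "Elixir"]},
  mary: %{name: "Mary", age: 29, languages: ["Elixir", "F#", "Clojure"]}
]

# Read a nested value.
IO.inspect(users[:john].age)   # prints 27

# put_in replaces the value at the given path and returns the updated structure.
users = put_in(users[:john].age, 31)
IO.inspect(users[:john].age)   # prints 31

# update_in applies a function to the value at the given path.
users = update_in(users[:mary].languages, fn languages ->
  List.delete(languages, "Clojure")
end)
IO.inspect(users[:mary].languages)   # prints ["Elixir", "F#"]
```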

RegEx

Elixir uses Erlang's :re module, which is based on PCRE (Perl Compatible Regular Expressions). You can find more information in the documentation.

To create a regular expression in Elixir, use the Regex.compile function or the special short-hand forms ~r or ~R.

The Regex module has a plethora of useful functions that you can use to validate regular expressions, compile them, and escape special characters correctly.

You can also run a regex over a string and retrieve the named capture groups as a map with the Regex.named_captures function.
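A few Regex functions, sketched with arbitrary patterns and strings. All of the calls are from the standard Regex module.

```elixir
# Compile at runtime; an invalid pattern returns an {:error, reason} tuple instead.
{:ok, compiled} = Regex.compile("foo|bar")
IO.inspect(Regex.match?(compiled, "foo"))    # prints true

# The ~r sigil builds a regex directly.
IO.inspect(Regex.match?(~r/foo/, "bar"))     # prints false

# run/2 returns the full match followed by the capture groups.
IO.inspect(Regex.run(~r/c(d)/, "abcd"))      # prints ["cd", "d"]

IO.inspect(Regex.split(~r/-/, "a-b-c"))      # prints ["a", "b", "c"]

# escape/1 makes special characters literal.
escaped = Regex.escape("a.b")
IO.puts(escaped)                             # prints a\.b

# Named capture groups come back as a map keyed by group name.
captures = Regex.named_captures(~r/c(?<foo>d)/, "abcd")
IO.inspect(captures)                         # prints %{"foo" => "d"}
```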

Conclusion

Elixir is a fully featured meta-programming environment built fundamentally on macros. Its powerful lists, maps, and anonymous functions, used in conjunction with its meta-programming capabilities, can help bring you as a developer to a new level of application design and data structuring.

Solutions that in non-macro languages may have taken many lines or classes of code to create can be expressed concisely, thanks to Elixir's support for DSLs (domain-specific languages) and its access to Erlang's modules.

Key-value store manipulation, list manipulation, and basic data manipulation are just a fraction of the full spectrum offered to developers. We'll cover all of this in more detail as we continue in the next parts of the series.

