Working with the file system in Elixir does not really differ from doing so in other popular programming languages. Three modules handle this task: IO, File, and Path. They provide functions to open, create, modify, read, and delete files, expand paths, and so on. There are, however, some interesting gotchas that you should be aware of.
In this article we will talk about working with the file system in Elixir while taking a look at some code examples.
The Path Module
The Path module, as the name suggests, is used to work with file system paths. The functions of this module always return UTF-8 encoded strings.
For instance, you can expand a path and then generate an absolute path easily:
Path.expand("./text.txt") |> Path.absname # => "f:/elixir/text.txt"
Note, by the way, that on Windows, backslashes are replaced with forward slashes automatically. The resulting path can be passed to the functions of the File module, for example:
Path.expand("./text.txt") |> Path.absname |> File.write("new content!", [:write]) # => :ok
Here we are constructing a full path to the file and then writing some contents to it.
All in all, working with the Path module is simple, and most of its functions do not interact with the file system. We will see some use cases for this module later in the article.
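To make the "no file system access" point concrete, here is a small sketch using a few Path functions that only manipulate the path string. The path itself is made up for illustration, and nothing needs to exist on disk for this to run.

```elixir
# None of these calls touch the disk; they only work on the path string.
# The path below is a made-up example.
path = Path.join(["projects", "elixir", "text.txt"])

IO.inspect(path)                 # => "projects/elixir/text.txt"
IO.inspect(Path.basename(path))  # => "text.txt"
IO.inspect(Path.extname(path))   # => ".txt"
IO.inspect(Path.rootname(path))  # => "projects/elixir/text"
IO.inspect(Path.dirname(path))   # => "projects/elixir"
IO.inspect(Path.type(path))      # => :relative
```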
IO and File Modules
IO, as the name implies, is the module to work with input and output. For example, it provides such functions as puts and inspect. IO has a concept of devices, which can be either process identifiers (PID) or atoms. For instance, there are :stdio and :stderr generic devices (which are actually shortcuts). Devices in Elixir maintain their position, so subsequent read or write operations start from the place where the device was previously accessed.
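As a rough illustration of the two kinds of devices, the sketch below writes to the :stdio and :stderr atoms and then reads from a PID-based device. StringIO is not covered in this article; it is used here only because it gives us a device process without needing a file on disk.

```elixir
# Atoms such as :stdio and :stderr name the standard devices.
IO.puts(:stdio, "regular output")
IO.puts(:stderr, "diagnostic output")

# Any other device is a process. StringIO (an assumption of this sketch,
# not something the article relies on) provides one backed by a string.
{:ok, device} = StringIO.open("in-memory\n")
IO.inspect(is_pid(device))  # => true

line = IO.read(device, :line)
IO.inspect(line)            # => "in-memory\n"

StringIO.close(device)
```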
The File module, in turn, allows us to access files as IO devices. Files are opened in binary mode by default; however, you might pass :utf8 as an option. Also, when a filename is specified as a character list ('some_name.txt'), it is always treated as UTF-8.
Now let's see some examples of using the modules mentioned above.
Opening and Reading Files With IO
The most common task is, of course, opening and reading files. To open a file, a function called open/2 can be used. It accepts a path to the file and an optional list of modes. For example, let's try to open a file for reading and writing:
{:ok, file} = File.open("test.txt", [:read, :write])
file |> IO.inspect # => #PID<0.72.0>
You may then read this file using the read/2 function from the IO module as well:
{:ok, file} = File.open("test.txt", [:read, :write])
IO.read(file, :line) |> IO.inspect # => "test"
IO.read(file, :line) |> IO.inspect # => :eof
Here we are reading the file line by line. Note the :eof atom that means "end of file".
You can also pass :all instead of :line to read the whole file at once:
{:ok, file} = File.open("test.txt", [:read, :write])
IO.read(file, :all) |> IO.inspect # => "test"
IO.read(file, :all) |> IO.inspect # => ""
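In this case, :eof won't be returned; instead, we get an empty string. Why? Because, as we said earlier, devices maintain their position, so the second read starts from the place where the first one stopped, which is already the end of the file. (Since Elixir 1.13, :all is deprecated in favor of :eof, which returns the :eof atom rather than an empty string once the device is exhausted.)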
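The position tracking can be observed directly. The following is a minimal sketch that reads a throwaway file to the end and then rewinds it with Erlang's :file.position/2, which is not mentioned in this article and is used here only to show that the position is a property of the device. The file name is hypothetical.

```elixir
# Create a small throwaway file so the example is self-contained.
path = Path.join(System.tmp_dir!(), "position_demo.txt")
File.write!(path, "first\nsecond\n")

{:ok, file} = File.open(path, [:read])

first = IO.read(file, :line)   # reads "first\n" and advances the position
second = IO.read(file, :line)  # continues where the previous call stopped
done = IO.read(file, :line)    # nothing is left, so :eof is returned

# Move the device back to the beginning of the file (:bof) and read again.
{:ok, 0} = :file.position(file, :bof)
again = IO.read(file, :line)

File.close(file)
File.rm!(path)

IO.inspect({first, second, done, again})
# => {"first\n", "second\n", :eof, "first\n"}
```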
There is also an open/3 function, which accepts a function as the third argument. After the passed function has finished its work, the file is closed automatically:
File.open "test.txt", [:read], fn(file) ->
  IO.read(file, :all) |> IO.inspect
end
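One detail worth seeing in code is what open/3 hands back: the callback's return value wrapped in an {:ok, result} tuple. The sketch below compares that with open/2, where closing the device is up to us. The temporary file name is an assumption made for the example.

```elixir
# A throwaway file keeps the example self-contained.
path = Path.join(System.tmp_dir!(), "open3_demo.txt")
File.write!(path, "test\n")

# File.open/3 wraps the callback's return value in an {:ok, _} tuple
# and closes the file before returning.
{:ok, line} = File.open(path, [:read], fn file -> IO.read(file, :line) end)
IO.inspect(line) # => "test\n"

# With File.open/2, closing the device is our responsibility.
{:ok, file} = File.open(path, [:read])
manual_line = IO.read(file, :line)
:ok = File.close(file)
IO.inspect(manual_line) # => "test\n"

File.rm!(path)
```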
Reading Files With File Module
In the previous section I've shown how to use IO.read in order to read files, but it appears that the File module actually has a function with the same name:
File.read "test.txt" # => {:ok, "test"}
This function returns a tuple containing the result of the operation and a binary data object. In this example it contains "test", which is the contents of the file.
If the operation was unsuccessful, then the tuple will contain an :error atom and the error's reason:
File.read("non_existent.txt") # => {:error, :enoent}
Here, :enoent means that the file does not exist. There are other possible reasons, such as :eacces (missing permissions).
The returned tuple can be used in pattern matching to handle different outcomes:
case File.read("test.txt") do
  {:ok, body} -> IO.puts(body)
  {:error, reason} -> IO.puts("There was an error: #{reason}")
end
In this example, we either print out the file's contents or display an error reason.
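The error reason is a terse POSIX-style atom. If a more readable message is wanted, one option is Erlang's :file.format_error/1, which this article does not use; the sketch below is just one way to wire it in. The ReadDemo module and the file name are hypothetical.

```elixir
defmodule ReadDemo do
  # Hypothetical helper: returns the file's contents or a readable
  # description of why reading failed.
  def describe(path) do
    case File.read(path) do
      {:ok, body} -> body
      {:error, reason} -> "There was an error: #{:file.format_error(reason)}"
    end
  end
end

# A path that is assumed not to exist.
missing = Path.join(System.tmp_dir!(), "surely_missing_file.txt")
File.rm(missing)

IO.puts(ReadDemo.describe(missing))
# => There was an error: no such file or directory
```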
Another function to read files is called read!/1. If you have come from the Ruby world, you've probably guessed what it does. Basically, this function opens a file and returns its contents in the form of a string (not tuple!):
File.read!("test.txt") # => "test"
However, if something goes wrong and the file cannot be read, an error is raised instead:
File.read!("non_existent.txt") # => (File.Error) could not read file "non_existent.txt": no such file or directory
So, to be on the safe side, you can, for example, employ the exists?/1 function to check whether a file actually exists:
defmodule Example do
  def read_file(file) do
    if File.exists?(file) do
      File.read!(file) |> IO.inspect
    end
  end
end

Example.read_file("non_existent.txt")
Great, now we know how to read files. However, there is much more we can do, so let's proceed to the next section!
Writing to Files
To write something to a file, use the write/3 function. It accepts a path to a file, the contents, and an optional list of modes. If the file does not exist, it will be created automatically. If, however, it does exist, all its contents will be overwritten by default. To prevent this from happening, set the :append mode:
File.write("new.txt", "update!", [:append]) |> IO.inspect # => :ok
In this case, the contents will be appended to the file and :ok will be returned as a result. If something goes wrong, you'll get a tuple {:error, reason}, just like with the read function.
Also, there is a write! function that does pretty much the same, but raises an exception if the contents cannot be written. For example, we can write an Elixir program that creates a Ruby program that, in turn, prints "hello!":
File.write!("test.rb", "puts \"hello!\"")
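The difference between the default mode and :append is easiest to see side by side. This is a small sketch against a throwaway file whose name is made up for the example.

```elixir
path = Path.join(System.tmp_dir!(), "write_demo.txt")

File.write!(path, "first")   # creates the file (or truncates an existing one)
File.write!(path, "second")  # default mode: replaces "first"
overwritten = File.read!(path)
IO.inspect(overwritten)      # => "second"

File.write!(path, " third", [:append])  # :append keeps the existing contents
appended = File.read!(path)
IO.inspect(appended)         # => "second third"

File.rm!(path)
```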
Streaming Files
Files can be quite large, and the read function loads all of their contents into memory. The good news is that files can be streamed quite easily:
File.open!("test.txt") |> IO.stream(:line) |> Enum.each(&IO.inspect/1)
In this example, we open a file, stream it line by line, and inspect each line. The result will look like this:
"test\n"
"line 2\n"
"line 3\n"
"some other line...\n"
Note that the newline characters are not removed automatically, so you may want to get rid of them using the String.replace/3 function (String.trim_trailing/1 is another option).
It is a bit tedious to stream a file line by line as shown in the previous example. Instead, you can rely on the stream!/3 function, which accepts a path to the file and two optional arguments: a list of modes and a value explaining how a file should be read (the default value is :line):
File.stream!("test.txt") |>
Stream.map( &(String.replace(&1, "\n", "")) ) |>
Enum.each(&IO.inspect/1)
In this piece of code we are streaming a file while removing newline characters and then printing out each line. File.stream! is slower than File.read, but we don't need to wait until all lines are available; we can start processing the contents right away. This is especially useful when you need to read a file from a remote location.
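To illustrate the "start processing right away" point, here is a minimal sketch that builds a 1000-line throwaway file and takes only its first three lines. Because the stream is lazy, reading stops once Enum.take/2 has what it needs. The file name and its contents are invented for the example.

```elixir
path = Path.join(System.tmp_dir!(), "stream_demo.txt")
File.write!(path, Enum.map_join(1..1000, &"line #{&1}\n"))

# Nothing is read until Enum.take/2 asks for elements, and reading
# stops as soon as three lines have been produced.
first_three =
  path
  |> File.stream!()
  |> Stream.map(&String.trim_trailing(&1, "\n"))
  |> Enum.take(3)

IO.inspect(first_three) # => ["line 1", "line 2", "line 3"]

File.rm!(path)
```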
Let's take a look at a slightly more complex example. I'd like to stream a file with my Elixir script, remove newline characters, and display each line with a line number next to it:
File.stream!("test.exs") |>
Stream.map( &(String.replace(&1, "\n", "")) ) |>
Stream.with_index |>
Enum.each( fn({contents, line_num}) ->
  IO.puts "#{line_num + 1} #{contents}"
end)
Stream.with_index/2 accepts an enumerable and returns a collection of tuples, where each tuple contains a value and its index. Next, we just iterate over this collection and print out the line number and the line itself. As a result, you'll see the same code with line numbers:
1 File.stream!("test.exs") |>
2 Stream.map( &(String.replace(&1, "\n", "")) ) |>
3 Stream.with_index |>
4 Enum.each( fn({contents, line_num}) ->
5 IO.puts "#{line_num + 1} #{contents}"
6 end)
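As a small variation, Stream.with_index/2 takes an optional offset as its second argument, so the numbering can start at 1 without the `line_num + 1` arithmetic. The sketch below uses an in-memory list instead of a file to keep it self-contained; the sample lines are made up.

```elixir
# Sample lines standing in for a file's contents.
lines = ["defmodule Demo do", "end"]

numbered =
  lines
  |> Stream.with_index(1) # start counting from 1 instead of 0
  |> Enum.map(fn {contents, line_num} -> "#{line_num} #{contents}" end)

Enum.each(numbered, &IO.puts/1)
# prints:
# 1 defmodule Demo do
# 2 end
```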
Moving and Removing Files
Now let's also briefly cover how to manipulate files—specifically, move and remove them. The functions we're interested in are rename/2 and rm/1. I won't bore you by describing all the arguments they accept as you can read the documentation yourself, and there is absolutely nothing complex about them. Instead, let's take a look at some examples.
First, I'd like to code a function that takes all files from the current directory based on a condition and then moves them to another directory. The function should be called like this:
Copycat.transfer_to "texts", fn(file) -> Path.extname(file) == ".txt" end
So, here I want to grab all .txt files and move them to the texts directory. How can we solve this task? Well, firstly, let's define a module and a private function to prepare a destination directory:
defmodule Copycat do
  def transfer_to(dir, fun) do
    prepare_dir! dir
  end

  defp prepare_dir!(dir) do
    unless File.exists?(dir) do
      File.mkdir!(dir)
    end
  end
end
mkdir!, as you've already guessed, tries to create a directory and raises an error if this operation fails.
Next, we need to grab all the files from the current directory. This can be done using the ls! function, which returns a list of file names:
File.ls!
Lastly, we need to filter the resulting list based on the provided function and rename each file, which effectively means moving it to another directory. Here is the final version of the program:
defmodule Copycat do
  def transfer_to(dir, fun) do
    prepare_dir!(dir)

    File.ls! |>
    Stream.filter( &( fun.(&1) ) ) |>
    Enum.each( &(File.rename(&1, "#{dir}/#{&1}")) )
  end

  defp prepare_dir!(dir) do
    unless File.exists?(dir) do
      File.mkdir!(dir)
    end
  end
end
Now let's see rm in action by coding a similar function that is going to remove all files based on a condition. The function will be called in the following way:
Copycat.remove_if fn(file) -> Path.extname(file) == ".csv" end
Here is the corresponding solution:
defmodule Copycat do
  def remove_if(fun) do
    File.ls! |>
    Stream.filter( &( fun.(&1) ) ) |>
    Enum.each( &File.rm!/1 )
  end
end
rm!/1 will raise an error if the file cannot be removed. As always, it has an rm/1 counterpart that will return a tuple with the error's reason if something goes wrong.
You may note that the remove_if and transfer_to functions are very similar. So why don't we remove code duplication as an exercise? I'll add yet another private function that takes all the files, filters them based on the provided condition, and then applies an operation to them:
defp filter_and_process_files(condition, operation) do
  File.ls! |>
  Stream.filter( &(condition.(&1)) ) |>
  Enum.each( &(operation.(&1)) )
end
Now simply utilize this function:
defmodule Copycat do
  def transfer_to(dir, fun) do
    prepare_dir!(dir)
    filter_and_process_files(fun, fn(file) -> File.rename(file, "#{dir}/#{file}") end)
  end

  def remove_if(fun) do
    filter_and_process_files(fun, fn(file) -> File.rm!(file) end)
  end

  # ...
end
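Two things are worth keeping in mind with the code above: File.ls! lists directories as well as files, and File.rename returns an {:error, reason} tuple that is silently discarded. The following is one possible variation, not the article's solution: a hypothetical CopycatSketch module that takes an explicit source directory, skips anything that is not a regular file, builds paths with Path.join, and uses the raising rename!/2. All directory and file names are invented for the demonstration.

```elixir
defmodule CopycatSketch do
  # Hypothetical variation: moves regular files from `source` that
  # satisfy `condition` into `dir`.
  def transfer_to(source, dir, condition) do
    File.mkdir_p!(dir)

    source
    |> File.ls!()
    |> Enum.map(&Path.join(source, &1))
    |> Enum.filter(&(File.regular?(&1) and condition.(&1)))
    |> Enum.each(fn file ->
      # rename!/2 raises instead of quietly returning {:error, reason}.
      File.rename!(file, Path.join(dir, Path.basename(file)))
    end)
  end
end

# Set up a scratch directory with one .txt and one .csv file.
source = Path.join(System.tmp_dir!(), "copycat_demo")
File.rm_rf!(source)
File.mkdir_p!(source)
File.write!(Path.join(source, "a.txt"), "a")
File.write!(Path.join(source, "b.csv"), "b")

texts = Path.join(source, "texts")
CopycatSketch.transfer_to(source, texts, &(Path.extname(&1) == ".txt"))

IO.inspect(File.ls!(texts))              # => ["a.txt"]
IO.inspect(Enum.sort(File.ls!(source)))  # => ["b.csv", "texts"]
```

The "texts" directory shows up in the source listing but is left alone because File.regular?/1 returns false for it.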
Third-Party Solutions
Elixir's community is growing, and fancy new libraries solving various tasks are emerging. The Awesome Elixir GitHub repo lists some popular solutions, and of course there is a section with libraries for working with files and directories. There are implementations for file uploading, monitoring, filename sanitization, and more.
For example, there is an interesting solution called Librex for converting your documents with the help of LibreOffice. To see it in action, you can create a new project:
$ mix new converter
Then add a new dependency to the mix.exs file:
defp deps do
  [{:librex, "~> 1.0"}]
end
After that, run:
$ mix do deps.get, deps.compile
Next, you can include the library and perform conversions:
defmodule Converter do
  import Librex

  def convert_and_remove(dir) do
    convert "some_path/file.odt", "other_path/1.pdf"
  end
end
In order for this to work, the LibreOffice executable (soffice.exe) must be present in the PATH. Otherwise, you'll need to provide a path to this file as a third argument:
defmodule Converter do
  import Librex

  def convert_and_remove(dir) do
    convert "some_path/file.odt", "other_path/1.pdf", "path/soffice"
  end
end
Conclusion
That's all for today! In this article, we've seen the IO, File, and Path modules in action and discussed some useful functions like open, read, write, and others.
There are lots of other functions available for use, so be sure to browse Elixir's documentation. Also, there is an introductory tutorial on the official website of the language that can come in handy as well.
I hope you enjoyed this article and now feel a bit more confident about working with the file system in Elixir. Thank you for staying with me, and until next time!