There are two ways of executing the tasks in any program that you write. The tasks will either be executed one after the other sequentially or they will be executed in parallel without waiting for a previous task to complete. The former method of task execution is called synchronous execution and the latter is called asynchronous execution.
Sometimes, the tasks or instructions need to execute sequentially like when you are extracting the headings from a scraped webpage. The scraping of the webpage has to happen before any extraction takes place.
However, there are situations where you might want to execute tasks asynchronously. For example, let's say you want to extract the headings from 20 different webpages. Instead of waiting for the scraping and extraction of one page to complete before proceeding with the next one, you can run multiple requests in parallel without waiting for the first request to complete.
In this tutorial, we will learn how to perform multiple tasks in parallel in PHP by using the Spatie async library.
Setting Spatie Up on Windows
The Spatie async library provides an easy-to-use wrapper around PHP's PCNTL extension. However, the PCNTL extension is not available on Windows. This means that you can only run code asynchronously with the library in a Unix-like environment.
Luckily, it is easy to work around this issue by installing Linux on Windows with WSL. Don't worry, it sounds way more complicated than it actually is. All you need to do is open PowerShell or Windows Command Prompt in administrator mode and execute the following command.
wsl --install
The above command will install Ubuntu as the default Linux distribution, which is fine for our purpose. Once the installation process has completed, you can open Ubuntu from the Start menu and provide a username and password. This new Linux account will be considered an administrator and will be able to run sudo administrative commands.
I would recommend that you install Visual Studio Code if it isn't already installed. Inside Visual Studio Code, you should also consider installing the Remote WSL extension to make it easy for you to edit files located in WSL or the Windows filesystem without worrying about any cross-platform issues.
Now, you should run the following command while you are in the Ubuntu environment.
code .
This will install a shim server that makes it possible for WSL and VS Code to communicate with each other. You will also need to install Composer to make it easier to install and update libraries.
Once the development environment is set up, you can create a tasks directory while inside Ubuntu by running the following command:
mkdir tasks
Now run the change directory command to enter the tasks directory.
cd tasks
Inside the tasks directory, we can finally install the spatie/async package by running the following command:
composer require spatie/async
Checking for Successful Installation
Let's say that you are using this library in an environment where the PHP PCNTL extension is not installed. In that case, the library will automatically execute the code synchronously as a fallback.
One way to check whether we are running the code in an environment that supports asynchronous processes is to use the isSupported() method from the library, which returns a Boolean value. The return value is true if the code can run asynchronously.
Create a file called test.php inside the tasks directory and add the following code to it.
<?php
require_once('vendor/autoload.php');

use Spatie\Async\Pool;

$pool = Pool::create();

if($pool->isSupported()) {
    echo 'We can run asynchronous code!';
} else {
    echo 'Something is wrong!';
}
?>
If everything was set up properly, you should get We can run asynchronous code! as output when running the above code.
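If you see the other message instead, plain PHP can give you a hint about what is missing. The sketch below rests on an assumption about which capabilities matter (the pcntl and posix extensions, plus proc_open() for spawning child processes). It is not the library's own implementation of isSupported(), and the helper name missing_async_requirements() is made up for this example.

```php
<?php
// Hypothetical helper: lists the capabilities that appear to be missing.
// Assumption: asynchronous execution needs pcntl, posix, and proc_open().
function missing_async_requirements(): array
{
    $missing = [];

    if (!function_exists('pcntl_async_signals')) {
        $missing[] = 'pcntl extension';
    }
    if (!function_exists('posix_kill')) {
        $missing[] = 'posix extension';
    }
    if (!function_exists('proc_open')) {
        $missing[] = 'proc_open() function';
    }

    return $missing;
}

$missing = missing_async_requirements();

if ($missing === []) {
    echo 'Nothing obvious is missing.' . PHP_EOL;
} else {
    echo 'Missing: ' . implode(', ', $missing) . PHP_EOL;
}
```

Running this inside Ubuntu on WSL and again in a native Windows terminal should make the difference between the two environments visible.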
Executing Requests in Parallel
The library uses the symfony/process component to create and manage different child processes. Since the library can create multiple child processes, it is able to execute PHP scripts in parallel. This allows you to run multiple independent synchronous tasks in parallel and significantly reduce the time it takes to complete them all.
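To get a feel for what running work in child processes means, here is a minimal sketch that uses only PHP's built-in proc_open() function. It is not how spatie/async or symfony/process is implemented. It assumes PHP 7.4 or later running from the command line, so that PHP_BINARY points at the php executable. All three children are started before any output is read, so their pauses overlap instead of adding up.

```php
<?php
// Start three child PHP processes without waiting for any of them to finish.
$children = [];

for ($i = 1; $i <= 3; $i++) {
    // Each child sleeps for 0.2 seconds and then reports back on stdout.
    $code = "usleep(200000); echo 'child $i done';";
    $pipes = [];
    $handle = proc_open([PHP_BINARY, '-r', $code], [1 => ['pipe', 'w']], $pipes);

    if (!is_resource($handle)) {
        throw new RuntimeException("Could not start child $i.");
    }

    $children[] = ['handle' => $handle, 'stdout' => $pipes[1]];
}

// Only now do we block, collecting the output of each child in turn.
$outputs = [];

foreach ($children as $child) {
    $outputs[] = stream_get_contents($child['stdout']);
    fclose($child['stdout']);
    proc_close($child['handle']);
}

echo implode(PHP_EOL, $outputs) . PHP_EOL;
```

The library wraps this kind of bookkeeping for you and also lets you pass closures instead of code strings.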
One thing you need to be careful about when running processes in parallel is not to spawn too many of them at once. This can result in an unexpected application crash.
Luckily, spatie/async takes care of this with some helper methods of the Pool class. The add() method can handle as many processes as you want by scheduling and running them optimally.
Different processes will take different amounts of time to complete. It is ideal to wait for all the processes in a pool to finish before continuing further, so that no child process is accidentally killed. This task is handled by the wait() method.
Let's say you want to execute some other code after a particular child process has finished and triggered a success event. You can do so with the help of the then() function.
We will now write some code that will create 10 different text files. For comparison, we will begin by writing the code so that it runs synchronously and later update it to run asynchronously.
Here is the synchronous code:
<?php
for($i = 1; $i <= 10; $i++) {
    $file_name = "file_$i.txt";
    $content = bin2hex(random_bytes(2048));

    file_put_contents($file_name, $content);

    echo "Generated file: $file_name".PHP_EOL;
}
?>
The above code gives the following output:
Generated file: file_1.txt
Generated file: file_2.txt
Generated file: file_3.txt
Generated file: file_4.txt
Generated file: file_5.txt
Generated file: file_6.txt
Generated file: file_7.txt
Generated file: file_8.txt
Generated file: file_9.txt
Generated file: file_10.txt
The content of each file is just a random hexadecimal string that is 4096 bytes long. Here is an example:
841bda21ae704ecd05ad64ccb4fb029c6c6e8bc590eda828e2080d9f9f842c1f39883fd8e837325655184219ed92d3a9ca356b96c4a0edeb751d7270f8c1b3b949975ab9786289870a3f3cb7501..... and so on
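The length follows from the fact that bin2hex() writes every byte as two hexadecimal characters, so 2048 random bytes become 4096 characters. A quick check that does not depend on the async library:

```php
<?php
// 2048 random bytes, each encoded as two hexadecimal characters.
$content = bin2hex(random_bytes(2048));

echo strlen($content) . PHP_EOL; // 4096

// Confirm that the string only contains lowercase hexadecimal digits.
$is_hex = preg_match('/^[0-9a-f]+$/', $content) === 1;
var_dump($is_hex); // bool(true)
```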
We will now rewrite the code so that it runs asynchronously. Here is what it will look like:
<?php
require_once('vendor/autoload.php');

use Spatie\Async\Pool;

$pool = Pool::create();

for($i = 1; $i <= 10; $i++) {
    $pool->add(function() use ($i) {
        $file_name = "file_$i.txt";
        $content = bin2hex(random_bytes(2048));

        file_put_contents($file_name, $content);

        return $file_name;
    })->then(function ($file_name) {
        echo "Generated file: $file_name".PHP_EOL;
    });
}

$pool->wait();
?>
The above code will generate the following output:
Generated file: file_5.txt
Generated file: file_6.txt
Generated file: file_1.txt
Generated file: file_9.txt
Generated file: file_8.txt
Generated file: file_2.txt
Generated file: file_4.txt
Generated file: file_3.txt
Generated file: file_10.txt
Generated file: file_7.txt
As you can see, the files are not generated in sequential order when we execute the code asynchronously. In other words, file_5.txt did not have to wait for file_1.txt to be generated. We output the name of the file inside the then() function as soon as its success event gets triggered.
An alternative to using the add() and wait() methods is to use the async() and await() functions. Our code will look like this with the use of these functions:
<?php
require_once('vendor/autoload.php');

use Spatie\Async\Pool;

$pool = Pool::create();

for($i = 1; $i <= 10; $i++) {
    $pool[] = async(function() use ($i) {
        $file_name = "file_$i.txt";
        $content = bin2hex(random_bytes(2048));

        file_put_contents($file_name, $content);

        return $file_name;
    })->then(function ($file_name) {
        echo "Generated file: $file_name".PHP_EOL;
    });
}

await($pool);
?>
Using Event Listeners
In the previous section, we created a lot of child processes and added them to our pool to execute asynchronously. Different processes inside the pool run independently of each other. This means that we needed some way to figure out when a particular task has completed. The success event is triggered when a task has executed successfully. At this point, we are free to execute some other piece of code by using the then() function.
However, processes are not always going to execute successfully. In some cases, they will either fail or time out without completing the task at hand. You can handle exceptions by providing a callback with the catch() function and timeouts by providing a callback with the timeout() function.
Let's use all these concepts together to write some code that tests the Collatz conjecture. The conjecture tells us that if you repeatedly replace an even number with its half and an odd number with 3 times itself plus 1, you will ultimately end up at 1. For example, the sequence for 14 will be 14 > 7 > 22 > 11 > 34 > 17 > 52 > 26 > 13 > 40 > 20 > 10 > 5 > 16 > 8 > 4 > 2 > 1.
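Before bringing the async library into it, the rule can be sketched as a plain synchronous function. The name collatz_sequence() is made up for this example. It uses intdiv() so that the values stay integers, and it is only meant for starting numbers small enough that 3 times the value plus 1 still fits in a PHP integer.

```php
<?php
// Hypothetical helper: returns the full Collatz sequence starting at $n.
function collatz_sequence(int $n): array
{
    if ($n < 1) {
        throw new InvalidArgumentException('Only positive integers are supported.');
    }

    $sequence = [$n];

    while ($n !== 1) {
        // Halve even numbers; triple odd numbers and add one.
        $n = ($n % 2 === 0) ? intdiv($n, 2) : 3 * $n + 1;
        $sequence[] = $n;
    }

    return $sequence;
}

$sequence = collatz_sequence(14);

echo implode(' > ', $sequence) . PHP_EOL;
echo (count($sequence) - 1) . ' steps' . PHP_EOL; // 17 steps
```

The number of steps is one less than the number of terms, because the starting number itself does not count as a step.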
We will run ten iterations in our code where we will pick one random number with each pass. Since the conjecture only deals with positive numbers, we will throw an exception whenever the random number is less than 1. Here is our code:
<?php
require_once('vendor/autoload.php');

use Spatie\Async\Pool;

$pool = Pool::create();

for($i = 0; $i < 10; $i++) {
    $pool->add(function() use ($i) {
        $orig_num = $num = mt_rand(-10000, 100000);

        // Force a very large starting number on the first pass.
        if($i == 0) {
            $orig_num = $num = 75128138247;
        }

        $count = 0;

        if($num < 1) {
            throw new Exception("Conjecture not applicable on $orig_num.");
        }

        while($num != 1) {
            if($num%2 == 0) {
                $num /= 2;
            } else {
                $num = 3*$num + 1;
            }
            $count++;
        }

        return [$orig_num, $count];
    })->then(function ($output) {
        echo "".$output[0]." reduced to 1 in ". $output[1] ." steps.". PHP_EOL;
    })->catch(function($e) {
        echo "Caught Exception ". $e->getMessage() . PHP_EOL;
    })->timeout(function() {
        echo "Process took too long \n";
    });
}

// Wait for every child process to finish before the script ends.
$pool->wait();
?>
Since the conjecture states that every positive number will eventually become 1, our code will ultimately exit the while loop and return the original number as well as the number of iterations it took to reach 1. We also throw an exception if the number is less than 1 because the conjecture only applies to positive numbers.
Try running the code a few times and you are sure to run into exceptions. Here is my output:
47443 reduced to 1 in 75 steps.
75128138247 reduced to 1 in 1228 steps.
44961 reduced to 1 in 62 steps.
28545 reduced to 1 in 59 steps.
53756 reduced to 1 in 246 steps.
Caught Exception Conjecture not applicable on -8059.
39324 reduced to 1 in 106 steps.
Caught Exception Conjecture not applicable on -7991.
97972 reduced to 1 in 190 steps.
71809 reduced to 1 in 94 steps.
You might have noticed that we passed a very large number during the first iteration of the loop. It took 1228 steps to reach 1. However, it was still fast enough to escape the timeout condition.
Pool Configuration Options
Let's say you are doing something where you either want results within a certain time or want to abandon the task at hand. For example, you only want to calculate the steps if it takes less than 0.01 seconds to complete them. How do you enforce that constraint?
This is where pool configuration options prove helpful. There are four useful methods available to you.
concurrency() determines the maximum number of processes that can run simultaneously. This is set to 20 by default.
timeout() determines how long a process can run inside the pool before it times out. The default value is 300 seconds.
sleepTime() determines how frequently the loop should check the status of a process. The default value is 50000 microseconds.
autoload() specifies the autoloader that should be used by the different sub processes.
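Put together, a pool configured with all four options might look like the sketch below. The values shown are simply the defaults listed above, and the autoloader path assumes the vendor directory sits next to the script, as it does in our tasks directory.

```php
<?php
require_once('vendor/autoload.php');

use Spatie\Async\Pool;

// Each configuration method returns the pool, so the calls can be chained.
$pool = Pool::create()
    ->concurrency(20)      // at most 20 processes at the same time
    ->timeout(300)         // seconds a single process may run
    ->sleepTime(50000)     // microseconds between status checks
    ->autoload(__DIR__ . '/vendor/autoload.php');
```

Since these are the defaults, you would normally only call the methods whose values you actually want to change.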
In our case, we will set the timeout value to 0.01 seconds. All we need to do is add the following line before creating our loop.
$pool->timeout(0.01);
If you re-run the code from the previous section with this one modification, you will notice that some numbers now time out before reaching the value 1. In real life, you can use this option to end processes such as reading the contents of a very large file if they take too long.
Final Thoughts
We discussed a lot of concepts in this tutorial. We began by learning how parallel processing and running code asynchronously can help us do things faster. After that, we learned how to set up WSL in Windows in order to use the async library. Once the setup was successful, we saw how to create multiple files with parallel processing.
Finally, we learned about different event listeners and how to use pool configuration options to make sure that our processes run under certain constraints. For practice, you should try figuring out how to run multiple processes in parallel in order to quickly edit images in PHP.