Queues With PHPs SplQueue
As developers, we have times when we get to process a lot of data. Sometimes it’s a few kilobytes of data and we don’t need to worry about the performance because computers are so fast. When we’re looking at gigabytes or more of data then we need to be aware of the performance of our algorithms or we might not even be able to complete the process in our lifetime.
In this article, we’ll discuss the queue data structure and how to use queues in PHP.
What Are Data Structures?
Before we go too far into what a queue is, we need to have a small side discussion to discuss data structures.
A data structure is a way of organizing data inside a computer so we can use it effectively.
The overarching idea is that we need to be able to take data and use our code to optimize it so we can process it most efficiently. Depending on the data, there are lots of different ways it can be organized, but the same kinds of problems keep coming up over and over again, so we have a basic set of data structures we can use to solve those problems.
Each of these data structures has different performance characteristics. We’ll use what’s known as the Big O Notation to describe the performance of the data structure with different processes like adding elements, deleting elements, and searching for a specific element as the number of elements gets very large. It can range from O(1) indicating it takes the same amount of time regardless of how many elements exist or O(n) indicating it increases linearly as the number of elements increases up to O(n!) which is something we need to strive to prevent.
There are two major types of data structures.
The first is linear data structures where elements are arranged in a sequence. Examples of this include arrays, lists, stacks, and queues.
The second is non-linear data structures where a linear structure doesn’t work because of operational complexities. Examples include graphs and trees.
Queues In General
Queues are an example of a linear data structure that contains elements that are linked to each other. This is much like a linked list but with the major difference that queues operate in a First In First Out (or FIFO) method. This means that we can only add elements from one end and remove them from the other. This makes them ideal for keeping track of items that need to be processed. Because they’re linked to each other we must access data sequentially and random access is not possible.
Because this is an audio medium, I’m going to stress it’s spelled Q-U-E-U-E. It’s a weird word due to the fact the first letter does all the work and you need to remember the rest.
When we want to insert or delete items into the queue we do so at the end of the queue. Because we’re keeping track of the end of the queue adding an element is O(1). Deleting an element is generally only done at the beginning of the queue.
Because we need to look at each element, in turn, when we’re searching for an element the worst case will have to run through every element in the list so it’s O(n). This isn’t horrible but it’s not the best outcome for a search so if you’re doing a lot of searches, it’s best to look at alternative data structures like a tree.
Depending on your implementation of the queue your using elements are either pushed or enqueued to the queue and they popped or dequeued.
Some benefits of queues are that they do not require contiguous blocks of memory and therefore can help reduce memory fragmentation and they support efficient removal and addition of elements.
Queues In PHP
We must stress to never create your implementation of data structures as it wastes time that could be used to add helpful features to our software. When I was in a data structures class we would spend a week implementing each data structure and if we were lucky it would compile. Someone has already uploaded an implementation of all these data structures to GitHub if it isn’t included in PHP.
Thankfully PHP comes with implementation for queues in the SplQueue class. It provides the basic features we need to have inside a queue without too much extra. SplQueue is implemented using the SplDoublyLinkedList class.
The most important functions are `enqueue()` and `dequeue()` which push new elements into the queue and remove elements respectively. Because it’s a queue it doesn’t have the `shift()` and `unshift()` functions.
Ideally, we would just push elements into the queue and then remove them when we’re ready to process them but we may also need to iterate through the elements to see what’s in the queue. There are two ways to do this.
The first is using a `foreach` loop.
$queue = new SplQueue();
$queue->enqueue("Deanna");
$queue->enqueue("Jamir");
$queue->enqueue("Essence");
foreach ($queue as $value) {
echo $value, PHP_EOL;
}
The other is using some of the other public functions of `SplQueue`. We have the `rewind()` function to reset our iteration to the head of the queue, `valid()` to see if we’ve gone off the end of the queue, `current()` to access the current data, and `next()` to go forward through the queue.
$queue->rewind();
while($queue->valid()){
echo $queue->current(), PHP_EOL;
$queue->next();
}
Foreach is best for almost every case and it’s less code which is generally easier to read.
What we want to do is pop items off as they shouldn’t be needed again (that’s the whole point of the queue). To do that we’ll use the `count()` function to see if there are still elements in the queue.
while($queue->count()) {
$item = $queue->dequeue();
echo $item, PHP_EOL;
}
At the end of this process, the queue will be empty because we popped all the elements off.
Let’s work through an example of how we can create a tool to “loop” through all the files in a directory structure without using recursion.
To start we’ll create a queue and put our search directory into it. For this example, we’ll search the /etc directory.
$toCheck = new SplQueue();
$toCheck->enqueue("/etc");
Next, we’ll enter into a loop where we loop through until we’ve checked all the elements in the queue.
$toCheck = new SplQueue();
$toCheck->enqueue("/etc");
while($toCheck->count()) {
$currentPath = $toCheck->dequeue();
}
I always like to set some kind of a count check to prevent an infinite loop so we’ll add that as well.
$toCheck = new SplQueue();
$toCheck->enqueue("/etc");
$count = 0;
while($count < 10000 && $toCheck->count()) {
$currentPath = $toCheck->dequeue();
$count++;
}
Now, we need to add a check to see if the `$currentPath` is a directory. If it is we’ll get the contents of the directory and add it to the queue for later checks.
$toCheck = new SplQueue();
$toCheck->enqueue("/etc");
$count = 0;
while($count < 10000 && $toCheck->count()) {
$currentPath = $toCheck->dequeue();
if (is_dir($currentPath)) {
$subItems = glob("$currentPath/*");
foreach ($subItems as $newItem) {
$toCheck->enqueue($newItem);
}
} else {
}
$count++;
}
Finally, we can add our logic to process the file.
$toCheck = new SplQueue();
$toCheck->enqueue("/etc");
$count = 0;
while($count < 10000 && $toCheck->count()) {
$currentPath = $toCheck->dequeue();
if (is_dir($currentPath)) {
$subItems = glob("$currentPath/*");
foreach ($subItems as $newItem) {
$toCheck->enqueue($newItem);
}
} else {
// do our processing here
echo $currentPath, PHP_EOL;
}
$count++;
}
The benefit to this solution over recursion is that it’s easier to debug and we could serialize the queue, save it to disk, and restart it later.
What You Need to Know
– Queues are a tool for keeping a linear list of items
– Good performance inserting/deleting
– Poor performance searching
– Useful for keeping track of histories
The post Queues With PHPs SplQueue appeared first on php[architect].