2

I have a long running PHP daemon with a collection class that extends ArrayIterator. This holds a set of custom Column objects, typically less than 1000. Running it through the xdebug profiler I found my find method consuming about 35% of the cycles.

How can I internally iterate over the items in an optimized way?

class ColumnCollection extends \ArrayIterator
{
    public function find($name)
    {
        $return = null;
        $name = trim(strtolower($name));
        $this->rewind();
        while ($this->valid()) {
            /** @var Column $column */
            $column = $this->current();
            if (strtolower($column->name) === $name) {
                $return = $column;
                break;
            }
            $this->next();
        }
        $this->rewind();

        return $return;
    }
}
Matt S
  • 14,976
  • 6
  • 57
  • 76

2 Answers2

2

Your find() method apparently just returns the first Column object with the queried $name. In that case, it might make sense to index the Array by name, e.g. store the object by it's name as the key. Then your lookup becomes a O(1) call.

ArrayIterator implements ArrayAccess. This means you can add new items to your Collection like this:

$collection = new ColumnCollection;
$collection[$someCollectionObject->name] = $someCollectionObject;

and also retrieve them via the square bracket notation:

$someCollectionObject = $collection["foo"];

If you don't want to change your client code, you can simply override offsetSet in your ColumnCollection:

public function offsetSet($index, $newValue)
{
    if ($index === null && $newValue instanceof Column) {
        return parent::offsetSet($newValue->name, $newValue);
    }
    return parent::offsetSet($index, $newValue);
}

This way, doing $collection[] = $column would automatically add the $column by name. See http://codepad.org/egAchYpk for a demo.

If you use the append() method to add new elements, you just change it to:

public function append($newValue)
{
    parent::offsetSet($newValue->name, $newValue);
}

However, ArrayAccess is slower than native array access, so you might want to change your ColumnCollection to something like this:

class ColumnCollection implements IteratorAggregate 
{
    private $columns = []; // or SplObjectStorage

    public function add(Column $column) {
        $this->columns[$column->name] = $column;
    }

    public function find($name) {
        return isset($this->data[$name]) ? $this->data[$name] : null;
    }

    public function getIterator()
    {
        return new ArrayIterator($this->data);
    }
}
Gordon
  • 312,688
  • 75
  • 539
  • 559
1

I replaced the iterator method calls with a loop on a copy of the array. I presume this gives direct access to the internal storage since PHP implements copy-on-write. The native foreach is much faster than calling rewind(), valid(), current(), and next(). Also pre-calculating the strtolower on the Column object helped. This got performance down from 35% of cycles to 0.14%.

public function find($name)
{
    $return = null;
    $name = trim(strtolower($name));
    /** @var Column $column */
    foreach ($this->getArrayCopy() as $column) {
        if ($column->nameLower === $name) {
            $return = $column;
            break;
        }
    }

    return $return;
}

Also experimenting with @Gordon's suggestion of using an array keyed on name instead of using the internal storage. The above is working well for a simple drop-in replacement.

Matt S
  • 14,976
  • 6
  • 57
  • 76