This answer was migrated from a deleted duplicate. Revised to make sense independent of context.
Assume the following sample data (named $items and $select instead of $arr1 and $arr2 for clarity):
// Source data: A multidimensional array with named keys
$items = [
    ['id' => 1, 'name' => 'Foo'],
    ['id' => 3, 'name' => 'Bar'],
    ['id' => 5, 'name' => 'Maz'],
    ['id' => 6, 'name' => 'Wut'],
];
// Filter values: A flat array of scalar values
$select = [1, 5, 6];
Then, how do we extract the $items with an id that matches one of the values in $select? And further, how do we do that in a manner that scales gracefully for larger datasets? Let's look at the possibilities and compare their costs.
1. Optimizing array_filter():
The answer using array_filter certainly gets the job done. However, there's an in_array function call made at each iteration. With small datasets, this is hardly an issue. With larger datasets, repeated function calls inside a loop can result in a significant performance hit. For large loops, then, it pays to "preprocess" the data where possible, so that each iteration can use a cheap language construct in place of a more expensive function call.
How to avoid in_array() in loops?
You can "enable" simple index lookups with array_flip($select), ie. by swapping keys and values, and then using isset (a language construct, not a function!): isset($select[$id]). This performs significantly better than repeated in_array($id, $select) calls for larger datasets: not only is the function call avoided, but in_array also scans the $select array for a match at each iteration (over and over), whereas a key lookup does not. Optimized as follows:
$select = array_flip($select);
$selected_items = array_filter($items, function($item) use ($select) {
    return isset($select[$item['id']]);
});
Or using an arrow function (PHP 7.4+), which captures variables from the parent scope automatically, ie. doesn't need the use statement:
$select = array_flip($select);
$selected_items = array_filter($items, fn($item) => isset($select[$item['id']]));
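One detail worth keeping in mind with either variant: array_filter() preserves the source keys, so the result is not re-indexed from zero. Here is a minimal sketch reusing the sample data above. The name $lookup is only illustrative (it keeps $select itself unmodified), and array_values() is one way to re-index if a plain list is needed:

```php
<?php
$items = [
    ['id' => 1, 'name' => 'Foo'],
    ['id' => 3, 'name' => 'Bar'],
    ['id' => 5, 'name' => 'Maz'],
    ['id' => 6, 'name' => 'Wut'],
];
$select = [1, 5, 6];

// Flip into a separate lookup table so $select stays intact: [1 => 0, 5 => 1, 6 => 2]
$lookup = array_flip($select);

$selected_items = array_filter($items, fn($item) => isset($lookup[$item['id']]));

// array_filter() keeps the source keys: 0, 2, 3 (key 1, 'Bar', was dropped)
print_r(array_keys($selected_items));

// Re-index from zero if a plain list is needed
$selected_list = array_values($selected_items);
print_r(array_keys($selected_list));
```

Flipping into a separate variable also avoids surprises if $select is reused later in the same scope.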
2. Using Key Intersection
One elegant alternative to filtering is key intersection. First, we re-index the array by the desired lookup key using array_column(), with null for the column key (returns the full rows instead of a specific column), and with id for the new index key:
$items_by_id = array_column($items, null, 'id');
This gives you the same source array, but instead of being zero-indexed, it now uses the id column's value for the index key. Then, we're an array_intersect_key away from extracting the selection from the source array:
$selected_items = array_intersect_key($items_by_id, array_flip($select));
Here we flip $select so that its values become keys we can intersect on. Note that array_intersect_key performs better than approaches using array_intersect. (Keys are simple!) The result is as expected. Finally, here's a one-liner (formatted for easy reading) without the throw-away variable:
$selected_items = array_intersect_key(
    array_column($items, null, 'id'),
    array_flip($select)
);
N.B. The resulting array will retain the actual id of each item as its index key, instead of the default zero-indexed keys. Keep that in mind if you cross-reference the selected items with your source array later on in your code, and perhaps index items by the proper ID from the beginning.
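To make the resulting keys concrete, here is a small sketch with the sample data. It also illustrates two side effects that are easy to miss: the result follows the order of $items rather than the order of $select, and rows sharing an id collapse to the last one when re-indexed with array_column(). The $dupes and $by_id names are made up for the demonstration:

```php
<?php
$items = [
    ['id' => 1, 'name' => 'Foo'],
    ['id' => 3, 'name' => 'Bar'],
    ['id' => 5, 'name' => 'Maz'],
    ['id' => 6, 'name' => 'Wut'],
];
$select = [6, 1, 5]; // deliberately not in source order

$selected_items = array_intersect_key(
    array_column($items, null, 'id'),
    array_flip($select)
);

// Keys are the item IDs, in the order they appear in $items: 1, 5, 6
print_r(array_keys($selected_items));

// Caveat: rows sharing an id collapse to the last one when re-indexed
$dupes = [
    ['id' => 1, 'name' => 'Foo'],
    ['id' => 1, 'name' => 'Foo again'],
];
$by_id = array_column($dupes, null, 'id');
print_r($by_id); // a single entry under key 1, holding 'Foo again'
```

If the IDs are not guaranteed to be unique, the array_filter() or foreach variants are the safer choice, since they keep every matching row.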
Comparing these approaches:
array_filter() incurs 1 iteration of $items with 1 (anonymous) function call per array member; and then, if in_array is used to compare the current item's ID with each $select member, up to as many scans of $select as there are items. (Use key lookups instead.)
The answer using array_search in a foreach loop carries the same cost: count($items) function calls, and a whole lot of redundant passes over the selection/filter array.
The array_intersect_key method (1) iterates over $items once (simple re-indexing); (2) iterates over $select once (key/value flip); and (3) walks the keys of the re-indexed array, checking each against the flipped selection. Since PHP arrays are hash tables, each of those key checks is a direct lookup rather than a scan, and as such this is much more efficient than repeated array scans for each value. (This function exists specifically for getting intersections, ie. finding overlaps, after all.)
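The relative costs described above can be checked empirically. The following is a rough timing sketch, not a rigorous benchmark: the data sizes are arbitrary, time_it() is a hypothetical helper written for this example, and hrtime() requires PHP 7.3+ (the arrow functions require 7.4+). Absolute numbers will vary by machine and PHP version, so no expected timings are given.

```php
<?php
// Synthetic data; the sizes are arbitrary assumptions for illustration.
$items = [];
for ($i = 1; $i <= 10000; $i++) {
    $items[] = ['id' => $i, 'name' => "Item $i"];
}
$select = range(1, 10000, 4); // every fourth ID: 2500 values

// Hypothetical helper: runs $fn once, returns [result, elapsed milliseconds].
function time_it(callable $fn): array
{
    $start = hrtime(true);
    $result = $fn();
    return [$result, (hrtime(true) - $start) / 1e6];
}

// Variant A: in_array() scan of $select for every item
[$a, $ms_in_array] = time_it(fn() => array_filter(
    $items,
    fn($item) => in_array($item['id'], $select, true)
));

// Variant B: flip once, then isset() key lookups
[$b, $ms_isset] = time_it(function () use ($items, $select) {
    $lookup = array_flip($select);
    return array_filter($items, fn($item) => isset($lookup[$item['id']]));
});

// Variant C: re-index by id, then intersect keys
[$c, $ms_intersect] = time_it(fn() => array_intersect_key(
    array_column($items, null, 'id'),
    array_flip($select)
));

printf(
    "in_array: %.2f ms\nisset: %.2f ms\nintersect_key: %.2f ms\n",
    $ms_in_array,
    $ms_isset,
    $ms_intersect
);
```

All three variants select the same rows; only the keys of the results differ (source positions for array_filter(), IDs for the intersection).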
3. Good Old Foreach Loop
Of course a good old foreach loop will also work perfectly fine. Again, using array_flip() and isset() index lookups, rather than in_array() or array_search(). As follows:
$select = array_flip($select);
$selected_items = [];
foreach ($items as $key => $val) {
    if (isset($select[$val['id']])) {
        $selected_items[] = $items[$key];
    }
}
I'd instinctively use this for large datasets (or long comparison lists) where "bare bones" performance is called for, going by "simpler is better". However, you likely won't see a big difference between this and the key intersection approach without massive data to process. (If someone has compared these methods for PHP 8.x, please share the benchmark results.)
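For reuse, the loop can be wrapped in a small function. This is only a sketch: select_by_key() is a hypothetical name, and it assumes the lookup values are integers or strings (which is what array_flip() supports). Flipping into a local variable leaves the caller's $select untouched.

```php
<?php
/**
 * Hypothetical helper: returns the rows of $items whose $key value
 * appears in $select. Rows lacking $key are skipped.
 * Assumes the values in $select are ints or strings.
 */
function select_by_key(array $items, array $select, string $key = 'id'): array
{
    $lookup = array_flip($select); // local lookup table
    $selected = [];
    foreach ($items as $item) {
        // isset() checks its arguments left to right and stops at the first unset one
        if (isset($item[$key], $lookup[$item[$key]])) {
            $selected[] = $item;
        }
    }
    return $selected;
}

$items = [
    ['id' => 1, 'name' => 'Foo'],
    ['id' => 3, 'name' => 'Bar'],
    ['id' => 5, 'name' => 'Maz'],
    ['id' => 6, 'name' => 'Wut'],
];
$select = [1, 5, 6];

$result = select_by_key($items, $select);
print_r(array_column($result, 'name')); // Foo, Maz, Wut
```

Unlike the key intersection approach, the result here is zero-indexed and keeps every matching row, even if several rows share an ID.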