
I'd like to make a website where users can find recipes by ingredients. How do I check how much an array matches another in percent?

To be more specific:

Pizza 1 contains cheese, tomato, ham, pineapple and beef.
Pizza 2 contains cheese, tomato, bacon, chili and salad.
Pizza 3 contains tomato, salad and pineapple.

If I am looking for a pizza that contains cheese and tomato, then pizzas 1 and 2 match 100%, while pizza 3 matches 50%. If I am looking for a pizza that contains tomato, all of them match 100%. If I am looking for a pizza that contains cheese, tomato and ham, then pizza 1 matches 100%, pizza 2 matches 66.67% and pizza 3 matches 33.33%. That will be easy to implement.

But what if I am looking for a pizza that contains cheese, tomato, ham, bacon, beef and pineapple? That is more ingredients in my wishlist than any of the pizzas contains. Pizza 1 would match the most and pizza 3 the least. But how much would each pizza match in percent? More importantly: how do I code that in PHP? array_intersect()? array_diff()? A combination? Something else?

And what if it gets more complicated: I want a pizza that contains cheese, ham and bacon, but no pineapple. How would I do that?

I imagine I have a few arrays for the pizzas:

$pizza[0] = ['cheese', 'tomato', 'ham', 'beef', 'pineapple'];
$pizza[1] = ['cheese', 'tomato', 'bacon', 'chili', 'salad'];
$pizza[2] = ['tomato', 'salad', 'pineapple'];

Additionally, I have an $included_wish and an $excluded_wish:

$included_wish = ['cheese', 'ham', 'bacon'];
$excluded_wish = ['pineapple'];
mickmackusa
  • It can be more easily solved in mysql, see http://stackoverflow.com/questions/31214144/organize-database-multichoice-columns-by-cronjob – Mahdyfo Jul 09 '15 at 20:36

1 Answer


There are a few moving parts here, but none of them are terribly difficult.

To filter out pizzas that contain any excluded_wish topping, use array_intersect(): if it finds at least one exclusion, skip the pizza entirely so that it takes no part in the later steps.
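
As a minimal sketch of that exclusion check: array_intersect() returns the values of its first argument that also occur in the second, and an empty result means no unwanted topping was found. The helper name isExcluded() is made up for illustration and is not part of the code further down.

```php
<?php
// Hypothetical helper: true when the pizza contains at least one
// topping from the exclusion list.
function isExcluded(array $pizza, array $excludedWish): bool
{
    // An empty intersection means none of the excluded toppings is present.
    return array_intersect($excludedWish, $pizza) !== [];
}

var_dump(isExcluded(['tomato', 'salad', 'pineapple'], ['pineapple'])); // bool(true)
var_dump(isExcluded(['bacon', 'ham', 'cheese'], ['pineapple']));       // bool(false)
```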

To calculate how well each pizza satisfies the included_wish array, use array_intersect() again, count how many wishes were found, and divide by the total number of wishes.
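
Here is a small sketch of that calculation, applied to the six-topping wish list from the question. The divisor is always the number of wishes, so a wish list longer than any pizza is no special case and a pizza can never score above 100%. wishPercent() is a hypothetical helper. The sketch assumes no topping appears twice in the wish list, and treating an empty wish list as a full match is my own assumption.

```php
<?php
// Hypothetical helper: share of wished toppings found on the pizza, 0..100.
function wishPercent(array $pizza, array $includedWish): float
{
    if ($includedWish === []) {
        return 100.0; // assumption: an empty wish list is trivially satisfied
    }
    // Count the wishes that are present on the pizza, then scale to percent.
    return count(array_intersect($includedWish, $pizza)) / count($includedWish) * 100;
}

$wish = ['cheese', 'tomato', 'ham', 'bacon', 'beef', 'pineapple'];

printf("%.2f%%\n", wishPercent(['cheese', 'tomato', 'ham', 'beef', 'pineapple'], $wish));   // 83.33%
printf("%.2f%%\n", wishPercent(['cheese', 'tomato', 'bacon', 'chili', 'salad'], $wish));    // 50.00%
printf("%.2f%%\n", wishPercent(['tomato', 'salad', 'pineapple'], $wish));                   // 33.33%
```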

To help with sorting, I also record a "surplus topping" count, derived from an array_diff() call: the number of toppings on the pizza that were not wished for.
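
A quick illustration of the surplus count, using one of the pizzas from the question. Argument order matters here: array_diff() returns the values of the first array that are absent from the second. The variable names are only for this sketch.

```php
<?php
$pizza = ['cheese', 'tomato', 'bacon', 'chili', 'salad'];
$includedWish = ['cheese', 'ham', 'bacon'];

// Toppings on the pizza that nobody asked for.
$surplus = array_diff($pizza, $includedWish);

echo implode(', ', $surplus), "\n"; // tomato, chili, salad
echo count($surplus), "\n";         // 3
```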

Sort only after all of the filtering is finished. Compare arrays inside a usort() callback to sort by percentage descending, then by surplus toppings ascending.
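
The array comparison can be tried in isolation. PHP compares two arrays of equal length element by element, so swapping $a and $b in the first slot makes that column descending while the second stays ascending. The rows below are made-up sample data, and the arrow function needs PHP 7.4 or later.

```php
<?php
// Sample rows (invented for this sketch) with the two sort columns.
$rows = [
    ['perc' => 50,  'additional' => 1],
    ['perc' => 100, 'additional' => 2],
    ['perc' => 50,  'additional' => 0],
    ['perc' => 100, 'additional' => 0],
];

// 'perc' is taken from $b on the left side (DESC),
// 'additional' is taken from $a on the left side (ASC).
usort(
    $rows,
    fn(array $a, array $b): int => [$b['perc'], $a['additional']] <=> [$a['perc'], $b['additional']]
);

foreach ($rows as $row) {
    printf("%d%% with %d surplus\n", $row['perc'], $row['additional']);
}
// 100% with 0 surplus
// 100% with 2 surplus
// 50% with 0 surplus
// 50% with 1 surplus
```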

Code:

$pizzas = [
    ['cheese', 'tomato', 'ham', 'beef', 'pineapple'],
    ['bacon', 'ham', 'cheese'],
    ['cheese', 'tomato', 'bacon', 'chili', 'salad'],
    ['tomato', 'salad', 'pineapple'],
    ['tomato', 'beef', 'bacon'],
    ['cheese', 'tomato', 'bacon', 'chili', 'salad', 'anchovies'],
    ['tomato', 'pineapple', 'bacon', 'cheese', 'ham'],
];

$included_wish = ['cheese', 'ham', 'bacon'];
$excluded_wish = ['pineapple'];

$result = [];
$wishCount = count($included_wish);
foreach ($pizzas as $pizza) {
    // Skip any pizza that contains at least one excluded topping.
    if (array_intersect($excluded_wish, $pizza)) {
        continue;
    }
    $result[] = [
        // Share of wished toppings found on this pizza, as a percentage.
        'perc' => count(array_intersect($included_wish, $pizza)) / $wishCount * 100,
        // Number of toppings on the pizza that were not wished for.
        'additional' => count(array_diff($pizza, $included_wish)),
        'ingredients' => implode(', ', $pizza),
    ];
}

// Sort by percentage DESC, then by surplus toppings ASC.
usort(
    $result,
    fn($a, $b) => [$b['perc'], $a['additional']] <=> [$a['perc'], $b['additional']]
);
foreach ($result as $row) {
    vprintf("wished: %.2f%%, surplus toppings: %d : %s\n", $row);
}

Output:

wished: 100.00%, surplus toppings: 0 : bacon, ham, cheese
wished: 66.67%, surplus toppings: 3 : cheese, tomato, bacon, chili, salad
wished: 66.67%, surplus toppings: 4 : cheese, tomato, bacon, chili, salad, anchovies
wished: 33.33%, surplus toppings: 2 : tomato, beef, bacon
mickmackusa