
I have a list that I would like to cache in Redis. I tried two ways to achieve this using hashes.

Consider the first approach: I create a single hash and store each item as a hash field:

// ..

$apiArray = [..]; // JSON payload from parsing an API

if(!$c->keys('lista')){
    foreach (json_decode($apiArray) as $item){
        $c->hset('lista', $item->id, serialize($item));
    }
}

foreach ($c->hgetall('lista') as $value){

    $item = unserialize($value);

    echo '<p>';
    echo '<strong>id</strong>: '.$item->id.'<br>';
    echo '<strong>name</strong>: '.$item->name.'<br>';
    echo '<strong>email</strong>: '.$item->email.'<br>';
    echo '</p>';
}

Looping over 10,000 items takes 0.5 seconds.

Now consider the second approach: a separate hash for every element of the original array:

if(!$c->keys('lista:*')){
    foreach (json_decode($apiArray) as $item){
        $c->hset('lista:'.$item->id, 'element', serialize($item));
    }
}

foreach ($c->keys('lista:*') as $key) {
    $item = unserialize($c->hget($key, 'element'));

    echo '<p>';
    echo '<strong>id</strong>: '.$item->id.'<br>';
    echo '<strong>name</strong>: '.$item->name.'<br>';
    echo '<strong>email</strong>: '.$item->email.'<br>';
    echo '</p>';
}

Looping over the same 10,000 records takes 3 seconds.

This is very surprising to me, because the second one is the approach covered in the official Redis documentation, and it also supports secondary indexing (by using ZADD and SADD).

Why is it slower than the first approach? Am I doing something wrong?

I think it might be because I have to call the hget() method 10,000 times inside the loop. Can you confirm this?

Should I prefer the first approach?

Thank you guys

M :)

Mauro

2 Answers


The reason the second block of code is slower is that it makes an hget call for each iteration of the loop. So every iteration makes a network round-trip to your Redis server.

By contrast, the first block of code doesn't make any network calls within the loop block. So it runs much faster.
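
For example, with a Predis-style client (which is what $c appears to be in the question), you could collect the keys first and queue all the HGETs in a single pipeline, so the whole loop costs one batched round-trip instead of 10,000. A minimal sketch:

$keys = $c->keys('lista:*');

// Queue every HGET in one pipeline; the commands are sent to Redis in a
// single batch and the replies come back as an array.
$replies = $c->pipeline(function ($pipe) use ($keys) {
    foreach ($keys as $key) {
        $pipe->hget($key, 'element');
    }
});

foreach ($replies as $raw) {
    $item = unserialize($raw);
    echo '<p>'.$item->id.' - '.$item->name.' - '.$item->email.'</p>';
}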

Tague Griffith
  • thanks, that makes sense to me. But now the question is: is there a workaround to avoid calling hget (or scan) in the loop? – Mauro May 15 '17 at 17:11
  • Can you restructure the data so that instead of creating a hash for each element of the array, you just use a String value instead? Then you could fetch multiple items using an mget and save some round-trips (see the sketch after these comments). – Tague Griffith May 15 '17 at 18:04
  • Or use pipelining – Itamar Haber Jun 06 '17 at 23:06
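
A rough sketch of the String + MGET restructuring suggested in the comments, assuming a Predis-style client in $c that accepts an array of keys for mget:

if (!$c->keys('lista:*')) {
    foreach (json_decode($apiArray) as $item) {
        // one plain String per element instead of a one-field hash
        $c->set('lista:'.$item->id, serialize($item));
    }
}

// KEYS is one round-trip; MGET fetches every value in a second one
$keys   = $c->keys('lista:*');
$values = $c->mget($keys);

foreach ($values as $raw) {
    $item = unserialize($raw);
    echo '<p>'.$item->id.' - '.$item->name.' - '.$item->email.'</p>';
}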

It appears that you're interested in caching your entire list, writing and fetching it in bulk every time. In that case, you can just store the entire thing as JSON in a Redis String to get maximal performance.
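
A minimal sketch of that approach, reusing the $c and $apiArray names from the question and assuming $apiArray holds the raw JSON payload:

if (!$c->exists('lista')) {
    // cache the whole JSON payload in a single String key
    $c->set('lista', $apiArray);
}

// one GET brings back the entire list; decode and loop in PHP
foreach (json_decode($c->get('lista')) as $item) {
    echo '<p>'.$item->id.' - '.$item->name.' - '.$item->email.'</p>';
}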

Itamar Haber