9

I am using an SplHeap to hold graph nodes of a tree with directed edges that will be traversed from the leaves to the root. For this, I precalculate the "fan-in" of nodes and put them into the heap so that I can always retrieve the node with the smallest fan-in (0) from it.

After visiting a node, I reduce the fan-in of its successor by 1. Then obviously, the heap needs to be recalculated because the successor is now in the wrong place there. I have tried recoverFromCorruption(), but it doesn't do anything and keeps the heap in the wrong order (node with larger fanIn stays in front of smaller fanIn).

As a workaround, I'm now creating a new heap after each visit, amounting to a full O(N*log(N)) sort each time.

It should be possible, however, to make up-heap operations on the changed heap entry until it's in the right position in O(log(N)).

The API for SplHeap doesn't mention an up-heap (or deletion of an arbitrary element - it could then be re-added). Can I somehow derive a class from SplHeap to do this or do I have to create a pure PHP heap from scratch?

EDIT: Code example:

class VoteGraph {
    private $nodes = array();

    private function calculateFanIn() { /* ... */ }

    // ...

    private function calculateWeights() {
        $this->calculateFanIn();
        $fnodes = new GraphNodeHeap(); // heap by fan-in ascending (leaves are first)

        foreach($this->nodes as $n) {
            // omitted: filter loops
            $fnodes->insert($n);
        }

        // traversal from leaves to root
        while($fnodes->valid()) {
            $node = $fnodes->extract(); // fetch a leaf from the heap
            $successor = $this->nodes[$node->successor];
            // omitted: actual job of traversal
            $successor->fanIn--; // will need to fix heap (sift up successor) because of this

            //$fnodes->recoverFromCorruption(); // doesn't work for what I want
            // workaround: rebuild $fnodes from scratch
            $fixedHeap = new GraphNodeHeap();
            foreach($fnodes as $e)
                $fixedHeap->insert($e);
            $fnodes = $fixedHeap;
        }
    }
}

class GraphNodeHeap extends SplHeap {
    public function compare($a, $b) {
        if($a->fanIn === $b->fanIn)
            return 0;
        else
            return $a->fanIn < $b->fanIn ? 1 : -1;
    }
}

Complete code also available: https://github.com/md2k7/civicracy/blob/master/civi-php/protected/components/VoteGraph.php#L73

EDIT 2:

$this->putNode(new GraphNode(4));
$this->putNode(new GraphNode(1, 2));
$this->putNode(new GraphNode(3, 2));
$this->putNode(new GraphNode(2, 4));

This means user 1 and user 3 are voting for user 2, and user 2 is voting for user 4, passing on 3 votes (2 received + his/her own). This is called delegative voting: my algorithm is passing on votes "from the bottom" (the leaves) where I already know how much weight (responsibility/representation/as you like it...) each user has.

cidermole
  • 5,662
  • 1
  • 15
  • 21
  • 3
    It would be good if you could create a simplified code example with your question that demonstrates the issue. Congrats for your first question btw. – hakre Nov 09 '12 at 09:49
  • I would need some example data otherwise I'm not sure what this is doing. I'm not so good with academical stuff like graphs and so on, but if I see the values it's often straight forward. Would also help me to learn some terms ;) – hakre Nov 09 '12 at 20:43
  • Instead of the current VoteGraph constructor, you could imagine the following example data being set there (see **EDIT 2** above, code formatting is broken in comments): This means user 1 and user 3 are voting for user 2, and user 2 is voting for user 4 (who isn't voting). This is called delegative voting: my algorithm is passing on votes "from the bottom" (the leaves) where I already know how much weight (responsibility/representation/as you like it...) each user has. – cidermole Nov 10 '12 at 22:33
  • 1
    I took a look through [spl_heap.c](https://github.com/php/php-src/blob/master/ext/spl/spl_heap.c) and I don't see anything that could be used(or abused) to efficiently re-heapify. I'm suprised...I don't think that's really an obscure heap feature. – goat Nov 11 '12 at 22:33
  • thinking about the problem more, i see the difficulties involved in finding the topic element to change its key. the search for it would be linear unless additional info was maintained internally, adding memory and time overhead. – goat Nov 13 '12 at 00:35

2 Answers2

1

Can't you just re-insert the updated node after you change the value?

$successor->fanIn--;
$fnodes->insert($updatedNode);

Doesn't insert force a re-sort of the Heap? That will be a lower order then creating a new one.

Andrew
  • 12,617
  • 1
  • 34
  • 48
  • you'll get duplicate entries. but, this is still a good idea because checking/filtering for dupes might be a reasonable compromise. – goat Nov 13 '12 at 17:05
1

I was solving very similar problem recently, it seems like SPL doesn't support updates. So

I had to write my own heap.

It isn't extra efficient, but it does what I need and it's much faster than sorting an array repeatedly... SPL heap is still much faster though...

here it is...

class heap
{
    public $members=array();

    // these two are just for statistics
    private $swaps=0; 
    private $recurs=array('lups'=>0, 'ldowns'=>0);

    public function insert($val){

        if(is_array($val) && empty($this->members)){ // because heapify is (in theory) more efficient
            foreach($val as $v){
                $this->members[]=$v;
            }
            $this->heapify();
        }else{
            $emptyPosition=count($this->members);  // count(members) gets index of first empty position, not last key
            $this->members[]=$val; // puts $val in
            $this->ladderup($emptyPosition);
        }
    }

    public function heapify(){
    /* in case all the heap is broken, we can always use this to repair it.
     It should be more efficient to fill $members randomly and "repair" it with heapify after,
     than inserting things one by one*/

        $start=max(0, floor( (count($this->members)-1)/2)); // find last parent
        for($i=$start;$i>=0;$i--){
            $this->ladderdown($i);
        }
    }

    private function ladderdown($index){
    // recursively sifts down $index
        $this->recurs['ldowns']++;

        /*
        indexes of children
        they are stored at  parent_position*2 and parent_position*2+1
        but becouse php uses null-based array indexing, we have to modify it a little
        */  
        $iA=$index*2+1; 
        $iB=$index*2+2;

        if($iA<count($this->members)){ // check if children exist
            if($iB<count($this->members)){
                if($this->compare($iA, $iB)>=0) $bigger=$iA; // if both exist, compare them, cause we want to swap with the bigger one ; I'm using ">=" here, that means if they're equal, left child is used
                else $bigger=$iB;
            }else{
                $bigger=$iA; // if only one children exists, use it
            }

            if($this->compare($bigger, $index)>0){ // not using ">=" here, there's no reason to swap them if they're same
                $this->swap($bigger, $index);
                $this->ladderdown($bigger); // continue with $bigger because that's the position, where the bigger member was before we swap()ped it 
            }
        }
    }

    private function ladderup($index){
    // sift-up, 
        $this->recurs['lups']++;

        $parent=max(0, floor( ($index-1)/2)); // find parent index; this way it actualy swaps one too many times: at the end of sift-up-ing swaps the root with itself
        if($this->compare($index, $parent)>0){
            $this->swap($index, $parent);
            $this->ladderup($parent);
        }
    }

    public function root(){
        if(count($this->members)){
            return $this->members[0];
        }
        return false;   
    }

    public function extract(){
    // removes and returns root member
        if(!count($this->members)) return false;

        $this->swap(0,count($this->members)-1); // swaps root with last member
        $result=array_pop($this->members); // removes last member (now root)
        $this->ladderdown(0); // root is on wrong position, sifts it down
        return $result;
    }

    public function update($index, $value){
        if($index<count($this->members)){
            $this->members[$index]=$value;
            $this->ladderup($index);
            $this->ladderdown($index);
        }
    }

    public function delete($index){
    // removes index from heap the same way as root is extracted
        $this->swap(count($this->members)-1, $index); // swaps index with last one
        array_pop($this->members);
        $this->ladderup($index);
        $this->ladderdown($index);
    }

    private function swap($iA, $iB){
    // swaps two members
        $this->swaps++;

        $swap=$this->members[$iA];
        $this->members[$iA]=$this->members[$iB];
        $this->members[$iB]=$swap;
    }

    private function compare($iA, $iB){
        $result=$this->members[$iA] - $this->members[$iB];
        return $result;
    }

    public function stats($text=""){
     // prints and resets statistics
        echo "STATS: $text... Sift-ups: ".$this->recurs['lups']." Sift-downs: ".$this->recurs['ldowns']." Swaps: ".$this->swaps." <br>";
        $this->recurs=array('lups'=>0, 'ldowns'=>0);
        $this->swaps=0;
    }
}

//here's how to use it...

$h=new heap;

for($i=0; $i<10000; $i++){
    $h->insert(rand(1,1000));
}
$h->stats("after inserting one-by-one");

while($biggest=$h->extract()); // note that $h->extract might return FALSE, but might return zero as well, if there was zero in the heap

$h->stats("after extracting all roots (like in heapsort)");

echo "Now, heap is empty. Let's try whole array at once <br>";

for($i=0; $i<10000; $i++){
    $a[]=rand(1,1000);
}
$h->insert($a); // inserting whole array here, so heap will use more efficient heapify()
$h->stats("after heapify");

echo "let's update two indexes<br>";

$h->update(1234,44444);// sure on top
$h->stats("after update");
$h->update(8888,40000);// second place
$h->stats("after update");

echo "extract biggest three indexes<br>";

echo $h->extract()." - this should be 44444<br>";
echo $h->extract()." - this should be 40000<br>";
echo $h->extract()." - this should be biggest number given by rand(1,1000)<br>";

$h->stats("after three extracts");

while($h->extract());
$h->stats("after extracting the rest");

and result is:

STATS: after inserting one-by-one... Sift-ups: 22651 Sift-downs: 0 Swaps: 12651
STATS: after extracting all roots (like in heapsort)... Sift-ups: 0 Sift-downs: 116737 Swaps: 116737
Now, heap is empty. Let's try whole array at once
STATS: after heapify... Sift-ups: 0 Sift-downs: 12396 Swaps: 7396
let's update two indexes
STATS: after update... Sift-ups: 11 Sift-downs: 1 Swaps: 10
STATS: after update... Sift-ups: 13 Sift-downs: 1 Swaps: 12
extract biggest three indexes
44444 - this should be 44444
40000 - this should be 40000
1000 - this should be biggest number given by rand(1,1000)
STATS: after three extracts... Sift-ups: 0 Sift-downs: 42 Swaps: 42
STATS: after extracting the rest... Sift-ups: 0 Sift-downs: 116652 Swaps: 116652

You will have to modify it a bit, but anyway, hope it helps..

enrey
  • 1,621
  • 1
  • 15
  • 29
  • oh, sorry, forgot to mention, !Check! it for bugs. I ran many tests with random data, but you can never be sure :) – enrey Dec 02 '12 at 23:02
  • Oups, there it is, I forgot one sift-up in delete(), crashed badly recently...... – enrey Jun 01 '13 at 11:00