
I'm writing a library in PHP 5.3, the bulk of which is a class with several static properties. Subclasses extend it so that child classes need zero configuration.

Anyway, here's a sample to illustrate the peculiarity I have found:

<?php

class A {
    protected static $a;
    public static function out() { var_dump(static::$a); }
    public static function setup($v) { static::$a =& $v; }
}
class B extends A {}
class C extends A {}

A::setup('A');
A::out(); // 'A'
B::out(); // null
C::out(); // null

B::setup('B');
A::out(); // 'A'
B::out(); // 'B'
C::out(); // null

C::setup('C');
A::out(); // 'A'
B::out(); // 'B'
C::out(); // 'C'

?>

Now, this is pretty much the desired behaviour for static inheritance as far as I'm concerned. However, if you change static::$a =& $v; to static::$a = $v; (no reference), you get the behaviour I expected, that is:

'A'
'A'
'A'

'B'
'B'
'B'

'C'
'C'
'C'

Can anyone explain why this is? I can't understand how references affect static inheritance in any way :/

Update:

Based on Artefacto's answer, adding the following method to the base class (in this instance, A) and calling it after the class declarations produces the behaviour labelled 'desired' above, without any need to assign by reference in setters. Properties accessed through self:: still show the 'expected' behaviour above.

/*...*/
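public static function break_static_references() {
    $self = new ReflectionClass(get_called_class());
    foreach ($self->getStaticProperties() as $var => $val) {
        static::$$var =& $val;
        // Detach $val so the next iteration does not write through
        // the reference just bound to the previous property.
        unset($val);
    }
}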
/*...*/
A::break_static_references();
B::break_static_references();
C::break_static_references();
/*...*/
connec
  • This is interesting; I have no idea. Unless someone comes up with an answer, you'll make me waste time investigating it :p – Artefacto Jul 06 '10 at 14:36
  • Probably has to do with 5.3's new late static binding – John Conde Jul 06 '10 at 14:48
  • Additionally, using get_called_class() instead of $v doesn't work as it can't be assigned by reference. However, using an intermediary variable for the reference works as above. – connec Jul 06 '10 at 14:50
  • @John Conde: The thing is the late static binding syntax (static::$a) is used for both scenarios, yet only the reference version produces what, considering the objectives of late static binding, I'd consider expected results. – connec Jul 06 '10 at 14:52
  • Out of curiosity, are B and C actual specializations of A or is A some sort of Basic God Object that all classes will inherit from, no matter if they are related or not. – Gordon Jul 06 '10 at 15:03
  • @Gordon: B and C are specializations of A. The actual scenario is an ActiveRecord-esque one, with A being the 'ActiveRecord' and B and C being specific models. – connec Jul 06 '10 at 15:09
  • OK, I know why this happens, now let me see if I can concoct a coherent answer :p – Artefacto Jul 06 '10 at 15:11
  • @Gordon I added a TL;DR version just for you :p BTW, you shouldn't have deleted your answer, your work-around was valuable. – Artefacto Jul 06 '10 at 15:37
  • @Artefacto `try { print 'Obrigado' } catch (LanguageException $e) { print 'Thanks '}` – Gordon Jul 06 '10 at 15:43
  • @Gordon No exception thrown here :p – Artefacto Jul 06 '10 at 15:46

1 Answer

TL;DR version

The static property $a is a different symbol in each one of the classes, but it's actually the same variable in the sense that in $a = 1; $b = &$a;, $a and $b are the same variable (i.e., they're on the same reference set). When making a simple assignment ($b = $v;), the value of both symbols will change; when making an assignment by reference ($b = &$v;), only $b will be affected.

Original version

First thing, let's understand how static properties are 'inherited'. zend_do_inheritance iterates the superclass static properties calling inherit_static_prop:

zend_hash_apply_with_arguments(&parent_ce->default_static_members TSRMLS_CC,
    (apply_func_args_t)inherit_static_prop, 1, &ce->default_static_members);

The definition of which is:

static int inherit_static_prop(zval **p TSRMLS_DC, int num_args,
    va_list args, const zend_hash_key *key)
{
    HashTable *target = va_arg(args, HashTable*);

    if (!zend_hash_quick_exists(target, key->arKey, key->nKeyLength, key->h)) {
        SEPARATE_ZVAL_TO_MAKE_IS_REF(p);
        if (zend_hash_quick_add(target, key->arKey, key->nKeyLength, key->h, p,
                sizeof(zval*), NULL) == SUCCESS) {
            Z_ADDREF_PP(p);
        }
    }
    return ZEND_HASH_APPLY_KEEP;
}

Let's translate this. PHP uses copy on write, which means it will try to share the same actual memory representation (zval) of the values if they have the same content. inherit_static_prop is called for each one of the superclass static properties so that it can be copied to the subclass. The implementation of inherit_static_prop ensures that the static properties of the subclass will be PHP references, whether or not the zval of the parent is shared (in particular, if the superclass has a reference, the child will share the zval; if it doesn't, the zval will be copied and the new zval will be made a reference; the second case doesn't really interest us here).

So basically, when A, B and C are formed, $a will be a different symbol for each of those classes (i.e., each class has its properties hash table and each hash table has its own entry for $a), BUT the underlying zval will be the same AND it will be a reference.

You have something like:

A::$a -> zval_1 (ref, reference count 3);
B::$a -> zval_1 (ref, reference count 3);
C::$a -> zval_1 (ref, reference count 3);

Therefore, when you do a normal assignment

static::$a = $v;

since all three variables share the same zval and it's a reference, all three variables will assume the value of $v. It would be the same if you did:

$a = 1;
$b = &$a;
$a = 2; //both $a and $b are now 2

On the other hand, when you do

static::$a =& $v;

you will be breaking the reference set. Let's say you do it in class A. You now have:

//reference count is 2 and ref flag is set, but as soon as
//$v goes out of scope, reference count will be 1 and
//the reference flag will be cleared
A::$a -> zval_2 (ref, reference count 2);

B::$a -> zval_1 (ref, reference count 2);
C::$a -> zval_1 (ref, reference count 2);

The analogous case would be

$a = 1;
$b = &$a;
$v = 3;
$b = &$v; //$a is 1, $b is 3
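Both cases can be put side by side in a small user-land model. This is only a sketch: the array $props and its keys are hypothetical stand-ins for the three properties' hash table entries, not how the engine actually stores them. It uses nothing but ordinary reference semantics.

```php
<?php
// Hypothetical model: three array slots stand in for A::$a, B::$a and C::$a.
// After these two lines all three slots belong to one reference set,
// much like the shared zval_1 described above.
$props = array('A' => null);
$props['B'] = &$props['A'];
$props['C'] = &$props['A'];

// Plain assignment: the write goes to the shared value,
// so every slot now holds 'B'.
$props['B'] = 'B';

// Assignment by reference: only slot A is rebound to $v;
// slots B and C keep sharing the old value.
$v = 'A';
$props['A'] = &$v;

var_dump($props['A'], $props['B'], $props['C']);
// string(1) "A", string(1) "B", string(1) "B"
```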

Work-around

As featured in Gordon's now deleted answer, the reference set between the properties of the three classes can also be broken by redeclaring the property in each one of the classes:

class B extends A { protected static $a; }
class C extends A { protected static $a; }

This is because the property will not be copied to the subclass from the superclass if it's redeclared (see the condition if (!zend_hash_quick_exists(target, key->arKey, key->nKeyLength, key->h)) in inherit_static_prop).
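For completeness, here is a minimal runnable sketch of that work-around, reusing the classes from the question. The get() accessor is a hypothetical helper added only so the values can be inspected; setup() uses a plain assignment, not an assignment by reference.

```php
<?php
class A {
    protected static $a;
    // Hypothetical accessor, added for illustration only.
    public static function get() { return static::$a; }
    // Plain assignment: no reference is involved.
    public static function setup($v) { static::$a = $v; }
}
// Redeclaring $a gives each subclass its own storage for the property.
class B extends A { protected static $a; }
class C extends A { protected static $a; }

A::setup('A');
B::setup('B');

var_dump(A::get(), B::get(), C::get());
// string(1) "A", string(1) "B", NULL
```

C::get() still returns null because nothing has been assigned to C's own copy of the property.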

Artefacto
  • Wow, thanks for the precise explanation :) Nice to know how it works! – connec Jul 06 '10 at 15:30
  • @Minty looks like you're not the only one who thinks it's odd behavior: http://bugs.php.net/bug.php?id=51720 – Gordon Jul 06 '10 at 15:51
  • it's not deleted. It's hidden and only accessible for club members with 10k reputation :D – Gordon Jul 06 '10 at 16:05
  • Artefacto, thank you for the great explanation. Coming from C#, this behavior really threw me for a loop. – zomf May 29 '12 at 02:50