3

In PHP, you can use array syntax to access string indexes. The following program

<?php
$foo = "Hello";
echo $foo[0],"\n";
?>

echoes out

H

However, if you assign to the first character of a zero-length string

<?php
$bar = "";
$bar[0] = "test";
var_dump($bar);
?>

PHP turns your string into an array. The above code produces

array(1) {
    [0] =>
    string(4) "test"
}

That is, my zero-length string was cast to an array. Similar "accessing an undefined index of a string" examples don't produce this casting behavior.

$bar = " ";
$bar[1] = "test";
var_dump($bar);

Produces the string " t" (a space followed by t), i.e. $bar remains a string and is not converted into an array.

I get that these sorts of unintuitive edge cases are inevitable when the language needs to infer and/or automatically cast variables for you, but does anyone know what's going on behind the scenes here?

That is, what is happening at the C/C++ level in PHP to make this happen? Why does my variable get turned into an array?

PHP 5.6, if that matters.

Alana Storm
  • No, you created an array by doing `$bar[0] = "test";` – RiggsFolly May 26 '16 at 20:55
  • @RiggsFolly if you use `$bar = " ";` (a space), `$bar[0] = "test";` will show `t`. Why is an array not created in this case? – u_mulder May 26 '16 at 20:57
  • @RiggsFolly No I didn't, I created a string when I said `$bar = ""`. PHP cast that string as an array when I tried to access an index in the string (`0`) that didn't exist. Try the script with `$bar = " "` (a single character string) – Alana Storm May 26 '16 at 20:58
  • What did you expect as the result of your assignment? You assigned a value to the 0th element of an array and got exactly that. – E_p May 26 '16 at 20:59
  • @E_p I would expect an error to be raised, or that PHP would infer I wanted to extend the string from zero length to one and assign a character to the first index of the string, similar to how `$bar = " ";$bar[3] = "test";var_dump($bar);` works. Re: that last one, I would expect my programming language to handle this weird cast consistently. It doesn't, and I get that happens, but I'm curious as to why. – Alana Storm May 26 '16 at 21:04
  • A string in PHP is an array – RiggsFolly May 26 '16 at 21:05
  • PHP has all kinds of things you would not expect compared to compiled languages like C and C++. I think in this case PHP checks whether the variable is empty, and if so the assignment constructs an array. But to confirm that I would need to check the PHP source code or find documentation about this specific issue – Romuald Villetet May 26 '16 at 21:08
  • @RomualdVilletet Or ask on the world's most popular programming Q&A site :) – Alana Storm May 26 '16 at 21:10
  • I'm not too great at C, but maybe [this](https://github.com/php/php-src/blob/91f5940329fede8a26b64e99d4d6d858fe8654cc/Zend/zend_execute.c#L1713) might have to do with it? – Don't Panic May 26 '16 at 21:12
  • @Don'tPanic that seems like the place to look for an answer. `$bar = 0.0; $bar[0] = "i"; var_dump($bar);` outputs a warning like `WARNING Cannot use a scalar value as an array on line number 4`, which is raised in the same function as the one you mentioned – Romuald Villetet May 26 '16 at 21:17
  • Even funnier: Start with a string variable containing one character, and then assign a multi-letter string to its zero-eth index: `$bar = "a"; $bar[0] = "test"; var_dump($bar);` - result: `string(1) "t"`. And of course, let's not forget, non-multibyte-safeness is around every corner in PHP: `$bar[0] = "Önly";` – CBroe May 26 '16 at 21:18
  • Additionally, this code: `$test = 'hello!'; $test[0] = 'world!'; var_dump($test);` yields `string 'wello!' (length=6)`. – mcon May 26 '16 at 21:18
  • There is no cast. Obviously, in the first example you are "accessing" a position of a defined string. In the second example you are "creating" an array element with index 0. Should `$s='hello'; $s=array(0=>'test');` trigger an error? Same thing. – AbraCadaver May 26 '16 at 21:19
  • @AbraCadaver You worded that a lot better than I did in my answer. – shamsup May 26 '16 at 21:39
  • @AbraCadaver I think we're at the point where comments have lost their effectiveness as a form of communication, but see "String Access and Modification by Character" in the manual. http://php.net/language.types.string. I (think I) understand what you're saying about the array syntax, but that doesn't explain the different behavior for the same scenario with non-zero length strings. – Alana Storm May 26 '16 at 22:23

4 Answers

12

At the C level, the variable is converted to an array when an assignment is made using the [] operator, provided that it is a string of length 0 and the operation is not an unset (e.g. unset($test[0])).

case IS_STRING: {
                zval tmp;

                if (type != BP_VAR_UNSET && Z_STRLEN_P(container)==0) {
                    goto convert_to_array;
                }

https://github.com/php/php-src/blob/PHP-5.6.0/Zend/zend_execute.c#L1156

The same conversion happens for the boolean value false.

case IS_BOOL:
            if (type != BP_VAR_UNSET && Z_LVAL_P(container)==0) {
                goto convert_to_array;
            }

Confirmed by using a test:

<?php
$bar = false;
$bar[0] = "test";
var_dump($bar);

Outputs:

array(1) { [0]=> string(4) "test" }

When using true:

<?php
$bar = true;
$bar[0] = "test";
var_dump($bar);

Outputs:

WARNING Cannot use a scalar value as an array on line number 3
bool(true)

https://github.com/php/php-src/blob/PHP-5.6.0/Zend/zend_execute.c#L1249

When the value is a bool and is true, the check fails and execution falls through to the default case:

case IS_BOOL:
            if (type != BP_VAR_UNSET && Z_LVAL_P(container)==0) {
                goto convert_to_array;
            }
            /* break missing intentionally */

        default:
            if (type == BP_VAR_UNSET) {
                zend_error(E_WARNING, "Cannot unset offset in a non-array variable");
                result->var.ptr_ptr = &EG(uninitialized_zval_ptr);
                PZVAL_LOCK(EG(uninitialized_zval_ptr));
            } else { // Gets here when boolean value equals true.
                zend_error(E_WARNING, "Cannot use a scalar value as an array");
                result->var.ptr_ptr = &EG(error_zval_ptr);
                PZVAL_LOCK(EG(error_zval_ptr));
            }
            break;

PHP 5.6 uses Zend Engine 2.6.0.
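To make the branch structure above easier to follow, here is a minimal toy model in C. It is only a sketch of the decision logic quoted from zend_execute.c, not the real Zend code: `toy_value`, `toy_action`, and `classify` are hypothetical names invented for illustration, and only the bool and string cases discussed in this answer are modelled.

```c
#include <string.h>

/* Hypothetical stand-ins for the Zend type tags; not the real zval layout. */
typedef enum { T_BOOL, T_STRING } toy_type;

/* The possible outcomes of the dispatch, as described in the answer. */
typedef enum {
    ACT_CONVERT_TO_ARRAY, /* the "goto convert_to_array" path */
    ACT_STRING_OFFSET,    /* the container stays a string */
    ACT_SCALAR_WARNING,   /* "Cannot use a scalar value as an array" */
    ACT_UNSET_WARNING     /* "Cannot unset offset in a non-array variable" */
} toy_action;

typedef struct {
    toy_type type;
    int bool_val;    /* used when type == T_BOOL */
    const char *str; /* used when type == T_STRING */
} toy_value;

/* Mirrors the quoted switch: an empty string or false becomes an array
 * unless the operation is an unset; true falls through to the default. */
toy_action classify(const toy_value *c, int is_unset)
{
    switch (c->type) {
    case T_STRING:
        if (!is_unset && strlen(c->str) == 0) {
            return ACT_CONVERT_TO_ARRAY;
        }
        /* Everything else goes down the string-offset path; what that
         * path does for an unset is not modelled here. */
        return ACT_STRING_OFFSET;
    case T_BOOL:
        if (!is_unset && c->bool_val == 0) {
            return ACT_CONVERT_TO_ARRAY;
        }
        /* falls through, like "break missing intentionally" above */
    default:
        return is_unset ? ACT_UNSET_WARNING : ACT_SCALAR_WARNING;
    }
}
```

In this model, an empty string or false with `is_unset = 0` yields `ACT_CONVERT_TO_ARRAY`, a one-character string yields `ACT_STRING_OFFSET`, and true yields `ACT_SCALAR_WARNING`, which matches the three test results shown above.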

Community
4

I suspect "" is being treated as unset and then being converted to an array. Generally, "" != null != unset; however, PHP is a little quirky when it comes to that.

php > $a="test"; $a[0] = "yourmom"; var_dump( $a );
string(4) "yest"

php > $a=""; $a[0] = "yourmom"; var_dump( $a );
array(1) {
  [0]=>
  string(7) "yourmom"
}

php > var_dump((bool) "" == null);
bool(true)

php > var_dump((bool) $f == null);
PHP Notice:  Undefined variable: f in php shell code on line 1
PHP Stack trace:
PHP   1. {main}() php shell code:0

Notice: Undefined variable: f in php shell code on line 1

Call Stack:
  470.6157     225848   1. {main}() php shell code:0

bool(true)
Alex Barker
  • See the last example I just updated. Also checking type and using it as a different type will undoubtedly produce different behavior. If you check the type of $f it will be null even though it's never been set. – Alex Barker May 26 '16 at 21:06
  • not sure what that last example shows -- plus you're using `==` not `===`, meaning you're adding the `==` type coercion behavior into the mix. – Alana Storm May 26 '16 at 21:07
  • Your whole question is based on type conversion ;) – Alex Barker May 26 '16 at 21:08
  • I realize that :) but this is the only case I know of PHP casting a variable in place for you without asking for an explicit cast. When you try to `==` an int and a string, PHP will do some converting behind the scenes, but the variables are left in place. – Alana Storm May 26 '16 at 21:20
  • PHP will convert any var being used, Ex: "1" + 2 === 3; – Alex Barker May 26 '16 at 21:30
  • Where do you get that `null != unset`? Undefined variables, when used, throw an error but do have a default value of `null`. On top of that, `isset` will actually return false if you have a defined variable with the value `null`, meaning that technically the variable "is not set". Also, `var_dump($test === null);` throws an error, but outputs true. – Jonathan Kuhn May 26 '16 at 21:34
  • I was making a general CS statement that unset is != NULL. When you allocate memory with malloc, that memory usually isn't initialized, hence the calloc implementation. Also it produces a notice, not an error. – Alex Barker May 26 '16 at 21:46
1

I tried to find where this would be happening in the PHP source. I have limited experience with PHP internals, and with C in general, so someone please correct me if I'm wrong.

I think this is happening in zend_fetch_dimension_address:

   if (EXPECTED(Z_TYPE_P(container) == IS_STRING)) {
        if (type != BP_VAR_UNSET && UNEXPECTED(Z_STRLEN_P(container) == 0)) {
            zval_ptr_dtor_nogc(container);
convert_to_array:
            ZVAL_NEW_ARR(container);
            zend_hash_init(Z_ARRVAL_P(container), 8, NULL, ZVAL_PTR_DTOR, 0);
            goto fetch_from_array;
        }

It looks like if the container is a zero length string, it converts it to an array before it does anything with it.

Don't Panic
0

The reason it is changed to an array when you use array syntax on an empty string is that index 0 is undefined and doesn't have a type at that point. Here is a kind of case study.

<?php
$foo = "Hello"; // $foo[0] is a string "H"
echo $foo[0],"\n"; // H
$foo[0] = "same?"; // $foo[0] is still a string, "s"; note that only the "s" is kept.
echo $foo,"\n"; // sello
echo $foo[0],"\n"; // s
$foo[1] = "b"; // $foo[1] is a string "b"
echo $foo,"\n"; // sbllo
$bar = ""; // nothing defined at position 0
$bar[0] = "t"; // array syntax creates an array with a string as the first index
var_dump($bar); // array(1) { [0] => string(1) "t" }
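The string cases in the case study above (only the first byte of the assigned value is kept, and the container stays a string) can be sketched with a small C model. This is an illustration under assumptions, not PHP's implementation: `toy_set_offset` is a hypothetical helper, it covers only non-empty containers and non-negative offsets, and the space padding for out-of-range offsets follows the behavior described in the PHP manual and in the comments on the question.

```c
#include <stdlib.h>
#include <string.h>

/* Models $str[offset] = value for a non-empty string container.
 * Returns a newly allocated string that the caller must free, or NULL
 * on allocation failure or when value is empty (not modelled here). */
char *toy_set_offset(const char *str, size_t offset, const char *value)
{
    size_t len = strlen(str);
    /* Writing past the end extends the string up to the offset. */
    size_t new_len = offset >= len ? offset + 1 : len;
    char *out;

    if (value[0] == '\0') {
        return NULL;
    }
    out = malloc(new_len + 1);
    if (out == NULL) {
        return NULL;
    }
    memset(out, ' ', new_len); /* pad any gap with spaces */
    memcpy(out, str, len);     /* keep the original bytes */
    out[offset] = value[0];    /* only the first byte is written */
    out[new_len] = '\0';
    return out;
}
```

With this model, writing "same?" at offset 0 of "Hello" gives "sello", and writing "b" at offset 2 of "a" gives "a b". The empty-string container is deliberately left out, since in PHP 5.6 that case is converted to an array instead.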
shamsup
  • Thank you for your answer, however *and doesn't have a type at that point* isn't correct. If you're referring to the string, it does have a type: `$bar = "";echo gettype($bar),"\n";` If you're referring to the array, you're right that the index is undefined, however `$bar = "a";$bar[2] = "b";var_dump($bar);` does **not** cast $bar as an array, and `2` is undefined. – Alana Storm May 26 '16 at 21:19
  • I am referring to the item at index 0 of the string, like I said: "because index 0 is undefined, and doesn't have a type at that point". In the example you just gave, `$bar = "a"` sets bar as a string, which is internally stored as an array of bytes ([ref](http://php.net/manual/en/language.types.string.php#language.types.string.substr)). Because of this, I would *assume* that until a variable has an actual string value (i.e. bytes assigned to indices), using array syntax will allow you to store items not limited to bytes, becoming a normal array. – shamsup May 26 '16 at 21:31