2

I found lots of discussions related to "shallow copy" in Python, but I cannot find my exact issue.

As I understand it, a shallow copy still contains references to the elements of the original list. This holds true in the following case of a two-dimensional list.

>>> x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> y = list(x)
>>> x.append(['New value'])
>>> x
[[1, 2, 3], [4, 5, 6], [7, 8, 9], ['New value']]
>>> y
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> x[0][0] = 'A'
>>> x
[['A', 2, 3], [4, 5, 6], [7, 8, 9], ['New value']]
>>> y
[['A', 2, 3], [4, 5, 6], [7, 8, 9]]

However, with a one-dimensional list I don't see this effect.

>>> a = [1,2,3,4,5]
>>> b = list(a)
>>> a.append(10)
>>> a
[1, 2, 3, 4, 5, 10]
>>> b
[1, 2, 3, 4, 5]
>>> a[0] = 'A'
>>> a
['A', 2, 3, 4, 5, 10]
>>> b
[1, 2, 3, 4, 5]

Can anyone please clarify what is behind this difference?

Prune
itssubas
  • @mVChr: No, that is absolutely not how Python works. Lists and ints are treated the same. – user2357112 Sep 19 '18 at 17:27
  • That's not even close to true. – mVChr Sep 19 '18 at 17:28
  • In the first case, you are mutating an object (the inner list) contained by your list. In the second case, you are mutating your list. So, try `x[0] = 'foo'` and you'll see the same behavior as your second case. – juanpa.arrivillaga Sep 19 '18 at 17:48
  • You may find this article helpful: [Facts and myths about Python names and values](http://nedbatchelder.com/text/names.html), which was written by SO veteran Ned Batchelder. – PM 2Ring Sep 19 '18 at 19:45
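
The suggestion in the comments can be tried directly. A minimal sketch, reusing the question's `x` and `y` (shortened here): replacing a slot of the outer list behaves like the one-dimensional case, while mutating an inner list shows through both names.

```python
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
y = list(x)          # shallow copy: new outer list, same inner lists

x[0] = 'foo'         # rebinds slot 0 of x only; y[0] is untouched
x[1][0] = 'A'        # mutates the inner list that both x[1] and y[1] refer to

print(x)  # ['foo', ['A', 5, 6], [7, 8, 9]]
print(y)  # [[1, 2, 3], ['A', 5, 6], [7, 8, 9]]
```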

3 Answers

2

A shallow copy creates a new outer list and fills it with references to the same objects the original list holds; it does not make new instances of the elements. The top-level slots are independent, so replacing an element in one list does not affect the other, but any mutable element, such as a nested list, is still the original object and is shared by both lists.

A deep copy recursively copies the elements at every level, so nothing mutable is shared with the original. One side effect is that this roughly doubles the storage occupied by that item (now two items).
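
As a rough illustration of the difference, here is a minimal sketch using the standard `copy` module. The names `shallow` and `deep` are only illustrative.

```python
import copy

a = [1, 2, ['a', 'b', 'c'], 4, 5]
shallow = copy.copy(a)      # for a plain list, equivalent to list(a)
deep = copy.deepcopy(a)     # recursively copies the nested list as well

a[2][1] = "Deeper"

print(shallow[2])  # ['a', 'Deeper', 'c']  (inner list is shared with a)
print(deep[2])     # ['a', 'b', 'c']       (inner list was copied)
```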

Here you can see the effect close-up. The shallow copy, b, has its own top-level slots; when we assign a new value to a[0], the slot in b still refers to the old one. And although the slot b[2] is distinct from the slot a[2], the references they hold are identical: they point to the same lower-level list. Thus, when we change a[2][1], that change is reflected in b[2][1].

>>> a = [1, 2, ['a', 'b', 'c'], 4, 5]
>>> b = list(a)
>>> a[0] = "new 1"
>>> a[2][1] = "Deeper"
>>> a
['new 1', 2, ['a', 'Deeper', 'c'], 4, 5]
>>> b
[1, 2, ['a', 'Deeper', 'c'], 4, 5]
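
One way to check which objects are shared is the `is` operator, which compares identity rather than equality. A small sketch along the same lines as the session above:

```python
a = [1, 2, ['a', 'b', 'c'], 4, 5]
b = list(a)

print(a is b)        # False: two distinct outer lists
print(a[2] is b[2])  # True: both slots refer to the same inner list

a[0] = "new 1"       # rebinding a slot in a leaves b alone
print(b[0])          # 1
print(a[2] is b[2])  # True: the inner list is still shared
```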
Prune
0

Think of it in terms of C pointers. Your list is a collection of addresses, each of which itself points to a list. You are only creating a new instance of the outer list, so if you modify the content at an address stored in your list, the change affects both lists. The three lists inside x and inside y have the same addresses, but x and y themselves have different addresses.

x is at 0x0123   # hexadecimal addresses, for illustration only
y is at 0x0456
x = [0x01, 0x02, 0x03, whatever_you_want]
y = [0x01, 0x02, 0x03]
the list at 0x01 is [1, 2, 3]

If, instead of y = list(x), you simply assign y = x, then x and y change at the same time, because both names point to the same address, as in x = y = 0x0123.
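
A minimal sketch of that difference between plain assignment and a shallow copy. The names `alias` and `copied` are only illustrative.

```python
x = [[1, 2, 3], [4, 5, 6]]
alias = x          # plain assignment: a second name for the same list
copied = list(x)   # shallow copy: a new outer list

x.append(['New value'])

print(alias is x)    # True
print(copied is x)   # False
print(len(alias))    # 3: the append is visible through the alias
print(len(copied))   # 2: the copy's outer list is unaffected
```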

I'm not totally sure about all of this, but it's how I understand it with my current knowledge of C.

rSim
-2

The lists in the list are references, so they share the same memory space even if you copy the outer list so that it doesn't. Notice that if you replace a list in the nested one, it will only change in one and not in the copy. Numbers aren't references, so when you copy a list they no longer share the same memory space.

mVChr
  • [If you examine the memory addresses with `id`](https://ideone.com/rOzdEj) (which, as a CPython implementation detail, returns the memory address of an object), you will find that the same address is given for ints in the original list or in a copy. – user2357112 Sep 19 '18 at 17:31
  • Numbers are references; *all objects are references, and everything is an object*. Python doesn't have "primitive types" that work differently than other objects. The semantics are always the same. – juanpa.arrivillaga Sep 19 '18 at 17:43
  • Oh my god you guys. This is what I mean: `a = 2; b = 2; id(a) >>> 40923456; id(b) >>> 40923456` - yes it's a reference but the reference is always the same. Every time you make a new list it's a new reference (i.e. try the above with `[]` instead of `2`), but if you have two variables pointing to the same list, then if you modify an element of one you modify the element of the other. – mVChr Sep 19 '18 at 17:52
  • No, the reference is *not* always the same. Python `int` objects work the exact same way in this context as list objects, except that `int` objects don't expose any way to mutate them. Your explanation is simply incorrect. You have simply discovered a CPython implementation detail: small ints are cached. But try with larger ones, like 1000. – juanpa.arrivillaga Sep 19 '18 at 17:55
  • [Here's an example with different addresses for equal ints](https://ideone.com/AL2mPc), just in case you try the 1000 example yourself and accidentally run into the *other* implementation detail (constant merging at bytecode compilation time) that would cause the experiment to give the same address. (For completeness, [here's](https://ideone.com/B4aFz1) the same example without avoiding constant merging.) – user2357112 Sep 19 '18 at 18:01
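
The point made in these comments can be checked without depending on small-int caching or constant merging. A minimal sketch: a shallow copy holds the very same int objects as the original, whatever their values. The copy only appears independent because ints cannot be mutated, only replaced.

```python
a = [1000, 2000, 3000]
b = list(a)

# The copy refers to the same int objects, not to new ones.
print(all(p is q for p, q in zip(a, b)))  # True

# Assignment replaces the reference held in a; the int object itself is unchanged.
a[0] = 'A'
print(b[0])  # 1000
```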