s.difference(t) returns a new set with no elements in t.

s.difference_update(t) returns an updated set with no elements in t.

What's the difference between these two set methods? Because the difference_update updates set s, what precautions should be taken to avoid receiving a result of None from this method?

In terms of speed, shouldn't set.difference_update be faster since you're only removing elements from set s instead of creating a new set like in set.difference()?

Salvador Dali
  • Regarding the terminology: `s.difference_update(t)` doesn't **return** an updated set. It updates the set. This is an instruction, not an expression. – Aristide Dec 12 '14 at 08:06
  • `difference_update` doesn't return anything, so it will always be `None` if you take the return value. If everything is removed you will have `s` as an empty `set`. – Peter Wood Dec 12 '14 at 08:06
  • @Aristide: regarding the terminology, `s.difference_update(t)` *does* return something - the `None` object -, and it *is* an expression. Python's instructions are `import`, `def`, `class`, `for`, `while`, `with`, `break`, `continue`, `return`, `yield`, `try`, `except`, `finally`, `raise` and the assignment operator `=` (I probably forgot a couple but well, you get the idea). – bruno desthuilliers Dec 12 '14 at 08:23
  • @brunodesthuilliers Yes, you're right. I should have used the term "statement" instead of "instruction" (in French, my native language, "instruction" is used with both meanings). I'm aware that Python's statements have a None value (instead of having no value). My point was: "In most languages, statements contrast with expressions in that statements do not return results and are executed solely for their side effects, while expressions always return a result and often do not have side effects at all." (Wikipedia). – Aristide Dec 12 '14 at 08:39
  • @Aristide: `s.difference_update(t)` is still an expression: something that has a value which can be bound to a name. You can write `result=s.difference_update(t)`, so it is an expression. And being French too, I used the term "instruction" as a synonym of "statement". I do understand the point you're trying to make, and from a semantic POV `set.difference_update` is indeed used for its side effect and not its return value, but technically it's still an expression. Python's statements don't "have a `None` value"; they don't have any value at all, since trying to use them as RHS is a syntax error. – bruno desthuilliers Dec 12 '14 at 09:57

2 Answers

Q. What's the difference between these two set methods?

A. The update version subtracts from an existing set, mutating it, and potentially leaving it smaller than it originally was. The non-update version produces a new set, leaving the originals unchanged.

Q. Because the difference_update updates set s, what precautions should be taken to avoid receiving a result of None from this method?

A. Mutating methods in Python generally return None as a way to indicate that they have mutated an object. The only "precaution" is to not assign the None result to a variable.
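
To make the pitfall concrete, here is a minimal sketch using the same example sets as elsewhere on this page. The names `result`, `u`, and `v` are made up for illustration.

```python
s = {1, 2, 3, 4, 5}
t = {3, 5}

# Pitfall: difference_update() mutates s in place and returns None,
# so binding its result to a name gives you None, not a set.
result = s.difference_update(t)
print(result)  # None
print(s)       # {1, 2, 4}

# If a new set is what you want, call difference() and keep its return value.
u = {1, 2, 3, 4, 5}
v = u.difference(t)
print(v)  # {1, 2, 4}
print(u)  # {1, 2, 3, 4, 5}
```

In short: call `difference_update` on its own line and keep using `s`; assign only the result of `difference`.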

Q. In terms of speed, shouldn't set.difference_update be faster since you're only removing elements from set s instead of creating a new set like in set.difference()?

A. Yes, the algorithm of the update version simply discards values.

In contrast, the algorithm for the non-updating version depends on the size of the sets.

If s is four or more times larger than t (that is, when s is much larger than t), the non-updating version first copies the main set and then discards values from it. So `s - t` is implemented as `n = s.copy(); n.difference_update(t)`.

Otherwise, the algorithm for the non-updating version is to create an empty new set n, loop over the elements of s and add them to n if they are not present in t.
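
The two strategies can be sketched in pure Python roughly as follows. This is an illustration of the description above, not the actual C implementation: the function name `difference_sketch` is hypothetical, `t` is assumed to be a set, and the size test is taken from the "four or more times" wording, so the exact condition in CPython may differ.

```python
def difference_sketch(s, t):
    """Illustrative pure-Python version of the non-updating difference."""
    # Assumed threshold, based on the "four or more times larger" description.
    if len(s) >= 4 * len(t):
        # s is much larger than t: copy s, then discard t's elements from the copy.
        n = s.copy()
        n.difference_update(t)
        return n
    # Otherwise: build a new set from the elements of s that are not in t.
    n = set()
    for elem in s:
        if elem not in t:
            n.add(elem)
    return n


small = {1, 2, 3, 4, 5}
print(difference_sketch(small, {3, 5}))  # {1, 2, 4}
print(small)                             # {1, 2, 3, 4, 5}
```

Either branch leaves `s` untouched and returns the same result; only the amount of work differs.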

Raymond Hettinger

difference_update updates the set in place rather than creating a new one.

>>> s={1,2,3,4,5}
>>> t={3,5}
>>> s.difference(t)
{1, 2, 4}
>>> s
{1, 2, 3, 4, 5}
>>> s.difference_update(t)
>>> s
{1, 2, 4}
ivan_pozdeev