-1

My code is below:

import re

old = """
B07K6VMVL5
B071XQ6H38
B0B7F6Q9BH
B082KTHRBT
B0B78CWZ91
B09T8TJ65B
B09K55Z433
"""
duplicate = """
B0B78CWZ91
B09T8TJ65B
B09K55Z433
"""
final = re.sub(r"\b{}\b".format(duplicate),"",old)
print(final)

`final` always prints the same value as `old`. I want the values listed in `duplicate` to be removed from `old`.

  • 2
    First of all, why not `old.replace(duplicate,'')`? Next, you need to `strip` the `duplicate` - `re.sub(r"\b{}\b".format(duplicate.strip()),"",old)`, or at least `rstrip` it as there is no word boundary between a newline and end of string. – Wiktor Stribiżew Oct 23 '22 at 14:14
  • To further spell out what @Wiktor is saying, the final `\b` does not match because there is no word boundary after the final newline. – tripleee Oct 23 '22 at 14:18
  • I have formatted the code as follows. `final = re.sub(r"{}".format(duplicate),"",old) print(final)` . Got the same old variable value. `old.replace(duplicate,'')` also prints old value only – Aravindh Arun Oct 23 '22 at 14:36
  • And now, do you have any issues? What are you actually trying to achieve? – Wiktor Stribiżew Oct 23 '22 at 14:37
  • Actually, I need to get the `duplicate` value as input from a user and check it against `old` (which is already-stored data) to remove the duplicates. – Aravindh Arun Oct 23 '22 at 14:40
  • So does the top comment solve the problem? – Wiktor Stribiżew Oct 23 '22 at 16:09
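
The point made in the comments can be checked with a small sketch. The short strings `AAA`, `BBB` and `CCC` are placeholders chosen for illustration, not the data from the question.

```python
import re

# Placeholder data: both strings start and end with a newline,
# just like the triple-quoted strings in the question.
old = "\nAAA\nBBB\nCCC\n"
duplicate = "\nBBB\nCCC\n"

# The trailing \b sits between the final "\n" and the end of the string.
# There is no word character on either side, so no word boundary exists
# there and the pattern never matches.
unstripped = re.sub(r"\b{}\b".format(duplicate), "", old)

# After strip(), the pattern begins and ends with a word character,
# so both \b assertions can succeed.
stripped = re.sub(r"\b{}\b".format(duplicate.strip()), "", old)

print(unstripped == old)  # True
print(repr(stripped))     # '\nAAA\n\n'
```

Note that the newline that followed the removed block stays in the result, so an empty line is left behind.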

2 Answers

0

The triple-quoted string should not start or end with a line break, since each one adds a \n character to the value. Try:

# The last three lines of old are the ones to remove.
old = """B07K6VMVL5
B071XQ6H38
B0B7F6Q9BH
B082KTHRBT
B0B78CWZ91
B09T8TJ65B
B09K55Z433"""

duplicate = """B0B78CWZ91
B09T8TJ65B
B09K55Z433"""

With these strings, the `re.sub` call from the question matches and the result is no longer equal to `old`.

Output

B07K6VMVL5
B071XQ6H38
B0B7F6Q9BH
B082KTHRBT
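
Put together, a minimal runnable sketch of this approach could look as follows. It reuses the `re.sub` call from the question unchanged and assumes the values contain only letters and digits, so no escaping is needed.

```python
import re

# Neither string starts or ends with a line break.
old = """B07K6VMVL5
B071XQ6H38
B0B7F6Q9BH
B082KTHRBT
B0B78CWZ91
B09T8TJ65B
B09K55Z433"""

duplicate = """B0B78CWZ91
B09T8TJ65B
B09K55Z433"""

# Same call as in the question; both \b assertions now sit next to a
# word character, so the block is found and removed.
final = re.sub(r"\b{}\b".format(duplicate), "", old)
print(final)
```

The newline that preceded the removed block is kept, so `final` ends with a line break.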

Alternatively, keep the line breaks in the source but escape them with a backslash so that they are not part of the string:

"""\
B0B78CWZ91
B09T8TJ65B
B09K55Z433\
"""
cards
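
As a quick check of the backslash form in the answer above, the sketch below compares it with the same text written without the extra line breaks. The variable names `plain` and `continued` are made up for this illustration.

```python
# Written without a line break after the opening or before the closing quotes.
plain = """B0B78CWZ91
B09T8TJ65B
B09K55Z433"""

# A backslash directly before a line break inside a (non-raw) string
# literal removes that line break from the resulting value.
continued = """\
B0B78CWZ91
B09T8TJ65B
B09K55Z433\
"""

print(continued == plain)  # True
print(repr(continued))     # 'B0B78CWZ91\nB09T8TJ65B\nB09K55Z433'
```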
0

It seems you can use

final = re.sub(r"(?!\B\w){}(?<!\w\B)".format(re.escape(duplicate.strip())),"",old)

Note several things here:

  • duplicate.strip() - leading or trailing whitespace can prevent a match, so strip() removes it from both ends of duplicate
  • re.escape(...) - any special regex characters in the input are escaped so that they are matched literally
  • (?!\B\w) and (?<!\w\B) are adaptive (dynamic) word boundaries. They require a word boundary only on a side where the pattern starts or ends with a word character.
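
The behaviour described in these notes can be tried out with a short sketch. The helper name `remove_block` and the sample inputs are hypothetical and only serve the illustration; the pattern is the one shown above.

```python
import re

def remove_block(text, block):
    """Hypothetical helper: delete `block` from `text` using adaptive word boundaries."""
    pattern = r"(?!\B\w){}(?<!\w\B)".format(re.escape(block.strip()))
    return re.sub(pattern, "", text)

# Input with leading/trailing newlines, as in the question (shortened).
old = "\nB07K6VMVL5\nB0B78CWZ91\nB09T8TJ65B\n"
duplicate = "\nB0B78CWZ91\nB09T8TJ65B\n"
cleaned = remove_block(old, duplicate)
print(repr(cleaned))  # '\nB07K6VMVL5\n\n'

# The block starts with a word character, so a boundary is required there:
# the ID glued to a preceding letter is left alone.
glued = remove_block("XB0B78CWZ91", "B0B78CWZ91")
print(repr(glued))  # 'XB0B78CWZ91'

# The block starts and ends with a non-word character, so no boundary is
# required; a plain \b on both sides would fail to match here.
symbols = remove_block("a (x) b", "(x)")
print(repr(symbols))  # 'a  b'
```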
Wiktor Stribiżew