I have a string like this:

mystring="Clusterd695c_ROUGE1.csv"

When I call mystring.rstrip("_ROUGE1.csv"), I expect to get "Clusterd695c", but I get "Clusterd695" instead, as if the last character "c" had not been seen. This only happens with the character "c"; other characters work correctly. For example:

mystring="Clusterd695f_ROUGE1.csv"
mystring.rstrip("_ROUGE1.csv")

Then I get "Clusterd695f", as expected.

How can I correct this?

Mahsa

1 Answer

This is a bit surprising to me, but the issue is that rstrip, when given a string argument, treats that string as a set of characters and removes characters from the end of the target string until it reaches one that does not belong to that set. Because 'c' is in that set (it appears in "csv"), the 'c' before the underscore is removed too, and stripping stops at '5' because '5' does not occur in "_ROUGE1.csv".
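
To make the set behaviour concrete, here is a small sketch. The helper `rstrip_chars` is a hypothetical name used only for illustration; it mimics the semantics described above and is not how CPython actually implements `str.rstrip`.

```python
def rstrip_chars(s, chars):
    """Illustrative stand-in for s.rstrip(chars); not the real implementation."""
    allowed = set(chars)  # the argument is used as a set of characters
    end = len(s)
    # Walk backwards while the trailing character is in the set.
    while end > 0 and s[end - 1] in allowed:
        end -= 1
    return s[:end]


with_c = "Clusterd695c_ROUGE1.csv"
with_f = "Clusterd695f_ROUGE1.csv"

# 'c' is in the set, so it is stripped along with the suffix.
print(with_c.rstrip("_ROUGE1.csv"))   # Clusterd695
# 'f' is not in the set, so stripping stops there.
print(with_f.rstrip("_ROUGE1.csv"))   # Clusterd695f
```

The order of the characters in the argument does not matter; only membership does.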

One way to deal with this is to use replace: mystring.replace("_ROUGE1.csv", ""). Another option is to simply chop off the last len("_ROUGE1.csv") characters, provided you know the string actually ends with that suffix. One caveat with the replace approach is that it removes every occurrence of that substring, not just the one at the end, so "_ROUGE1.csv_ROUGE1.csv".replace("_ROUGE1.csv", "") == "".
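
A minimal sketch of the "chop off the suffix" option, guarded so that it only slices when the string really ends with the suffix. The helper name `remove_suffix` is an assumption made for this example; on Python 3.9 and later the built-in `str.removesuffix` does the same job.

```python
def remove_suffix(s, suffix):
    """Return s without a trailing suffix; leave s unchanged if it is absent."""
    # The `suffix and` guard avoids s[:-0], which would return an empty string.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


mystring = "Clusterd695c_ROUGE1.csv"
print(remove_suffix(mystring, "_ROUGE1.csv"))  # Clusterd695c

# Unlike replace, only the occurrence at the end is removed.
print(remove_suffix("_ROUGE1.csv_ROUGE1.csv", "_ROUGE1.csv"))  # _ROUGE1.csv
```

Compared with replace, this leaves any earlier occurrences of the substring untouched.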

erip
  • The `strip()` functions have always worked that way (i.e., treating the argument as a character set), so I don't understand the surprise. In the given use case it's simply the wrong function to call. – guidot Aug 12 '20 at 12:06
  • Yes, `mystring.replace("_ROUGE1.csv","")` solved the problem. – Mahsa Aug 12 '20 at 12:06
  • @guidot because, to my knowledge, it doesn't work like this in any other language. :-) Automatically coercing to a (logical) set is definitely not explicit (per the Zen of Python). The `help` entry for `str.strip` doesn't explicitly mention this behavior, either (at least in 3.6.10). Therefore I find it a bit surprising. – erip Aug 12 '20 at 15:04
  • @erip: Given that there is no character-set type, I find the description *If chars is given and not None, remove characters in chars instead* clear enough. The [help file](https://docs.python.org/3/library/stdtypes.html?highlight=strip#str.strip) is a bit better, however: *The chars argument is a string specifying the set of characters to be removed*, and it has an example very similar to the one in the question. – guidot Aug 12 '20 at 15:22
  • There is a `set` type, but we digress. :-) – erip Aug 12 '20 at 15:37