Case insensitive replace

Question

What's the easiest way to do a case-insensitive string replacement in Python?

score 278 · Accepted Answer · edited Nov 25 '20 at 19:03

The string type doesn't support this. You're probably best off using the regular expression sub method with the re.IGNORECASE option.

>>> import re
>>> insensitive_hippo = re.compile(re.escape('hippo'), re.IGNORECASE)
>>> insensitive_hippo.sub('giraffe', 'I want a hIPpo for my birthday')
'I want a giraffe for my birthday'

edited Nov 25 '20 at 19:03

Brian Moeskau

answered May 28 '09 at 03:39

Blair Conrad

If you're only doing a single replace, or want to save lines of code, it's more efficient to use a single substitution with re.sub and the (?i) flag: re.sub('(?i)' + re.escape('hippo'), 'giraffe', 'I want a hIPpo for my birthday') – Chiara Coetzee Nov 24 '11 at 01:04
Why **re.escape** for a string of letters only? Thanks. – Eleno Nov 09 '14 at 18:34
@Elena, it's not needed for `'hippo'`, but would be useful if the to-replace value was passed into a function, so it's really more of a good example than anything else. – Blair Conrad Nov 09 '14 at 23:19
Besides having to `re.escape` your needle, there's another trap here which this answer fails to avoid, noted in http://stackoverflow.com/a/15831118/1709587: since `re.sub` processes escape sequences, as noted in https://docs.python.org/library/re.html#re.sub, you need to either escape all backslashes in your replacement string or use a lambda. – Mark Amery Jun 26 '16 at 16:27
This doesn't work for replacing `r'A\BC'` with `r'D\EF'` in `r'xxxA\BCxxxA\BCxxx'`. The correct answer is below, the one from johv. – stenci Feb 24 '21 at 16:01
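
The two workarounds mentioned in the comments above (pass a function, or escape the backslashes in the replacement) can be sketched as follows. The helper names `ireplace_literal` and `ireplace_escaped` are made up for this illustration and do not come from any answer; treat this as a minimal sketch rather than a definitive implementation.

```python
import re


def ireplace_literal(old, new, text):
    """Hypothetical helper: a function replacement is inserted verbatim."""
    pattern = re.compile(re.escape(old), re.IGNORECASE)
    # re.sub does not parse the return value of a callable for escapes.
    return pattern.sub(lambda match: new, text)


def ireplace_escaped(old, new, text):
    """Hypothetical helper: same result, by doubling backslashes in `new`."""
    pattern = re.compile(re.escape(old), re.IGNORECASE)
    # Each backslash is doubled so the replacement template yields one backslash.
    return pattern.sub(new.replace("\\", "\\\\"), text)


print(ireplace_literal(r"A\BC", r"D\EF", r"xxxA\BCxxxA\BCxxx"))  # xxxD\EFxxxD\EFxxx
print(ireplace_escaped(r"A\BC", r"D\EF", r"xxxA\BCxxxA\BCxxx"))  # xxxD\EFxxxD\EFxxx
```

Both variants treat `old` and `new` as plain text; the function form is usually the easier one to read.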

score 109 · Answer 2 · answered May 28 '09 at 03:41

import re
pattern = re.compile("hello", re.IGNORECASE)
pattern.sub("bye", "hello HeLLo HELLO")
# 'bye bye bye'

answered May 28 '09 at 03:41

Unknown

Or one-liner: `re.sub('hello', 'bye', 'hello HeLLo HELLO', flags=re.IGNORECASE)` – Louis Yang Dec 14 '18 at 22:02
Note that `re.sub` only supports this flag since Python 2.7. – fuenfundachtzig Jan 18 '19 at 12:48
Sometimes it returns random strings. – urek mazino Jul 04 '23 at 20:25

score 61 · Answer 3 · edited Jul 02 '18 at 20:35

In a single line:

import re
re.sub("(?i)hello","bye", "hello HeLLo HELLO") #'bye bye bye'
re.sub(r"(?i)he\.llo", "bye", "he.llo He.LLo HE.LLO") #'bye bye bye'

Or, use the optional "flags" argument:

import re
re.sub("hello", "bye", "hello HeLLo HELLO", flags=re.I) #'bye bye bye'
re.sub(r"he\.llo", "bye", "he.llo He.LLo HE.LLO", flags=re.I) #'bye bye bye'

edited Jul 02 '18 at 20:35

Bill the Lizard

answered Mar 14 '12 at 20:14

viebel

score 23 · Answer 4 · edited Jun 03 '17 at 12:39

Continuing on bFloch's answer, this function will change not one, but all occurrences of old with new - in a case insensitive fashion.

def ireplace(old, new, text):
    idx = 0
    while idx < len(text):
        index_l = text.lower().find(old.lower(), idx)
        if index_l == -1:
            return text
        text = text[:index_l] + new + text[index_l + len(old):]
        idx = index_l + len(new) 
    return text

edited Jun 03 '17 at 12:39

Neeraj

answered Jan 23 '11 at 11:46

rsmoorthy

Very well done. Much better than regex; it handles all kinds of characters, whereas the regex is very fussy about anything non-alphanumeric. Preferred answer IMHO. – fyngyrz Jan 09 '17 at 19:01
All you have to do is escape the regex: the accepted answer is much shorter and easier to read than this. – Mad Physicist Oct 23 '17 at 19:45
Escape only works for matching, backslashes in the destination can mess things up still. – ideasman42 Oct 09 '19 at 03:36
Possibly the fastest method for a case-insensitive replace, tested against both using an arrayed string and using regex. – Eugene Jul 01 '21 at 00:56

score 10 · Answer 5 · edited Jun 26 '16 at 16:20

Like Blair Conrad says, string.replace doesn't support this.

Use the regex re.sub, but remember to escape the string being searched for first (re.escape). Note that re.sub has no flags option in Python 2.6, so you'll have to use the embedded modifier '(?i)' (or a compiled RE object, see Blair Conrad's answer). Another pitfall is that sub processes backslash escapes in the replacement text if a string is given. To avoid this, pass in a lambda instead.

Here's a function:

import re
def ireplace(old, repl, text):
    return re.sub('(?i)'+re.escape(old), lambda m: repl, text)

>>> ireplace('hippo?', 'giraffe!?', 'You want a hiPPO?')
'You want a giraffe!?'
>>> ireplace(r'[binfolder]', r'C:\Temp\bin', r'[BinFolder]\test.exe')
'C:\\Temp\\bin\\test.exe'

score 7 · Answer 6 · answered Apr 16 '19 at 20:17

This function uses both the str.replace() and re.findall() functions. It replaces all occurrences of pattern in string with repl in a case-insensitive way.

import re

def replace_all(pattern, repl, string) -> str:
    # Collect every spelling that matches (e.g. 'hIPpo', 'HIPPO'), then replace each one literally.
    occurrences = re.findall(pattern, string, re.IGNORECASE)
    for occurrence in occurrences:
        string = string.replace(occurrence, repl)
    return string  # returned after the loop, so every occurrence is handled

score 6 · Answer 7 · answered Jan 21 '11 at 14:09

This doesn't require regular expressions:

def ireplace(old, new, text):
    """ 
    Replace case insensitive
    Raises ValueError if string not found
    """
    index_l = text.lower().index(old.lower())
    return text[:index_l] + new + text[index_l + len(old):]

answered Jan 21 '11 at 14:09

bFloch

Good one, however this does not change all occurrences of old with new, but only the first occurrence. – rsmoorthy Jan 23 '11 at 11:01
It's less readable than the regex version. No need to reinvent the wheel here. – Johannes Bittner Jan 30 '11 at 22:04
It would be interesting to do a performance comparison between this and the upvoted versions, it might be faster, which matters for some applications. Or it might be slower because it does more work in interpreted Python. – Chiara Coetzee Nov 24 '11 at 00:59
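
A rough way to run the comparison suggested in the last comment is `timeit`. This is only a sketch: the function names and the sample text are made up for illustration, the loop variant assumes `old` is non-empty, and the numbers will vary by machine and input, so no timings are claimed here.

```python
import re
import timeit


def ireplace_find(old, new, text):
    """Loop-based variant in the spirit of the str.find answers (old must be non-empty)."""
    lowered_old = old.lower()
    idx = 0
    while True:
        found = text.lower().find(lowered_old, idx)
        if found == -1:
            return text
        text = text[:found] + new + text[found + len(old):]
        idx = found + len(new)  # continue searching after the inserted text


def ireplace_regex(old, new, text):
    """Regex variant: escaped pattern, function replacement, IGNORECASE flag."""
    return re.sub(re.escape(old), lambda match: new, text, flags=re.IGNORECASE)


sample = "I want a hIPpo for my birthday. " * 200

# Both approaches should agree before their speed is compared.
assert ireplace_find("hippo", "giraffe", sample) == ireplace_regex("hippo", "giraffe", sample)

for func in (ireplace_find, ireplace_regex):
    seconds = timeit.timeit(lambda: func("hippo", "giraffe", sample), number=20)
    print(f"{func.__name__}: {seconds:.4f}s for 20 runs")
```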

score 5 · Answer 8 · edited Feb 13 '23 at 17:29

An interesting observation about syntax details and options:

# Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)] on win32
>>> import re
>>> old = "TREEROOT treeroot TREerOot"

>>> re.sub(r'(?i)treeroot', 'grassroot', old)
'grassroot grassroot grassroot'

>>> re.sub(r'treeroot', 'grassroot', old)
'TREEROOT grassroot TREerOot'

>>> re.sub(r'treeroot', 'grassroot', old, flags=re.I)
'grassroot grassroot grassroot'

>>> re.sub(r'treeroot', 'grassroot', old, re.I)
'TREEROOT grassroot TREerOot'

Using the (?i) prefix in the match expression or passing flags=re.I as a keyword argument results in a case-insensitive match. Passing just re.I as the fourth positional argument does not, because the fourth positional parameter of re.sub is count, not flags.

For comparison:

>>> re.findall(r'treeroot', old, re.I)
['TREEROOT', 'treeroot', 'TREerOot']

>>> re.findall(r'treeroot', old)
['treeroot']

This does not provide an answer to the question. Please [edit] your answer to ensure that it improves upon other answers already present in this question. – hongsy, Jan 20 '20 at 05:10
From the [re.sub docs](https://docs.python.org/3/library/re.html#re.sub), it takes 5 parameters: `re.sub(pattern, repl, string, count=0, flags=0)`, which is why `flags=re.I` works but passing it as a positional parameter fails: it lands in the wrong position. – DavidP, Apr 28 '21 at 16:24
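
To make the point in the comment above concrete: `re.I` has the integer value 2, so a positional `re.I` is read as `count=2`. The sketch below spells that call out with an explicit `count=` keyword; the longer sample string is made up so that the cap of two replacements becomes visible.

```python
import re

text = "TREEROOT treeroot TREerOot treeroot treeroot"

# re.I is an integer flag whose value is 2.
print(int(re.I))  # 2

# re.sub(pattern, repl, text, re.I) binds re.I to count, so it behaves like
# this explicit call: case-sensitive matching, at most two replacements.
as_count = re.sub(r"treeroot", "grassroot", text, count=int(re.I))
print(as_count)  # TREEROOT grassroot TREerOot grassroot treeroot

# Passing the flag by keyword gives the intended case-insensitive replace.
as_flags = re.sub(r"treeroot", "grassroot", text, flags=re.I)
print(as_flags)  # grassroot grassroot grassroot grassroot grassroot
```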

score 1 · Answer 9 · edited Nov 15 '21 at 03:51

I was getting \t converted to an escape sequence (scroll down a bit), and found that re.sub converts backslash-escaped characters in the replacement string into escape sequences.

To prevent that I wrote the following:

Replace case insensitive.

import re

def ireplace(findtxt, replacetxt, data):
    # findtxt is compiled as a regex; data is split on each match and rejoined with replacetxt as-is.
    return replacetxt.join(re.compile(findtxt, flags=re.I).split(data))

Also, if you do want the escape sequences to be interpreted, as happens in the other answers here where backslash sequences with a special meaning get converted, just decode your find and/or replace string first. In Python 3, str has no decode method, so you might have to do something like .decode("unicode_escape") on a bytes object instead.

findtxt = findtxt.decode('string_escape') # python2
replacetxt = replacetxt.decode('string_escape') # python2
data = ireplace(findtxt, replacetxt, data)

Tested in Python 2.7.8
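
For Python 3, the same split-and-join idea might look like the sketch below. Two things here are assumptions that the answer does not make: `re.escape` is added so that `findtxt` is matched literally, and the hypothetical `unescape` helper stands in for `decode('string_escape')` and only handles ASCII text.

```python
import re


def ireplace(findtxt, replacetxt, data):
    """Case-insensitive literal replace via split/join (sketch)."""
    # Splitting and joining never parses replacetxt, so backslashes survive.
    pieces = re.split(re.escape(findtxt), data, flags=re.IGNORECASE)
    return replacetxt.join(pieces)


def unescape(text):
    """Hypothetical helper: interpret backslash escapes; assumes ASCII-only text."""
    return text.encode("ascii").decode("unicode_escape")


print(ireplace("hippo", r"C:\temp", "I want a hIPpo"))  # I want a C:\temp
print(repr(unescape(r"a\tb")))  # 'a\tb'
```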

score 1 · Answer 10 · edited Nov 15 '21 at 03:51

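i = 'I want a hIPpo for my birthday'
key = 'hippo'
swp = 'giraffe'

# Split the lowercased text on the lowercased key, then recover each piece's
# original casing by slicing the same positions out of the original string.
o = i.lower().split(key.lower())
c = 0  # index of the current piece
p = 0  # start position of the current piece in i
for w in o:
    o[c] = i[p:p + len(w)]
    p = p + len(key + w)
    c += 1
print(swp.join(o))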

edited Nov 15 '21 at 03:51

Nimantha

answered Feb 16 '12 at 13:59

anddan

For learning: generally when you do a search and replace on a string, it's better to not have to turn it into an array first. That's why the first answer is probably the best. While it's using an external module, it's treating the string as one whole string. It's also a bit clearer what's happening in the process. – isaaclw Apr 10 '12 at 14:43
For learning: it's very difficult for a developer with no context to read this code and decipher what it's doing :) – Todd Jan 29 '19 at 23:05
Any code that has counter++ is bad in general. – LazerDance Aug 03 '22 at 09:39
