I am new to Python and wrote the following code, which is supposed to find a specific string and replace it with another specific string. The string is: sid=\"1722407313768658\"

I used this regex: sid=(.+?) but it also matches irrelevant strings, such as the one in this URL: https://tmobile.demdex.net/dest5.html?d_nsid=0#

Also, when I run this regex on sid=\"1722407313768658\" (replacing it with 1900117189066752), I get the following result, which does not replace the string but adds to it: sid=\1900117189066752\ "1722407313768658\"

(Instead of 1722407313768658 I want to have 1900117189066752.)

This is my Python code:

import re
content = c.read()

################################################################

# change sessionid in content
replace_small_sid = str('sid=\\' + "\\"+str(sid) + "\\" + " ")
content = re.sub("sid=(.+?)", replace_small_sid, content)
tupac shakur
  • If you are searching for a _specific_ string, you don't need regex. Just use `str.replace`. Regex is designed for searching for _patterns_, i.e. strings with a special structure, like "first two chars are digits, then 3 to 5 capital letters, ..." or something like that – SpghttCd Sep 04 '18 at 07:15
  • I'd advise you to find an online regex tester; that way you can easily test how your regex behaves. – Dominique Sep 04 '18 at 07:16
  • See [how do i replace a query with a new value in urlparse?](https://stackoverflow.com/questions/26221669/how-do-i-replace-a-query-with-a-new-value-in-urlparse) – Wiktor Stribiżew Sep 04 '18 at 07:16
  • @SpghttCd I don't know what will be the value after sid= it could be any number so I have to use regex. – tupac shakur Sep 04 '18 at 07:17
  • Ok, but `.` stands for any character, not only digits. Aren't you just searching for a string containing nothing but digits? – SpghttCd Sep 04 '18 at 07:19
  • What about `re.sub('sid=\"\d+\"', replace_small_sid, content)`? – SpghttCd Sep 04 '18 at 07:33
  • @SpghttCd - no, it does not work. I tried this: replace_small_sid = str('sid=\\"' + str(sid) + "\\\"") followed by re.sub('sid=\\\"(.+?)', replace_small_sid, content), which produces the following result: sid=\"1900174890156032\"1722407313768658\" It did not change the string 1722407313768658 but added my replacement. – tupac shakur Sep 04 '18 at 08:00
  • I used this regex: 'sid=\\\"(.+?)\\"', which matched the number, but it did not replace it with what I want (the string 1900117189066752): content = re.sub('sid=\\\"(.+?)\\"', replace_small_sid, content) – tupac shakur Sep 04 '18 at 08:13

2 Answers

Since you want to replace a specific string, you can do it with the following (`str.replace` returns a new string, so assign the result):

content = content.replace("1722407313768658", "1900117189066752")
Gautam Kumar
  • My first thought, too, but OP already made clear that they in fact want the general substitution, i.e. it's really a regex case. – SpghttCd Sep 04 '18 at 07:23
  • @Gautam Kumar - I want to be able to match any string containing digits and replace it with another string containing digits. – tupac shakur Sep 04 '18 at 07:24

As I understand it, you wish to match string patterns of the form:

sid=\"1722407313768658\"

With the aim of replacing the digits.

To achieve this we can use positive lookbehinds and lookaheads as described here: https://www.regular-expressions.info/lookaround.html

Lookahead and lookbehind, collectively called "lookaround", are zero-length assertions just like the start and end of line, and start and end of word anchors explained earlier in this tutorial. The difference is that lookaround actually matches characters, but then gives up the match, returning only the result: match or no match. That is why they are called "assertions". They do not consume characters in the string, but only assert whether a match is possible or not.

In this case our lookbehind will match

sid=\"

Our lookahead will match

\"

Please see the example here: https://regex101.com/r/2pXcMI/2
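
To see the zero-length behaviour concretely, here is a minimal sketch. The sample string `text` and the variable names are made up for illustration. The lookbehind and lookahead must both be satisfied, but only the digits end up in the match itself.

```python
import re

# Hypothetical sample input; in a normal Python literal, \" is just a double quote.
text = "a sid=\"65641\" b"

# The lookbehind requires sid=" before the digits,
# the lookahead requires " directly after them.
pattern = r'(?<=sid=")\d+(?=")'

match = re.search(pattern, text)
matched_text = match.group()  # only the digits are part of the match
matched_span = match.span()   # start and end index of the digits in text
print(matched_text, matched_span)
```

Because the surrounding `sid="` and `"` are asserted rather than consumed, a substitution with this pattern touches the digits only.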

Finally, we can use this to match and replace as follows:

import re
line = "sid=\"1722407313768658\" safklabsf ipashf oiasfoi asbg fasnk sid=\"65641\" asjobfaosb asbfaosb asf asfauv sid=\"651564165\"."
replace_with = '1900117189066752'
# Raw string, so \d reaches the regex engine without an invalid-escape warning
line = re.sub(r'(?<=sid=")\d+(?=")', replace_with, line)
line

This returns

'sid="1900117189066752" safklabsf ipashf oiasfoi asbg fasnk sid="1900117189066752" asjobfaosb asbfaosb asf asfauv sid="1900117189066752".'
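
The symptoms described in the question can be reproduced with a short sketch. It assumes, as the question's sample `sid=\"1722407313768658\"` suggests but does not confirm, that the backslashes are literally present in the content (for example in JSON-escaped text). The names `content`, `new_sid` and `swap_digits` are illustrative only.

```python
import re

# Assumed input: the backslashes are real characters in the content,
# followed by the URL from the question that must stay untouched.
content = r'sid=\"1722407313768658\" https://tmobile.demdex.net/dest5.html?d_nsid=0#'
new_sid = "1900117189066752"

# A lazy quantifier at the very end of a pattern matches as little as
# possible, so sid=(.+?) consumes exactly one character after "sid=".
# It also matches inside "d_nsid=", since nothing anchors the start.
lazy_matches = re.findall(r'sid=(.+?)', content)

# \b keeps "d_nsid=" from matching; \\? allows an optional literal
# backslash in front of each double quote.
pattern = re.compile(r'(\bsid=\\?")\d+(\\?")')

def swap_digits(m):
    # A function replacement inserts its return value verbatim, so the
    # backslashes are not interpreted as replacement escapes.
    return m.group(1) + new_sid + m.group(2)

result = pattern.sub(swap_digits, content)
print(lazy_matches)
print(result)
```

With the one-character match, `re.sub` replaces only `sid=` plus the next character and leaves the old digits in place, which would explain why the replacement appeared to be added instead of substituted. If the content contains plain quotes without backslashes, the same pattern still applies because the backslash is optional.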

Bav Malhi
  • It's a bit strange, but when I try to replace the string it does not let me. Although by testing the regex I saw that it matched the number, for some reason it did not change it to the desired string. Very strange! – tupac shakur Sep 04 '18 at 10:56
  • If you're able to copy the code you used here it shouldn't (hopefully) be hard to debug. – Bav Malhi Sep 04 '18 at 13:47
  • That's the problem: I debugged my code, and my regex actually matches what it is supposed to match, but when I want to replace the strings it doesn't do anything... – tupac shakur Sep 04 '18 at 14:03
  • Are you assigning re.sub(pattern, replacement, original_string) to an object? For example: line = re.sub('(?<=sid=\\\")\d+(?=\\\")', replace_with, line) – Bav Malhi Sep 04 '18 at 14:09
  • Unless you can provide the exact code and similar data such that the error is reproducible, I'm not sure what else to suggest. – Bav Malhi Sep 06 '18 at 13:30