python script to strip white spaces

Question

I'm fairly new to Python and looking for some help. I have a string that contains XML content, and I need to strip the white space between the tags:

<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>

Afterwards it should look like:

<SIMPLE_RETURN><RESPONSE><DATETIME>2018-05-09T12:47:24Z</DATETIME><CODE>2014</CODE><TEXT>Too many concurrent login(s)</TEXT></RESPONSE></SIMPLE_RETURN>

I'd appreciate it if anyone can help!

Possible duplicate of [Remove whitespaces in XML string](https://stackoverflow.com/questions/3310614/remove-whitespaces-in-xml-string) — Bill the Lizard, May 09 '18 at 13:10

score 1 · Answer 1 · answered May 09 '18 at 13:32

If you don't want to use regex, you could do this. (It also looks easier to me for someone new to understand how it works, but I don't know whether this is the best way to do it.)

my_str = '<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>'
new_str = ''
for character in my_str:
    if character != ' ':
        new_str = new_str + character

And then, if you do:

print(new_str)

the output is (note that the spaces inside the TEXT element are removed as well):

<SIMPLE_RETURN><RESPONSE><DATETIME>2018-05-09T12:47:24Z</DATETIME><CODE>2014</CODE><TEXT>Toomanyconcurrentlogin(s)</TEXT></RESPONSE></SIMPLE_RETURN>

A second way I can come up with is this:

new_str = ''.join(my_str.split())

It says 'split my_str at every run of whitespace (spaces, tabs, newlines) and then join the resulting pieces with nothing in between'. The output of print is the same.

Hope this helps, but again, I don't know whether these are the best ways to do it.
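Both snippets above remove every space, including the ones inside the TEXT element, which is not quite what the question asks for. A minimal regex-free sketch that only drops whitespace sitting directly between a '>' and the next '<' might look like this. The function name strip_between_tags is made up for illustration, and the approach assumes the string has no comments, CDATA sections, or attribute values containing '>':

```python
def strip_between_tags(xml_text):
    """Remove whitespace found directly between '>' and '<', without regex."""
    cleaned = []
    for piece in xml_text.split(">"):
        stripped = piece.lstrip()
        # Leading whitespace followed by '<' (or by nothing, at the very end)
        # is padding between tags; anything else is element text, kept as-is.
        if stripped.startswith("<") or stripped == "":
            cleaned.append(stripped)
        else:
            cleaned.append(piece)
    return ">".join(cleaned)


my_str = '<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>'
print(strip_between_tags(my_str))
```

With the sample string, the text "Too many concurrent login(s)" should keep its spaces while the padding between tags disappears.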

score 1 · Answer 2 · answered May 09 '18 at 14:16

Another way to do it:

>>> k = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"
>>> k.replace(" ", "")
'<SIMPLE_RETURN><RESPONSE><DATETIME>2018-05-09T12:47:24Z</DATETIME><CODE>2014</CODE><TEXT>Toomanyconcurrentlogin(s)</TEXT></RESPONSE></SIMPLE_RETURN>'

score 0 · Accepted Answer · answered May 09 '18 at 13:14

Use regex.

Example:

import re
s = """<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"""
# Raw strings keep the backslashes in \s and \g<1> intact for the regex engine.
print(re.sub(r"(>)\s+(<)", r"\g<1>\g<2>", s))

score 0 · Answer 4 · answered May 09 '18 at 13:15

You can use the re.sub function:

import re

string = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"

result = re.sub(r'> +<', '><', string)
print(result)
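The pattern `> +<` only matches literal space characters. That is enough for the string in the question, but it would probably leave pretty-printed XML untouched. A small sketch of the difference, using a made-up multi-line input (the variable names are illustrative only):

```python
import re

# Hypothetical pretty-printed input: tags separated by newlines, tabs and spaces.
pretty = "<RESPONSE>\n\t<CODE>2014</CODE>\n  <TEXT>Too many concurrent login(s)</TEXT>\n</RESPONSE>"

# ' +' matches spaces only, so a newline right after '>' blocks the match.
spaces_only = re.sub(r"> +<", "><", pretty)

# '\s+' matches any run of whitespace, including newlines and tabs.
any_whitespace = re.sub(r">\s+<", "><", pretty)

print(spaces_only == pretty)
print(any_whitespace)
```

If the XML may span several lines, `\s+` is likely the safer choice.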

score 0 · Answer 5 · answered May 09 '18 at 13:39

Here you go:

import re
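xml_str = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"

# Note: this strips whitespace after every '>', including any at the start of element text.
xml_str = re.sub(r">\s+", ">", xml_str)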
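The pattern above removes whitespace after every '>', whether or not a '<' follows. For the string in the question the result is the same, but element text that starts with spaces would lose them. A short sketch with a made-up input (variable names are illustrative) shows how this differs from a pattern that also requires the next '<':

```python
import re

# Hypothetical input whose TEXT element begins with two meaningful spaces.
padded = "<RESPONSE>  <TEXT>  indented message</TEXT>  </RESPONSE>"

# Strips whitespace after any '>', so the leading spaces of the text go too.
after_any_tag = re.sub(r">\s+", ">", padded)

# Strips whitespace only when it sits between '>' and '<'.
between_tags = re.sub(r">\s+<", "><", padded)

print(after_any_tag)
print(between_tags)
```

Which behaviour is wanted depends on whether leading whitespace in the text is significant for the consumer of the XML.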

Deepak Dixit

score -1 · Answer 6 · Akshay Apte · 2018-05-09T13:17:44.733

I think it's fairly simple. You just need a regex that matches the whitespace between the tags:

import re

string = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"
string = re.sub(r">\s+<", "><", string)
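As an alternative to regex (not something proposed in this answer), the string could be parsed and re-serialized with the standard library's xml.etree.ElementTree. This is only a sketch: the function name strip_with_parser is made up, and it assumes the input is well-formed XML and that whitespace-only text between tags carries no meaning:

```python
import xml.etree.ElementTree as ET


def strip_with_parser(xml_text):
    """Drop whitespace-only text and tail nodes, then serialize back to a string."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # .text is the content before the first child; .tail follows the end tag.
        if element.text is not None and not element.text.strip():
            element.text = None
        if element.tail is not None and not element.tail.strip():
            element.tail = None
    # encoding="unicode" makes tostring return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


sample = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"
print(strip_with_parser(sample))
```

A parser-based approach fails loudly on malformed input, which may or may not be desirable when the string comes from an external service.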

edited May 09 '18 at 13:17

answered May 09 '18 at 13:10

