0

I'm fairly new to Python and looking for help with this. I have a string that contains XML content, and I need to strip the whitespace between the tags.

<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>

Afterwards it should look like:

<SIMPLE_RETURN><RESPONSE><DATETIME>2018-05-09T12:47:24Z</DATETIME><CODE>2014</CODE><TEXT>Too many concurrent login(s)</TEXT></RESPONSE></SIMPLE_RETURN>

I would appreciate any help!

Shenali Silva
    Possible duplicate of [Remove whitespaces in XML string](https://stackoverflow.com/questions/3310614/remove-whitespaces-in-xml-string) – Bill the Lizard May 09 '18 at 13:10

6 Answers

1

If you don't want to use regex, you could do the following. It seems easier for someone new to follow, though I don't know whether it is the best way. Note that it removes every space in the string, including the spaces inside the element text, as the output shows.

my_str = '<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>'
new_str = ''
for character in my_str:
    if character != ' ':
        new_str = new_str + character

And then, if you do:

print(new_str)

the output is:

<SIMPLE_RETURN><RESPONSE><DATETIME>2018-05-09T12:47:24Z</DATETIME><CODE>2014</CODE><TEXT>Toomanyconcurrentlogin(s)</TEXT></RESPONSE></SIMPLE_RETURN>

A second way I can come up with is this:

new_str = ''.join(my_str.split())

This splits my_str on whitespace (spaces, tabs, and newlines) and joins the resulting pieces with nothing in between. For this input the printed output is the same.
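If the spaces inside the text should survive and regex is still off the table, something along these lines might work. This is only a sketch: strip_between_tags is a name made up for illustration, and it assumes that no '<' or '>' appears inside attribute values, comments, or CDATA sections.

```python
def strip_between_tags(xml_text):
    """Drop whitespace-only runs that follow a '>'; keep all other text."""
    cleaned = []
    for piece in xml_text.split("<"):
        # Each piece looks like "TAG>text", where text runs up to the next '<'.
        tag, sep, text = piece.partition(">")
        if sep and text.strip() == "":
            text = ""  # only whitespace after the tag, so discard it
        cleaned.append(tag + sep + text)
    return "<".join(cleaned)


my_str = '<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>'
print(strip_between_tags(my_str))
```

Here the text "Too many concurrent login(s)" keeps its spaces, because only runs made up entirely of whitespace are discarded.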

Hope this helps, but again, I am not aware if these are the best ways to do it.

Dora
1

Another way to do it:

k = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"
k.replace(" ","")
'<SIMPLE_RETURN><RESPONSE><DATETIME>2018-05-09T12:47:24Z</DATETIME><CODE>2014</CODE><TEXT>Toomanyconcurrentlogin(s)</TEXT></RESPONSE></SIMPLE_RETURN>'
Christoffer
0

Use regex.

Ex:

import re
s = """<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"""
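# Raw strings avoid invalid-escape warnings; the groups keep the '>' and '<' around the removed whitespace.
print(re.sub(r"(>)\s+(<)", r"\g<1>\g<2>", s))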
Rakesh
0

You can use the re.sub function:

import re

string = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"

# Replace runs of spaces between a closing '>' and the next '<'.
result = re.sub(r'> +<', '><', string)
print(result)
Yassine Faris
0

Here you go:

import re
xml_str = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"

# Removes whitespace after every '>' (xml_str avoids shadowing the built-in str).
xml_str = re.sub(r">\s+", ">", xml_str)
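For comparison, here is a short sketch of how a pattern anchored only on '>' differs from one that also requires a following '<'. The sample string and variable names are made up for illustration.

```python
import re

# Hypothetical sample whose element text has leading and trailing spaces.
sample = "<A>  <B> padded text </B>  </A>"

# Anchoring only on '>' also removes whitespace at the start of element text.
after_gt_only = re.sub(r">\s+", ">", sample)

# Requiring '<' right after the whitespace touches only the gaps between tags.
between_tags = re.sub(r">\s+<", "><", sample)

print(after_gt_only)
print(between_tags)
```

For the string in the question both patterns give the same result, since none of its text starts with whitespace; the difference only matters when it does.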
Deepak Dixit
-1

I think it's fairly simple. You just need a regex that matches the whitespace between the tags.

import re

string = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"
# Replace any whitespace run sitting between '>' and '<' with nothing.
string = re.sub(r">(\s+)<", "><", string)
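As an alternative to regex, the standard library's xml.etree.ElementTree can parse the string so that only whitespace-only text nodes are dropped. This is a different technique from the one above, offered as a sketch: drop_blank_text is a hypothetical helper name, and because the result is re-serialized by ElementTree, details such as attribute quoting or the form of empty elements may differ from the input.

```python
import xml.etree.ElementTree as ET


def drop_blank_text(xml_text):
    """Parse xml_text, clear whitespace-only text nodes, and serialize it again."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # elem.text is the text right after the opening tag;
        # elem.tail is the text right after the closing tag.
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        if elem.tail is not None and not elem.tail.strip():
            elem.tail = None
    return ET.tostring(root, encoding="unicode")


xml_in = "<SIMPLE_RETURN>  <RESPONSE>    <DATETIME>2018-05-09T12:47:24Z</DATETIME>    <CODE>2014</CODE>    <TEXT>Too many concurrent login(s)</TEXT>  </RESPONSE></SIMPLE_RETURN>"
print(drop_blank_text(xml_in))
```

Unlike a regex, this raises xml.etree.ElementTree.ParseError on malformed input, which may or may not be what you want.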
Akshay Apte