-2

As simple as I can make it:

I have written a python script to extract embed links from an api. I can easily return a list of embed links similar to this:

[<embed>www.example.com/embed/4657889</embed>, <embed>www.example1.com/embed/789465/</embed>, <embed>www.example2.com/embed/132456/</embed>]

But what I would like to do next is take this returned list and replace every <embed> with <embed src=" as well as replace every </embed> with "> ultimately creating a new list that looks like this:

[<embed src="www.example.com/embed/4567889/>, <embed src="www.example1.com/embed/789456/>, <embed src="www.example.com/embed/123456/>]

But as you can see, the word 'embed' is also in the url itself so I have to make sure not to touch that use of the word. I've tried replace(), trip(), a for loop, all with no luck. Anyone have any ideas of how I could implement this? Thank you ahead of time and hope everyone is staying healthy!

Jan
  • 42,290
  • 8
  • 54
  • 79
WBR
  • 1
  • 1

1 Answers1

0

Use a regular expression

import re

lst = ["<embed>www.example.com/embed/4657889</embed>",
      "<embed>www.example1.com/embed/789465/</embed>",
      "<embed>www.example2.com/embed/132456/</embed>"]

rx = re.compile(r'<embed>(.+?)</embed>')

new_lst = [rx.sub(r'<embed src="\1">', item) for item in lst]
print(new_lst)

Which yields

['<embed src="www.example.com/embed/4657889">', '<embed src="www.example1.com/embed/789465/">', '<embed src="www.example2.com/embed/132456/">']
Jan
  • 42,290
  • 8
  • 54
  • 79
  • Or without regex: `'www.example.com/embed/4567889/'.replace('', '')` – Matthias Apr 08 '20 at 18:53
  • Why use regex instead of an actual XML parser? – AMC Apr 08 '20 at 19:00
  • @AMC: Imo, a parser is a bit overkill when OP has already the values in question in a list. But nothing wrong in using a parser, no. – Jan Apr 08 '20 at 19:01
  • That's true, although my assumption was that they would rewrite the rest of their program to make use of the XML parser. – AMC Apr 08 '20 at 19:02
  • @AMC: That would be the better alternative, yes. – Jan Apr 08 '20 at 19:03
  • Wow, that worked like a charm!! Can't thank you enough. I guess I have a new chore. To go do some studying on python regex! – WBR Apr 08 '20 at 19:41
  • @WBR: Please upvote/accept the answer if it helped you. – Jan Apr 08 '20 at 19:42