"foldGroup.registerImage({ domId: 'listimg7', srcUrl: 'https://ec.yimg.com/ec/?url=https%3A%2F%2Fd3vv6xw699rjh3.cloudfront.net%2F9f689b-1904037587_1_160.jpg&t=1460964135&ttl=43200&maxWidth=160&maxHeight=160&sig=QSY1BP0sCebMxqEN6irjXQ--~C' });"

This is part of the HTML from a Yahoo Shopping page, such as:
https://shopping.yahoo.com/womens-intimate-apparel/?b=3937

How can I find all the image URLs using Python's re.findall()?

martineau
mingxin zhao

1 Answer

re.findall(r"'https://.*?'", part_of_html)

re.findall(pattern, string, flags=0): Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
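As a rough sketch of how the group behaviour described above can help: the pattern without a group returns each match with its surrounding single quotes, while wrapping the URL in one capture group returns just the URL. The `srcUrl:` prefix in the second pattern and the idea that the original image address sits in the `url` query parameter are assumptions based only on the one snippet in the question, so the real page may need a different pattern.

```python
import re
from urllib.parse import urlparse, parse_qs

# Sample copied from the question; in practice this would be the full page source.
part_of_html = (
    "foldGroup.registerImage({ domId: 'listimg7', srcUrl: "
    "'https://ec.yimg.com/ec/?url=https%3A%2F%2Fd3vv6xw699rjh3.cloudfront.net"
    "%2F9f689b-1904037587_1_160.jpg&t=1460964135&ttl=43200"
    "&maxWidth=160&maxHeight=160&sig=QSY1BP0sCebMxqEN6irjXQ--~C' });"
)

# No group in the pattern: each result is the whole match, quotes included.
quoted = re.findall(r"'https://.*?'", part_of_html)

# One capture group: findall returns only the group, so the quotes are dropped.
# Assumption: every image URL follows a "srcUrl:" key, as in the sample above.
img_urls = re.findall(r"srcUrl:\s*'(https?://[^']+)'", part_of_html)

# Assumption: the original image address is carried in the "url" query parameter.
# parse_qs also undoes the percent-encoding (%3A, %2F, ...).
original_urls = [parse_qs(urlparse(u).query)["url"][0] for u in img_urls]

print(quoted)
print(img_urls)
print(original_urls)
```

If the quotes are not wanted, the capture-group version avoids a separate strip step; otherwise `[m.strip("'") for m in quoted]` would do the same job on the first result.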

su79eu7k