Ansible: extracting a string between two strings

Question

I have an HTML file that contains the following somewhere in the middle:

<span dir="ltr">http:(...).com</span>

I'm trying to extract the URL, but I'm having trouble doing so. Because that "ltr" is the only one in the HTML, I came up with this regex:

(?<=ltr">)(.*)(?=<\/span>)

Using regex101, I confirmed that the regex works. However, I suspect that the way Ansible handles single and double quotes is causing the problem.

I'm trying it like this:

    - set_fact:
       regex_test: " {{ htmlres.content | regex_search('(?<=ltr">)(.*)(?=<\/span>)') }}"

Here, htmlres.content is the HTML content returned by an HTTP GET request made earlier in the same playbook. However, running it:

    - set_fact:
       regex_pubdest: " {{ htmlres.content | regex_search('(?<=ltr">)(.*)(?=<\/span>)' }}"
                                                                    ^ here
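One likely cause, assuming the task is written exactly as shown: the value is a YAML double-quoted string, so the `"` inside the regex ends that string early, before Jinja2 ever sees the expression. A minimal sketch of one way around this is a folded block scalar (`>-`), in which YAML passes quotes and backslashes through untouched. Two changes here are assumptions, not part of the original task: the `\/` is written as a plain `/` (a slash needs no escaping in Python regexes, which `regex_search` uses), and `.*` is made non-greedy so the match stops at the first `</span>`.

```yaml
- name: Extract the URL from the span (sketch)
  ansible.builtin.set_fact:
    # The folded block scalar (>-) keeps " and \ literal, so the double quote
    # inside the regex no longer terminates the YAML string.
    regex_test: >-
      {{ htmlres.content | regex_search('(?<=ltr">)(.*?)(?=</span>)') }}
```

Because the lookbehind and lookahead are not part of the match, `regex_search` called without extra arguments should return just the text between `ltr">` and `</span>`.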

Is there any way to get around this issue with quotes in a regex in Ansible? I've managed to get the desired output by doing something slightly different:

 shell:  grep -oP 'ltr">\K.*?(?=</span>)' /dir/htmlcontent.txt

The problem is that this only works when reading from a file, and I'd like to avoid saving the HTML content to a file before running a regex over it. I tried replacing the file path in the grep command with "{{html.content}}", but that causes Ansible to fail because of the quotes.
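If the grep approach is preferred, one possible sketch is to feed the content to grep on standard input rather than substituting it into the command line, using the `stdin` parameter of the shell module. This assumes GNU grep with `-P` support on the target host; the `grep_result` variable name is hypothetical.

```yaml
- name: Extract the URL with grep, reading the HTML from stdin (sketch)
  ansible.builtin.shell:
    # No file path: grep reads the content supplied through stdin below.
    cmd: grep -oP 'ltr">\K.*?(?=</span>)'
    stdin: "{{ htmlres.content }}"
  register: grep_result   # hypothetical name; the match ends up in grep_result.stdout
  changed_when: false     # grep only reads, so never report a change
```

Since the content never appears in the command string, its quotes cannot interfere with the shell quoting.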

Any ideas?

Thank you!

What happens when you are "running it"? You just reposted the same task again. I think you omitted some important text. — Michael Hampton, Aug 18 '21 at 01:11
