I'm trying to extract all URLs from a text, and Recognize entities works reasonably well with "Entity type" = "URL". However, it fails when the URL contains certain special characters such as ' or ç.
With this input:
<url><loc>https://www.example.com/what's-this-long-text-willbetruncated</loc></url>
<url><loc>https://www.example.com/françois-is-here-and-not-there</loc>
the results are:
https://www.example.com/what
https://www.example.com/fran
I tried changing the "language" setting of the Recognize Entities function, but that didn't help at all.
Do I have to fall back to a find with regex? I've learned that can be a bit of a pain for URLs. Thank you.
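
For reference, here is a rough sketch of what the regex route could look like, written in Python purely for illustration (it is not the syntax of the Recognize Entities function). The pattern and the `extract_urls` helper are my own assumptions: the idea is to take everything after `http://` or `https://` up to the next whitespace, angle bracket, or double quote, so characters like ' and ç stay part of the match. It is not a full URL validator and would need adjusting for text where URLs are followed by punctuation.

```python
import re

# Sample input copied from the question above.
text = (
    "<url><loc>https://www.example.com/what's-this-long-text-willbetruncated</loc></url>\n"
    "<url><loc>https://www.example.com/françois-is-here-and-not-there</loc>"
)

# Assumed pattern: scheme, then any run of characters that are not
# whitespace, '<', '>' or '"'. Apostrophes and accented letters are allowed.
URL_PATTERN = re.compile(r"https?://[^\s<>\"]+")


def extract_urls(source: str) -> list[str]:
    """Return every substring of `source` that matches URL_PATTERN."""
    return URL_PATTERN.findall(source)


for url in extract_urls(text):
    print(url)
```

Because the match stops at `<`, the closing `</loc>` tag is not swallowed, and both sample URLs should come back untruncated.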