I'm not very good with regular expressions, so I'd like your help with an expression that replaces only the content between the double quotes of the src attribute of an HTML img tag. This is what I tried:

TRegEx.Replace(Str, '(?<=<img\s+[^>]*?src=(?<q>[""]))(?<url>.+?)(?=\k<q>)', 'Nova string');

In other words: src="old content" => src="new content"

I saw the expression above in a question on the same subject for C#, but it doesn't work in Delphi.

So, how can I do this?

Thanks in advance.

2 Answers

Regex:

<img(.*?)src="(.*?)"(.*?)/>

Substitution:

<img$1src="NEW VALUE"$3/>

You can use something like:

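// Requires System.RegularExpressions (TRegEx) and
// System.RegularExpressionsCore (ERegularExpressionError) in the uses clause.
var
    ResultString: string;
begin
    ResultString := '';
    try
        // $1 and $3 re-insert the text captured before and after the src attribute
        ResultString := TRegEx.Replace(SubjectString, '<img(.*?)src="(.*?)"(.*?)/>', '<img$1src="NEW VALUE"$3/>', [roIgnoreCase, roMultiLine]);
    except
        on E: ERegularExpressionError do begin
            // Syntax error in the regular expression
        end;
    end;
end;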

Regex Explanation:

<img(.*?)src="(.*?)"(.*?)/>

Options: Case insensitive; Exact spacing; Dot doesn’t match line breaks; ^$ match at line breaks; Numbered capture; Allow zero-length matches

Match the character string “<img” literally (case insensitive) «<img»
Match the regex below and capture its match into backreference number 1 «(.*?)»
   Match any single character that is NOT a line break character (line feed, carriage return, form feed, vertical tab, next line, line separator, paragraph separator) «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character string “src="” literally (case insensitive) «src="»
Match the regex below and capture its match into backreference number 2 «(.*?)»
   Match any single character that is NOT a line break character (line feed, carriage return, form feed, vertical tab, next line, line separator, paragraph separator) «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “"” literally «"»
Match the regex below and capture its match into backreference number 3 «(.*?)»
   Match any single character that is NOT a line break character (line feed, carriage return, form feed, vertical tab, next line, line separator, paragraph separator) «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character string “/>” literally «/>»

<img$1src="NEW VALUE"$3/>

Insert the character string “<img” literally «<img»
Insert the text that was last matched by capturing group number 1 «$1»
Insert the character string “src="NEW VALUE"” literally «src="NEW VALUE"»
Insert the text that was last matched by capturing group number 3 «$3»
Insert the character string “/>” literally «/>»
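
As explained above, the pattern only matches tags that end in "/>", and the lazy (.*?) groups can run past the end of a tag if no src attribute is found inside it. A minimal sketch of a stricter variant follows, assuming the attribute value is always wrapped in double quotes and that no attribute value contains a ">" character. The names ImgSrcPattern and ReplaceImgSrc are hypothetical and not part of the original answer.

```delphi
program ReplaceImgSrcSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils,
  System.RegularExpressions;

const
  // Group 1: "<img", any attributes that stay inside the tag ([^>]*?),
  // then whitespace and "src=". The quoted value itself is matched by
  // "[^"]*" and is not captured, because it is the part being replaced.
  ImgSrcPattern = '(<img\b[^>]*?\ssrc\s*=\s*)"[^"]*"';

// Hypothetical helper: replaces the src value of every matching img tag.
// Assumption: NewValue contains no "$", "\" or double-quote characters,
// since "$" and "\" have a special meaning in the replacement string.
function ReplaceImgSrc(const Html, NewValue: string): string;
begin
  Result := TRegEx.Replace(Html, ImgSrcPattern,
    '$1"' + NewValue + '"', [roIgnoreCase]);
end;

begin
  // Tag closed with ">" only, src is not the first attribute
  Writeln(ReplaceImgSrc('<p><img alt="x" src="old.png"></p>', 'new.png'));
  // Self-closing tag, as in the original pattern
  Writeln(ReplaceImgSrc('<img src="old.png" />', 'new.png'));
end.
```

Because the closing part of the tag is never matched, it does not need to be captured and re-inserted, so the same call is intended to cover both ">" and "/>" endings.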

Regex101 Demo

Pedro Lobito

SOLUTION:

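    // IHTMLDocument2, IHTMLImgElement and IHTMLElementCollection are declared in the MSHTML unit.
    procedure CHANGE_IMAGES(Document: IHTMLDocument2);
    var
      I: Integer;
      HTMLImgElement: IHTMLImgElement;
      HTMLElementCollection: IHTMLElementCollection;
    begin
      // All <img> elements of the parsed document
      HTMLElementCollection := Document.images;

      for I := 0 to HTMLElementCollection.length - 1 do
      begin
        HTMLImgElement := (HTMLElementCollection.item(I, 0) as IHTMLImgElement);
        HTMLImgElement.src := 'My_IMAGE_PATH_OR_URL';
        Exit; // Stops after the first image; remove this line to update every image
      end;
    end;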