It seems that in your environment the PCRE library was compiled without the PCRE_NEWLINE_ANY option, so $ in multiline mode only matches before the LF symbol and . matches any symbol but LF.
You can fix it by using the PCRE (*ANYCRLF) verb:
'~(*ANYCRLF)\S+(?=\*$)~m'
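As a rough illustration of the fix, the sketch below runs the pattern with and without the verb on a made-up string with Windows-style line endings. The sample text and variable names are assumptions for the example, not taken from the question. The result without the verb depends on how your PCRE was built, so only the result with the verb is shown.

```php
<?php
// Hypothetical sample input with CRLF line endings.
$text = "foo*\r\nbar\r\nbaz*";

// Without the verb: the result depends on the newline convention
// PCRE was compiled with (on an LF-only build, "foo" is missed
// because $ cannot match right before the CR).
preg_match_all('~\S+(?=\*$)~m', $text, $default);

// With the verb: CR, LF and CRLF are all treated as line breaks,
// so $ matches right before the "\r\n" as well.
preg_match_all('~(*ANYCRLF)\S+(?=\*$)~m', $text, $anycrlf);

print_r($anycrlf[0]); // Array ( [0] => foo [1] => baz )
```

Note that the verb has to be placed at the very start of the pattern, right after the opening delimiter.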
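(*ANYCRLF) specifies a newline convention in which any of CR, LF or CRLF is treated as a line break (the combination of (*CR), (*LF) and (*CRLF)), and is equivalent to the PCRE_NEWLINE_ANYCRLF option. See the PCRE documentation:
PCRE_NEWLINE_ANYCRLF specifies that any of the three preceding sequences should be recognized. Setting PCRE_NEWLINE_ANY specifies that any Unicode newline sequence should be recognized.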
In the end, this PCRE verb makes . match any character except CR and LF, and makes $ match right before either of these two characters.
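A small sketch of that effect, using made-up input strings (the strings and variable names are assumptions for illustration only):

```php
<?php
// With (*ANYCRLF) the dot stops at a CR as well as at an LF.
preg_match('~(*ANYCRLF).+~', "one\rtwo", $dot);
echo $dot[0], "\n"; // one

// In multiline mode, $ matches before a CR, before an LF,
// and at the very end of the subject.
preg_match_all('~(*ANYCRLF)\w+$~m', "one\rtwo\nthree", $ends);
print_r($ends[0]); // Array ( [0] => one [1] => two [2] => three )
```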
See more about this and other verbs at rexegg.com:
By default, when PCRE is compiled, you tell it what to consider to be a line break when encountering a . (as the dot it doesn't match line breaks unless in dotall mode), as well the ^ and $ anchors' behavior in multiline mode. You can override this default with the following modifiers:
✽ (*CR): Only a carriage return is considered to be a line break
✽ (*LF): Only a line feed is considered to be a line break (as on Unix)
✽ (*CRLF): Only a carriage return followed by a line feed is considered to be a line break (as on Windows)
✽ (*ANYCRLF): Any of the above three is considered to be a line break
✽ (*ANY): Any Unicode newline sequence is considered to be a line break
For instance, (*CR)\w+.\w+ matches Line1\nLine2 because the dot is able to match the \n, which is not considered to be a line break. See the demo.
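The (*CR) example can be reproduced in PHP along these lines. This is a minimal sketch: the variable names are arbitrary, and the (*LF) run is added only for contrast.

```php
<?php
$subject = "Line1\nLine2";

// (*CR): only a carriage return is a line break, so the dot may match "\n".
preg_match('~(*CR)\w+.\w+~', $subject, $cr);
var_dump($cr[0] === "Line1\nLine2"); // bool(true)

// (*LF): the dot cannot cross "\n", so the match stays within the first line.
preg_match('~(*LF)\w+.\w+~', $subject, $lf);
echo $lf[0], "\n"; // Line1
```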