
I have the following text file:

a
a

I am trying to match lines that start with a using the following command: pcregrep -M '^a'. It matches only the first a and not the second. Does anyone know why? I am using pcregrep because this is a simplified version of a problem that I will extend to more complex scenarios later.

Thanks!

UPDATE

The reason is that my file was created on Mac OS and uses a carriage return (\r) for every line ending. Because of this, pcregrep treats the file contents (a\ra) as a single line, and my regex matches only the first a, because ^ anchors at the start of that one line. The solution, with pcregrep, is to specify the newline type, meaning the character sequence that the regex engine treats as the end of a line. If we specify that the newline type is carriage return (\r), pcregrep interprets the file's contents as two lines and matches and returns both.
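To make the effect concrete, here is a rough Python sketch of the same idea. It uses Python's re module rather than PCRE, and simply splits the text by hand on each candidate line terminator, so treat it as an analogy for what the newline type changes and not as a description of pcregrep's internals.

```python
import re

# The file contents described above: two "a" lines separated by a bare CR.
data = "a\ra"

# If LF is the only line terminator, the whole text is one line.
lf_lines = data.split("\n")
lf_matches = [line for line in lf_lines if re.match(r"a", line)]

# If CR is the line terminator, the text splits into two lines,
# and both of them start with "a".
cr_lines = data.split("\r")
cr_matches = [line for line in cr_lines if re.match(r"a", line)]

print(len(lf_lines), len(lf_matches))
print(len(cr_lines), len(cr_matches))
```

The pattern is the same in both cases; only the definition of "line" changes, which is what -N controls.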

The fixed version of my command is pcregrep -M -N CR '^a', where -N CR means "newline type is carriage return".

gkeenley

1 Answer


Since you created the text file on macOS, its line endings are represented with the CR (carriage return) character (\r, \x0D, character code 13 in the ASCII table).

By default, pcregrep and similar tools treat \n, the LF (line feed) character, as the line break.

You should tell pcregrep to treat CR as the line break using the -N option:

pcregrep -o -N CR '^a' file
            ^^^^^
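If you are not sure which line ending a file uses, a quick check before choosing the -N value can help. The following is a minimal sketch in Python; detect_newline is a hypothetical helper written for this illustration (not part of pcregrep or any library), and it just counts terminators in the raw bytes and reports the most common one.

```python
def detect_newline(data: bytes) -> str:
    """Guess the dominant line ending in raw file bytes (illustrative helper)."""
    crlf = data.count(b"\r\n")
    # Bare CR and bare LF counts exclude the ones that belong to a CRLF pair.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    counts = {"CRLF": crlf, "CR": cr, "LF": lf}
    best = max(counts, key=counts.get)
    # "NONE" is this sketch's own label for text with no line terminator at all.
    return best if counts[best] > 0 else "NONE"


print(detect_newline(b"a\ra"))
```

For a real file you would pass in the bytes read in binary mode, e.g. open(path, "rb").read(), so that Python does not translate the line endings first.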
Wiktor Stribiżew