Hi, I know I can use `egrep -w 'The|the' f3.txt` to display all the lines containing the word "the", but I want to display only the lines whose first word is "The" or "the".

1 Answer

Use the following regular expression:

grep -E '^(The|the)\s' texto.txt

For lines that also end with a period:

grep -E '^(The|the)\s.+\.$' texto.txt

The ^ anchors the match to the beginning of the line, so the word must be the first thing on it.
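
As a rough illustration of what the anchor changes, here is a minimal sketch run against a made-up file (`sample.txt` and its contents are assumptions, not taken from the question). It swaps `\s` for `[[:space:]]` so that it stays within POSIX ERE, and it also accepts a line that consists of the word alone.

```shell
# Hypothetical sample input, written to a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'The cat sat.' \
  'the dog ran' \
  'Theo went home.' \
  'See the bird.' \
  'The' > "$tmpdir/sample.txt"

# -w on its own matches the word anywhere on the line.
anywhere=$(grep -E -w 'The|the' "$tmpdir/sample.txt")

# ^ anchors to the start of the line; ([[:space:]]|$) requires
# whitespace or end of line right after the word, which rules out "Theo".
first_word=$(grep -E '^(The|the)([[:space:]]|$)' "$tmpdir/sample.txt")

printf 'anywhere:\n%s\n\n' "$anywhere"
printf 'first word only:\n%s\n' "$first_word"

rm -rf "$tmpdir"
```

With this input, the unanchored search also returns "See the bird.", while the anchored one keeps only the lines that begin with the word.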

Valdeir Psr
  • `grep -P` is extremely nonportable (a GNU-only option that's only available if linked with libpcre, and so isn't always available even when guaranteed to be using modern GNU grep). No reason to provide an answer that's restricted to it when one compatible with POSIX grep is possible. – Charles Duffy Dec 27 '17 at 04:38
  • Also, this will match `Theo` -- it isn't performing a word-level match. – Charles Duffy Dec 27 '17 at 04:38
  • Better, as amended, but `\s` isn't available in standard ERE either. Some libc implementations may provide it as an extension, but the safe option is to use `[[:space:]]` -- or, more appropriately, `([[:space:]]|$)` (since `The` is still a word even if it's the *only* word on a line). – Charles Duffy Dec 27 '17 at 04:41
  • (Also, right now, the `\s` is only on the `the` branch, not on the `The` branch). – Charles Duffy Dec 27 '17 at 04:44
  • And what if I wanted to display all the lines that contain a "." at the end? – BetaCrasher Dec 27 '17 at 04:45
  • Btw, I also didn't understand what the `-E` does in the first command. – BetaCrasher Dec 27 '17 at 04:46
  • @BetaCrasher, the `-E` specifies ERE syntax ("extended regular expressions"), not default BRE syntax. ERE is generally the right thing -- it's (unlike PCRE) guaranteed to be available everywhere, and (unlike BRE) it's not awful. – Charles Duffy Dec 27 '17 at 04:47
  • That said, this is likely not to work on several platforms on account of the use of (PCRE-only) `\s`, which *really* should be `[[:space:]]`. – Charles Duffy Dec 27 '17 at 04:47
  • @BetaCrasher, ...as for searching for a `.` at the end, that would be something like `grep -E '[.]$'` – Charles Duffy Dec 27 '17 at 04:48
  • Why use square brackets here when you used round ones in the first command? Does it not make a difference? – BetaCrasher Dec 27 '17 at 04:49
  • BTW, `egrep` is exactly the same thing as `grep -E`. – Charles Duffy Dec 27 '17 at 04:50
  • @BetaCrasher, the square brackets set up a character class -- that is to say, an expression with square brackets will match exactly one character unless modified by something that changes the number of times it's matched. A character class containing only `.` is a way to match a period that is less sensitive to escaping issues than backslash-escaping `'\.'`. (Backslashes can get messy, especially when nesting different kinds of quoting contexts; avoiding them is cleaner). – Charles Duffy Dec 27 '17 at 04:50
  • Thank you, this helped very much. – BetaCrasher Dec 27 '17 at 04:53
  • @BetaCrasher, ...so, to be clear, the bracket type *absolutely* makes a difference. In `(.)`, it's matching any character (`.` is a wildcard in regex). In `[.]`, it's matching only one character, the period. Similarly, `[ao]` matches one character, either an `a` or an `o`, whereas `(ao)` only matches the string `ao` in that exact order. – Charles Duffy Dec 27 '17 at 04:53
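
The bracket-versus-parenthesis distinction discussed in the comments can be tried out with a small sketch like the one below. The sample lines and variable names are made up for illustration; only `grep -E` and the patterns themselves come from the thread.

```shell
# Hypothetical sample lines, one per line.
lines=$(printf '%s\n' 'It ends here.' 'No period' 'a.b' 'ao' 'oa')

# [.]$ is a character class holding only a period: it matches a literal "." at end of line.
ends_with_dot=$(printf '%s\n' "$lines" | grep -E '[.]$')

# (.)$ is a group around the wildcard: any final character matches.
ends_with_any=$(printf '%s\n' "$lines" | grep -E '(.)$')

# ^[ao] matches lines starting with a single "a" or "o";
# ^(ao) matches only lines starting with the exact string "ao".
class_count=$(printf '%s\n' "$lines" | grep -c -E '^[ao]')
group_count=$(printf '%s\n' "$lines" | grep -c -E '^(ao)')

printf 'ends with a period:\n%s\n\n' "$ends_with_dot"
printf 'ends with any character:\n%s\n\n' "$ends_with_any"
printf 'lines starting with [ao]: %s\n' "$class_count"
printf 'lines starting with (ao): %s\n' "$group_count"
```

Using `[.]` instead of `\.` also avoids a backslash, which tends to matter once the pattern is nested inside other quoting contexts.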