I am trying to use grep "regex" path_to_file, and this is my approach:

grep "(\/bbcswebdav\/.*_2)" file.txt

(escaping / is what regex101 prompted me to do)

There were no outputs in the console.

However, when I tested the regular expression on regex101, I received my desired output.

The desired results here are all strings from /bbcswebdav/ to _2, e.g. /bbcswebdav/pid-816174-dt-content-rid-8436387_2/xid-8436387_2

But when I drop the parentheses and run

grep "/bbcswebdav/.*_2" file.txt

the output is very messy, in this case: <li><a href="/bbcswebdav/pid-816174-dt-content-rid-8436387_2/xid-8436387_2" target="_blank"><img src="/images/ci/ng/cal_year_event.gif" alt="file">&nbsp;5 diversity-name.pptx</a>

The same happens with grep -E, suggested in the comments:

grep -E "/bbcswebdav/.*_2" file.txt

Therefore my questions are:

  1. What might have gone wrong in my command line input?
  2. What would be a better regex (or a better approach in general)?

Thank you.

Ian Hsiao
  • `grep` uses basic regular expressions by default, `()` has no special meaning. Use `grep -E` for extended regular expressions. – Barmar Feb 23 '22 at 04:24
  • But there's no need to use a capture group in the first place. Also, you don't have to escape slash in `grep`. – Barmar Feb 23 '22 at 04:25
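  • @Barmar Isn't `()` used to extract the desired section of the output? If I use `grep "/bbcswebdav/.*_2" file.txt`, the result will be super messy, in my case `<li><a href="/bbcswebdav/pid-816174-dt-content-rid-8436387_2/xid-8436387_2" target="_blank"><img src="/images/ci/ng/cal_year_event.gif" alt="file">&nbsp;5 diversity-name.pptx</a>` – Ian Hsiao Feb 23 '22 at 04:44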
  • So, does it work for `grep -E` ? – Jerry Jeremiah Feb 23 '22 at 05:18
  • @JerryJeremiah It worked with `grep -E`, but not as expected (the one displayed in the second image) – Ian Hsiao Feb 23 '22 at 05:26
  • @IanHsiao if you want to extract only the text matched by the regex, use the `-o` option as well. By default, `grep` displays the whole matching line. Also, I'd suggest using `/bbcswebdav/[^"]*_2` to avoid wrong results if `_2` appears later in the line. Or better yet, use an HTML/XML parsing tool instead of regex. – Sundeep Feb 23 '22 at 06:07
  • @Sundeep yes, this is related. Thanks! – Ian Hsiao Feb 23 '22 at 10:17
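
Pulling the comments together, here is a minimal sketch of what they suggest. The sample line is the one from the question; the temporary file and the variable names `tmp`, `bre_count`, and `match` are only for illustration. It assumes a grep that supports `-o` (GNU and BSD grep both do).

```shell
# Recreate the sample line from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<li><a href="/bbcswebdav/pid-816174-dt-content-rid-8436387_2/xid-8436387_2" target="_blank"><img src="/images/ci/ng/cal_year_event.gif" alt="file">&nbsp;5 diversity-name.pptx</a>
EOF

# Default (basic) regex: ( and ) are literal characters, so this matches nothing.
# grep exits non-zero when there is no match, hence the "|| true".
bre_count=$(grep -c "(/bbcswebdav/.*_2)" "$tmp" || true)

# -o prints only the matched text instead of the whole line.
# [^"]* stops the match at the closing quote of the href value.
match=$(grep -o '/bbcswebdav/[^"]*_2' "$tmp")

printf 'lines matched with literal parentheses: %s\n' "$bre_count"
printf 'extracted: %s\n' "$match"

rm -f "$tmp"
```

With this input, the first count should be 0 and the second command should print only /bbcswebdav/pid-816174-dt-content-rid-8436387_2/xid-8436387_2. For anything beyond a one-off extraction, an HTML parser is likely the more robust route, as Sundeep notes.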