
Currently when I have to search for complex patterns in code, I typically use a combination of find and grep in the form:

find / \( -type f -regextype posix-extended -regex  '.*python3.*py' \) -exec grep -EliI '\b__[[:alnum:]]*_\b' {} \; -exec cat {} \; > ~/python.py

While this looks like a lot to type, it's actually quick if you use zsh: I just type f (the first character) and jump directly to this command in my command history. Furthermore, the regex syntax in find/grep is standardized and well tested, so there are no surprises or missed matches.

ripgrep, ag, etc. are newer tools, which might not be supported a few years down the line if the original maintainer loses interest.

  1. Is there any plan to bring the .gitignore handling or optimizations from ag/ack/rg into grep or another version of grep? Is there a reason why these optimizations were not, or will not be, included in grep?

  2. For those of you who switched over: did you find it worthwhile to switch to rg/ag/ack, given that these tools have a learning curve as well?

alpha_989
  • 3
    I started with ack and there's not much of a learning curve. You just do `ack '\b__[[:alnum:]]*_\b'` (or `ack --python '\b__[[:alnum:]]*_\b'` to restrict the search to python files). – melpomene Sep 23 '17 at 20:47
  • 2
    `ack` is easy to use: the main complication/feature relative to `grep` is that it uses perl regular expressions instead of POSIX ones. Another difference is that, while `grep` is an excellent general purpose tool, `ack` is specialized to serve the needs of programmers. – John1024 Sep 23 '17 at 21:08
  • Thanks for your comments. The regex I use now is relatively simple, but every week it seems I need more complex regex searches. I have also heard of grep -P, but I'm not sure whether something similar exists for find. Do you think PCRE is better than POSIX regex in terms of the complexity of searches it enables, if I need them in the future? – alpha_989 Sep 23 '17 at 21:23
  • I have to search code as well as files containing various types of data (tif/csv/txt/proprietary file types) all the time. So another argument against ack/ag/rg, for me, is that I should learn a few tools well rather than many different tools. – alpha_989 Sep 23 '17 at 21:24
  • The one thing missing in the above search is that ack/ag ignore .git and/or understand .gitignore. I can exclude .git with a command-line switch to find, but I'm not sure how I would tell find/grep to skip files listed in .gitignore files. – alpha_989 Sep 23 '17 at 21:30
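
One way to get .gitignore-aware searching without leaving grep is to let git enumerate the files and hand them to grep. The following is only a sketch, assuming the search runs inside a git work tree and that grep and xargs support the flags shown (GNU versions do); the function name `gitgrep_files` is made up for illustration.

```shell
# Hypothetical helper: list files whose contents match an extended regex,
# skipping anything excluded by .gitignore (the .git directory is never listed).
gitgrep_files() {
    pattern=$1
    # --cached: tracked files; --others: untracked files;
    # --exclude-standard: apply .gitignore, .git/info/exclude and the global excludes file.
    # -z / -0 keep file names containing spaces or newlines intact.
    git ls-files -z --cached --others --exclude-standard |
        xargs -0 grep -EliI -e "$pattern" --
}
```

For example, `gitgrep_files '\b__[[:alnum:]]*_\b'` run from the top of a repository prints the matching file names, one per line. `git grep -EIil --untracked PATTERN` is a built-in alternative that also honors .gitignore.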

1 Answer


Use ag.

The key part of your example: ag -G '.*python3.*py' '\b__[[:alnum:]]*_\b'

Ag is here to stay and uses Perl-compatible regular expressions (PCRE), which are far more flexible than POSIX basic or extended regular expressions. grep -P uses PCRE as well, so it is akin to using ag, only without some of the latter's more modern features. Likewise, ack is like ag but slower (though admittedly it has a few more bells and whistles). Ag's file-name regex filtering (the -G flag shown above) and built-in file type filters (e.g. --python) are very handy. The recently renamed .ignore file also allows finer tuning.

Most modern scripting languages (Perl, Python, Ruby) either have PCRE or handle regexes with similar features, and many other languages have near-equivalent feature sets (e.g. java.util.regex for Java, Boost.Regex for C++). I consider this the main reason to switch. Moreover, it is satisfying to unify your programming and command-line skill sets.

From my point of view, ripgrep is ag's main contender because it is faster and has an easy way to add file types. That said, its default regex engine is not as flexible: it supports neither backreferences nor look-arounds. With this in mind, I recommend Ag.
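
To make those two features concrete, here is a small sketch using grep -P, which is PCRE-based as noted above. It assumes GNU grep built with PCRE support; the function names and sample inputs are made up for illustration.

```shell
# Backreference: \1 must repeat whatever group 1 captured,
# so this keeps lines containing an immediately repeated word.
repeated_words() {
    grep -P '\b(\w+) \1\b'
}

# Look-around: print names that start with __ but do not end with __.
# The negative look-behind (?<!__) is checked at the trailing word boundary.
non_dunder_names() {
    grep -oP '\b__\w+\b(?<!__)'
}
```

For instance, `printf 'the the cat\nthe cat\n' | repeated_words` keeps only the first line, and `printf 'self.__init__ and self.__cache\n' | non_dunder_names` prints only `__cache`. Neither pattern can be written in POSIX extended regex without the look-behind, and the first relies on a backreference.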

gregory