
I was recently looking for command-line tools that search quickly, and I came across several. As I understand it, Ag is reportedly faster than grep, ack, and sift, with grep being the slowest.

Now I have a file containing 300,000 strings, and I want to find the strings that contain a specific substring and return them.

time grep 'substring' file.txt

real    0m0.030s
user    0m0.009s
sys     0m0.008s


time ag 'substring' file.txt ----> 5 secs

real    0m0.083s
user    0m0.038s
sys     0m0.014s

Am I doing something wrong, or is ag not meant to be used the way I am trying to use it?

RetroCode
  • One key consideration here is that these two command-line invocations don't actually do the same thing. The `ag` invocation does case-insensitive matching *and* counts lines; the `grep` one does neither. – Thomas Orozco Sep 25 '16 at 22:21
  • @ThomasOrozco But ag is described as a "faster alternative to grep". If ag does something different from grep, then what is ag faster at? – RetroCode Sep 25 '16 at 22:36
  • Ag is designed for searching code bases. It's faster than grep there because it's more efficient at locating files that should be searched and processing them in aggregate, etc. In other words, grep and Ag are different tools meant for different purposes. The contrived example you provided is simply one where grep shines and Ag doesn't. There's a lot to be said about this topic, and there really isn't a short answer. [This blog post](http://blog.burntsushi.net/ripgrep) (which focuses on another tool: `rg`) is a good investigation of the complexity of determining what "faster" means. – Thomas Orozco Sep 25 '16 at 23:01
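
To illustrate the point made in the comments about the two invocations doing different work, here is a minimal sketch of a like-for-like setup. The sample file and its contents are hypothetical. It assumes that ag's `-s` (case-sensitive) and `--nonumbers` options behave as described in its man page, and the ag step is skipped when ag is not installed.

```shell
# Build a small, hypothetical sample file so the sketch is self-contained.
file=$(mktemp)
i=1
while [ "$i" -le 1000 ]; do
    echo "line $i some text"
    i=$((i + 1))
done > "$file"
echo "a line with substring inside" >> "$file"
echo "a line with SUBSTRING inside" >> "$file"

# grep as used in the question: case-sensitive, no line numbers.
plain_count=$(grep -c 'substring' "$file")

# grep doing roughly what ag does by default for a lowercase pattern:
# case-insensitive matching (-i) with line numbers printed (-n).
matched_count=$(grep -n -i 'substring' "$file" | wc -l | tr -d ' ')

echo "case-sensitive matches:   $plain_count"
echo "case-insensitive matches: $matched_count"

# Or go the other way and make ag behave like plain grep
# (assumed flags: -s = case-sensitive, --nonumbers = no line numbers).
if command -v ag >/dev/null 2>&1; then
    ag -s --nonumbers 'substring' "$file"
fi

rm -f "$file"
```

Prefixing the `grep -n -i` and `ag -s --nonumbers` commands with `time` on the real `file.txt` should give a fairer comparison than the one in the question, although it says nothing about how the tools compare on a whole code base.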

1 Answer


Grep is really efficient. Even if Ag is faster on one system, the result comes down to which package and distribution you are using.

Thus, if you are using a 64-bit build of grep (from a Cygwin package, for example), it can use more of the system's memory. It could be that your Ag package uses fewer resources.

I would recommend using the parallel command, which lets you specify how many processes to run per core and can speed things up.
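
As a rough sketch of what that could look like, assuming "the parallel command" means GNU parallel: `--pipepart` splits the file into blocks and feeds each block to its own grep process, and `-j` sets the number of concurrent jobs. The sample file, block size, and job count below are hypothetical, and the sketch falls back to plain grep when GNU parallel is not available.

```shell
# Hypothetical sample file: every 100th row contains the substring.
file=$(mktemp)
i=1
while [ "$i" -le 5000 ]; do
    if [ $((i % 100)) -eq 0 ]; then
        echo "row $i has substring"
    else
        echo "row $i"
    fi
    i=$((i + 1))
done > "$file"

# Use GNU parallel if present (other tools also install a "parallel" binary).
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    # Split the file into 1 MB blocks and run up to 2 grep jobs at once.
    par_count=$(parallel --pipepart -a "$file" --block 1M -j 2 \
        grep -F 'substring' | wc -l | tr -d ' ')
else
    # Fallback: a single grep process over the whole file.
    par_count=$(grep -F 'substring' "$file" | wc -l | tr -d ' ')
fi

echo "matching lines: $par_count"
rm -f "$file"
```

On a file that grep already searches in about 30 ms, the cost of starting extra processes may well outweigh any gain, so this is worth timing before relying on it.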

DomainsFeatured