4

I need a simple Unix utility that lets me search within files quickly, with basic parameters (this folder, not this kind of file...).

Currently I use a hand-crafted find function with grep and many parameters. It's fast enough on small folders. The problem is that I mainly work with one folder that contains about 300k files, and there it's too slow.

What I'm looking for is a small tool that would index the content of the files in this directory (text files) on demand and let me search within this index (and, of course, display the relevant content).

In other words, I'm looking for a CLI equivalent of Agent Ransack for Unix systems.

I would like, if possible, not to have to install much. Sphinx, for example, is too much of a hassle; I need a lightweight alternative.

Thanks for your suggestions.

kevin

3 Answers

1

Before you set up something more complex, I have to ask: have you already tried ack? It's like grep, but designed to address grep's shortcomings; for example, ack automatically searches only text files and skips binaries.

See the ack homepage (if it's up and running; right now it doesn't seem to work for me), or install it via a package manager if your distro has it available, and give it a whirl.

Some version of the ack homepage seems to be in Google's cache, too.
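As a rough illustration of the "this folder, not that one" style of search from the question, an ack call could be wrapped like this. The function name `ack_files` and its argument order are made up for this sketch; `-l`, `-i` and `--ignore-dir` are standard ack options. On some distros the binary is installed as `ack-grep` instead of `ack`, in which case the command name needs adjusting.

```shell
# Hypothetical helper (not part of ack): list the files under DIR whose
# content matches PATTERN, skipping every directory named SKIP_DIR.
# Usage: ack_files PATTERN DIR SKIP_DIR
ack_files() {
    # -l                 print only the names of files that contain a match
    # -i                 match case-insensitively
    # --ignore-dir=NAME  do not descend into directories called NAME
    ack -l -i --ignore-dir="$3" "$1" "$2"
}
```

For example, `ack_files 'connection refused' /path/to/big/folder cache` would print one matching file per line while never entering any `cache` directory (the path and directory name here are placeholders).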

Janne Pikkarainen
  • Somehow ack-grep is much faster than the find+grep combination. It was not easy to find the source, but I was able to download a standalone version from the Google cache: http://webcache.googleusercontent.com/search?q=cache:cgRJDN5UxvoJ:betterthangrep.com/ack-standalone+ack+ack-standalone+1.94 – kevin Aug 05 '11 at 13:21
  • See, I told you! :) It's intelligent while it searches for files and it shows. – Janne Pikkarainen Aug 05 '11 at 13:22
  • It's a great recommendation, but it's still not fast enough: it took 8m27.187s to search for something in my folders (vs 11m19.761s for find+grep). – kevin Aug 05 '11 at 13:45
1

locate (or a workalike) comes with many Linux systems. Its database is typically rebuilt once a day by a scheduled filesystem scan, so if you are not looking for a real-time solution, this might be the tool for you.

My Fedora workstation and CentOS servers come with mlocate, but there are several other flavours as well.
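Rather than waiting for the daily system-wide scan, mlocate can also build a private database for a single directory on demand. The sketch below assumes the mlocate flavour of `updatedb` (other flavours use different options), and the function names are invented for illustration. Note that locate matches against file names and paths only, not file contents, so this helps narrow down which files to grep rather than replacing the content search.

```shell
# Hypothetical helpers, assuming mlocate's updatedb and locate.

# build_name_index DIR DB: index the file names under DIR into the file DB.
build_name_index() {
    # -l 0  clear the "require visibility" flag, which mlocate needs in
    #       order to let a non-root user create a database
    # -U    scan only the subtree rooted at this (absolute) path
    # -o    write the database to this file instead of the system default
    updatedb -l 0 -U "$1" -o "$2"
}

# search_name_index DB PATTERN: print the indexed paths that match PATTERN.
search_name_index() {
    # -d  read this database instead of the system-wide one
    # -i  ignore case when matching
    locate -d "$1" -i "$2"
}
```

A typical session would be `build_name_index /path/to/big/folder ~/big.db` once, followed by as many `search_name_index ~/big.db report` calls as needed (both paths are placeholders).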

Sgaduuw
0

It depends on how much time equates to "too much of a hassle", as you are going to be looking for either a runtime solution or something that will mine each file and construct a database from the results.

ack-grep, as suggested by Janne Pikkarainen, looks like a useful tool in the former category.

tracker (see website) is worth looking at as a not-necessarily-global desktop search with CLI tools, but it has weird query syntax (at least to my eye), e.g.:

$> tracker-sparql -q "SELECT nie:url(?f) WHERE { ?f fts:match 'red OR blue yellow' }"

recoll looks like it might have a more understandable search syntax and be more customisable than tracker. However, the CLI tool isn't built by default. Interestingly, you can also build a Python API.

This article on linux.com is interesting.
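To make the second category (mine each file, build a database, query the database) concrete, here is a minimal sketch using only find, awk and sort. It is not how tracker or recoll work internally; the function names, the tab-separated index format and the word definition (runs of letters, digits and underscores, lowercased) are all assumptions chosen for brevity. It also assumes the directory holds only text files, as in the question, and that paths contain no tabs or newlines.

```shell
# Hypothetical helpers for a crude word -> file index.

# build_index DIR INDEX: write one "word<TAB>path" line for every distinct
# lowercased word in every regular file under DIR. Keep INDEX outside DIR
# so the index does not end up indexing itself.
build_index() {
    find "$1" -type f -exec awk '
        {
            # Split the lowercased line on anything that is not a word character.
            n = split(tolower($0), words, /[^a-z0-9_]+/)
            for (i = 1; i <= n; i++)
                if (words[i] != "")
                    print words[i] "\t" FILENAME
        }' {} + | LC_ALL=C sort -u > "$2"
}

# search_index INDEX WORD: print every path whose content contains WORD
# as a whole word (case-insensitive).
search_index() {
    word=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    # Appending "" forces a string comparison, so "007" does not match "7".
    awk -F '\t' -v w="$word" '($1 "") == (w "") { print $2 }' "$1"
}
```

After a one-off `build_index /path/to/big/folder ~/words.tsv`, each `search_index ~/words.tsv timeout` reads a single index file instead of reopening every file in the tree. The index has to be rebuilt when files change, it only answers whole-word queries, and whether it is fast enough for 300k files is something to measure rather than assume.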

rorycl