I'm looking for a tool to find a line containing a searched text inside a 4 GB file.
- Is this for a one-off search or will you have to do it repeatedly? Is the search string always the same or can it vary? – gm3dmo Jan 08 '10 at 18:27
- One search only – Paul Jan 08 '10 at 18:31
- Baregrep is old but still useful: http://www.baremetalsoft.com/baregrep/index.php – endo64 Apr 05 '21 at 19:11
8 Answers
If you have PowerShell installed, you could use
select-string -pattern <your_string> -path <path_to_file>
It probably won't be fast, but it shouldn't choke the way find or findstr likely will.
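For example, a minimal sketch (the search string and path are placeholders): -SimpleMatch treats the pattern as a literal string rather than a regex, and Select-String processes the file line by line, so memory use should stay bounded even on a 4 GB file.
Select-String -SimpleMatch -Pattern "needle" -Path C:\logs\big.log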

- Tried PowerShell on a hash password list (25 GB), but stopped it after a couple of minutes. Ended up using bareTail to instantly scroll to the first bytes of the hash and page-scroll from there. – Volodymyr Kotylo Nov 19 '20 at 21:33
On *nix, you could also use split to break the file into smaller pieces, then search them with whatever you like, be it grep, awk, ...
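A minimal sketch (the file names and the one-million-line chunk size are placeholders); note that grep -n reports line numbers relative to each chunk, not the original file:
split -l 1000000 huge.txt chunk_
grep -Fn "needle" chunk_*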

Not free, but if this is a text file, then BareTailPro might do the job. I have used it to search for text in log files that were too large to fit in memory. One advantage is that it does not just show the matches, but lets you jump to their position in the file, so you can see the lines before and after the hits.

If you do this often enough, and the file is broken into logical lines, you could load it into Splunk and search from there. It will index the data for you, which makes searches quick(er).
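If you go that route, a rough sketch using Splunk's CLI (the path and index name are placeholders, and the exact commands may differ by version, so check the docs): one-shot indexing of the file, then a search scoped to that index.
splunk add oneshot /data/huge.txt -index main
splunk search 'index=main needle'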

I'd use grep on *nix, and I'd use a higher-end text editor (e.g. Notepad++) on Windows.
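On *nix, something like the following (hypothetical search string and file name) prints each matching line with its line number; -F treats the search text as a literal string, and grep streams the file, so the 4 GB size poses no memory problem:
grep -Fn "needle" bigfile.txt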

- On Windows you can use Foxe to open huge files: http://www.firstobject.com/dn_editor.htm Don't be confused that it is named an XML editor; it is also a text editor. Very fast. – endo64 Feb 03 '21 at 16:46
It depends on the application, the response time needed, and what you're willing to do to meet those goals.
Recently, I was working with a 10+ GB, 50+ million line text file and needed to search for specific strings in each line. The standard Unix tool grep did the trick, but took an unacceptably long time (multiple minutes). I imported the text into a PostgreSQL DB (it was a CSV file, easily imported), and once it was indexed on the key I needed to search on, it took under 1 second to find my record.
Granted, my workstation is single-core, with only 4 GB RAM, a 4-year-old 2 GHz CPU, and a top-heavy filesystem (ZFS) using 5+ year-old consumer PATA drives. Your mileage will certainly vary. Still, the time difference between the two methods is staggering.
If your data is free-form text, you might still consider importing it into a DB that supports full-text search, with indexes built to support such searches.
Even if you have a fast machine and enough RAM to cache the entire file, a linear search over a file this size will be slow, depending (once again) on the application.
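A rough sketch of the CSV-import approach in psql (the table, column, and file names are all hypothetical): create a table, bulk-load the CSV, index the search key, then query it.
CREATE TABLE records (key text, payload text);
\copy records FROM 'bigfile.csv' WITH (FORMAT csv)
CREATE INDEX ON records (key);
SELECT * FROM records WHERE key = 'needle';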

At the Windows CMD prompt, there are two built-in commands, FIND and FINDSTR. They will probably choke on a file that size or be very slow, but you already have them. Type help findstr and help find for documentation.
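For instance (hypothetical search string and file name), /c: makes FINDSTR treat the text as a literal string instead of a regular expression, and /n prefixes each matching line with its line number:
findstr /n /c:"needle" bigfile.txt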
