
I have a CSV file in my home directory, shown below:

cat try.csv
val1,val2,val3,val4,val5,val6
10-Jul-19,12604876601113439,Self,abs,Tier-I,30088.5
09-Jul-19,12604876601112397,Self,abs,Tier-I,200590
08-Jul-19,12604876601111807,Self,abs,Tier-I,200590
05-Jul-19,12604876601109069,Self,abs,Tier-I,70206.5
29-May-19,12604876601085648,Self,cdf,Tier-I,70206.5
30-Apr-19,12604876601068094,Self,cdf,Tier-I,130383.5
15-Nov-18,12604876600900949,Self,xyz,Tier-I,71209.46
10-Oct-18,12604876600887501,Self,xyz,Tier-I,79233.06

I can use the grep command to extract the rows containing the word 'abs':

grep -w 'abs' try.csv
10-Jul-19,12604876601113439,Self,abs,Tier-I,30088.5
09-Jul-19,12604876601112397,Self,abs,Tier-I,200590
08-Jul-19,12604876601111807,Self,abs,Tier-I,200590
05-Jul-19,12604876601109069,Self,abs,Tier-I,70206.5

However, I came across ripgrep at https://blog.burntsushi.net/ripgrep/, which claims to offer functionality similar to grep while running faster. My actual CSV file is very large (about 30 GB), so I need something faster than grep.

So I installed ripgrep with cargo install ripgrep and ran the command below:

ripgrep -w 'abs' try.csv

But I got the error below:

Command 'ripgrep' not found, did you mean:

  command 'sipgrep' from deb sipgrep
  command 'zipgrep' from deb unzip

Try: apt install <deb name>

Any pointers on how to use ripgrep correctly would be helpful.

Bogaso
  • Please let us know if it is any faster. Good luck. – shellter Jul 21 '19 at 22:42
  • I don't see a huge improvement. But I am open to any other option for faster processing of a 27 GB file. – Bogaso Jul 22 '19 at 04:11
  • If you could get the file reorganized so that you could use either a beginning-of-line or an end-of-line anchor in the regex (`^` or `$`), that might help a little. Or have it delivered in multiple parts so that you can use GNU `parallel` to run multiple greps. You're not going to get much faster than basic grep; it's optimized C code (for 40+ years now). Good luck. – shellter Jul 22 '19 at 13:08
  • ripgrep is faster than basic grep in a lot of scenarios. Please see the benchmarks. However, if your file is 27GB and doesn't all fit into memory, then you will likely just be blocked on the speed of your underlying I/O device. ripgrep cannot magically make that faster. – BurntSushi5 Jul 27 '19 at 19:53

1 Answer


The program is called ripgrep, but its executable is named rg. All you need to do is:

rg -w 'abs' try.csv
matusf
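
As a minimal sketch of the point above (not part of the original answer), the shell function below uses rg when it is found on PATH and falls back to grep otherwise. The function name search_word is hypothetical and chosen only for illustration. If rg is still reported as not found after cargo install ripgrep, one likely cause is that cargo's install directory (by default ~/.cargo/bin) is not on PATH.

```shell
# search_word is a hypothetical helper name, not something ripgrep provides.
# It prints the lines of a file that contain the given whole word.
search_word() {
    word=$1
    file=$2
    if command -v rg >/dev/null 2>&1; then
        # ripgrep's executable is called rg; -w matches whole words only
        rg -w -- "$word" "$file"
    else
        # fall back to grep with the same whole-word flag
        grep -w -- "$word" "$file"
    fi
}
```

With the file from the question, this would be called as search_word abs try.csv.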