This is my first post here. It's a great place and it has helped me a lot!

I'm using pcregrep and I want to find files that match my patterns. I keep the regex in a variable, like this:

test="<\?php\n.*. = Array\(('.'=>'.', ){20,}.*\);\nfunction .*\(.*, .*\).*for\(.i=0; .i .*\nreturn base64_decode(.*);}\n.* ('.*'.\n){10,}.*"

and then searching for files:

find . -type f -name "*.php1" -print0 | xargs -0 pcregrep --colour=auto -M "$test"

You can find test.sh (bash file that I'm using) and *.php1 file that I want to match here: http://sendrev.com/stackoverflow/

When I run sh test.sh, the match always stops at this line (I can see where it ends because I'm using --colour=auto):

'yzr3YEca5VacRthC6tGoXNkS2n8S2n8S2n8S2n8S2n8S4wUSFBc9FhGcqtcP6JTk4J0kj5TOMZ0yYmL'. 

(or at some other line, but it never reaches the end of the file)

I can't match any lines beyond that point and I don't understand why. I want to match the last line with something like:

"eval\(.*(.*, .*\)\);\?>$"

or

"\)\);\?>"

but I can't get to that line. If I add "));\?>" to the end of the "test" variable, nothing is found, because the match can't reach the last line.

Important: if there are far fewer lines of the form '.*'. then everything works. It looks like there is some limit that I don't understand.

You can test it on CentOS or any other Linux distribution.

Can you please help me find my mistake? Thanks.

Konrad Krakowiak
  • use `[\s\S]*?` instead of `.*` – Avinash Raj Apr 11 '15 at 02:07
  • Thank you for the support, Raj. I replaced the last two `.*` with `[\s\S]*?` but the result is the same. – Velislav Sendrev Apr 11 '15 at 02:14
  • could you provide an example along with expected output? – Avinash Raj Apr 11 '15 at 02:15
  • My expected output is to see the whole content of the file matched to the end, but the match just stops at the line I mentioned and I can't match anything after it with another regex. My project is here: http://sendrev.com/stackoverflow/ (you can download the two files, .sh and .php1). If you want to test it, I can try to run a VM or a Raspberry Pi for you. Sorry, I forgot to mention: if there are far fewer lines of the form `'.*'.` then everything works. It looks like there is some limit that I don't understand. – Velislav Sendrev Apr 11 '15 at 02:18
  • Your code works fine on my machine, I can find many lines beyond the one you speak of... – ShellFish Apr 11 '15 at 02:50
  • If you want to match lines with another regex, use either groups separated by pipeline symbols `|` (logical or in this context) or just call a bash function upon find's `-exec` and perform multiple greps there. – ShellFish Apr 11 '15 at 02:52
  • @ShellFish how many lines do you find on your machine? I tested it on another machine and the line was different, but the result is the same: I can't get to the last line. My single regex is fine and I don't need more than one, but I can't understand why it stops at that line and doesn't show the whole content of the file to the end. – Velislav Sendrev Apr 11 '15 at 02:53
  • I have 770 matching lines - of which said line is the 307th. – ShellFish Apr 11 '15 at 03:20
  • OK, good, but the .php1 file has 1498 lines. The main problem is that the match can't reach the last line even if `.*` is at the end of the regex. – Velislav Sendrev Apr 11 '15 at 03:28

1 Answer

OK, I found another solution (grep) that handles this kind of search without changing the regex:
find . -type f -name "*.php" -print0 | xargs -0 grep --colour -Pzo "$test1"

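Explanation:
-P interprets the pattern as a Perl-compatible regular expression (PCRE), a powerful extension of regular expressions.
-z treats input and output lines as terminated by a NUL character instead of a newline.
A text file normally contains no NUL characters, so grep sees the whole file as one big line and the newlines become ordinary characters that the pattern can match.
-o prints only the matching part.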
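
The effect of these flags can be tried on a small file. This is only a minimal sketch, assuming GNU grep built with PCRE support; the sample file content and the shortened pattern are made up for illustration and are not the real test.sh or .php1 file:

```shell
# Create a small three-line sample file in a temporary directory
# (hypothetical content, much shorter than the real .php1 file).
tmpdir=$(mktemp -d)
printf '<?php\n$a = 1;\neval(foo($a, $b));?>\n' > "$tmpdir/sample.php1"

# Simplified pattern (an assumption, not the original $test):
# start at "<?php" plus a newline, then lazily match anything,
# newlines included, up to the closing "));?>".
pattern='<\?php\n[\s\S]*?\)\);\?>'

# -z makes grep read the whole file as one record, so \n in the pattern
# can match real newlines. With -z the output is NUL-terminated as well,
# so tr turns the NUL back into a newline for display.
matched=$(find "$tmpdir" -type f -name "*.php1" -print0 \
  | xargs -0 grep -Pzo "$pattern" | tr '\0' '\n')

printf '%s\n' "$matched"
rm -rf "$tmpdir"
```

Because the whole file is a single record here, the match can run from the first line to the last one.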

This works for me, but it's still a mystery why it doesn't work with pcregrep.