I am trying to write a PCRE regex in PHP, specifically for use with preg_replace, that matches the characters making up a text (.txt) file name. From this I will derive the directory of the file.

My initial approach was to anchor on the terminating .txt string and then match every preceding character except / or \, so I ended up with something like:

'/[^\\\\/]*\.txt$/'

But this didn't seem to work at all. I assumed it might be interpreting the negation in De Morgan's form, i.e. (A+B)' <=> A'B'

but after attempting this test:

'/[^\\\\]\|[^/]*\.txt$/'

I got the same result, which made me think that I shouldn't escape the OR operator (|), but that also failed to match. Does anyone know what I'm doing wrong?

Gilles 'SO- stop being evil'
xenador

2 Answers

The following regular expression should work for getting the filename of .txt files:

$regex = "#.*[\\\\/](.*?\.txt)$#";

How it works:

  • .* is greedy and thus forces the rest of the match to start as far to the right as possible.
  • [\\\\/] ensures that there is a \ or / directly in front of the filename.
  • (.*?\.txt) uses non-greedy matching to keep the filename as short as possible, followed by .txt, and captures it into group 1.
  • $ forces the match to end at the end of the string.
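
As a rough sketch of how the pieces above behave together, the snippet below runs the pattern through preg_match. The sample path and the variable names $path, $m and $filename are assumptions made up for illustration; they are not part of the answer.

```php
<?php
// Illustrative sketch only: $path is a made-up sample path.
$regex = "#.*[\\\\/](.*?\.txt)$#";
$path  = 'C:\xampp\htdocs\logs\example.txt';

$filename = null;
if (preg_match($regex, $path, $m) === 1) {
    // $m[0] is the whole match (here the entire path),
    // $m[1] is capture group 1, i.e. the part after the last \ or /.
    $filename = $m[1];
    echo $filename, "\n"; // example.txt
}
```

Note that the whole match in $m[0] spans the entire path, because the leading .* and the trailing $ together cover everything; only group 1 isolates the filename.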
Sebastian Paaske Tørholm
  • Hmm, I seem to be getting false as a result. I suspect it has to do with the same reason mine was failing. I'm thinking the backslashes might be affecting the way the grouping and the implicit OR work. – xenador Jan 16 '11 at 09:42
  • What's the string it's failing on? – Sebastian Paaske Tørholm Jan 16 '11 at 09:44
  • C:\xampp\htdocs\windowz\logs\alkjfaldkjfldskjf.txt (some letters have been replaced, by similar letters in random ordering...no special characters exist in the replaced chars) – xenador Jan 16 '11 at 09:50
  • It matches the filename fine for me. My test script: http://pastie.org/1466074 , the output I get: http://pastie.org/1466077 – Sebastian Paaske Tørholm Jan 16 '11 at 09:57
  • Ah OK, I see what's happening: preg_match does return the filename, but I should've clarified my question a bit. I want the directory that a text file resides in. My approach was to find the filename, then remove it via preg_replace('#.*[\\\\/](.*?\.txt)$#', '', $file) to leave only the directory. When I use the regular expression you provided, I get an empty string back from preg_replace. – xenador Jan 16 '11 at 10:09
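
The empty string described in the last comment is what one would expect: the pattern begins with .* and ends with $, so the match covers the entire path and preg_replace removes all of it. A minimal sketch of one possible alternative follows, assuming the goal is to strip only the trailing filename. It reuses the negated character class from the question but with # as the delimiter, since a bare / inside a /-delimited pattern ends the pattern early. The sample path and the variable names $all and $dir are made up for illustration.

```php
<?php
// Illustrative sketch only: $file is a made-up sample path.
$file = 'C:\xampp\htdocs\logs\example.txt';

// The answer's pattern consumes the whole path, so nothing is left.
$all = preg_replace('#.*[\\\\/](.*?\.txt)$#', '', $file);

// Matching only the trailing filename leaves the directory behind.
// '#' is the delimiter, so the '/' inside the class needs no escaping.
$dir = preg_replace('#[^\\\\/]*\.txt$#', '', $file);

var_dump($all === ''); // bool(true)
echo $dir, "\n";       // C:\xampp\htdocs\logs\
```

The trailing separator stays in $dir with this approach; whether to trim it depends on how the directory is used afterwards.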

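Try this pattern: '/\b(?P<files>[\w.-]+\.txt)\b/mi'

// The hyphen is placed last in the class so it is taken literally;
// PCRE2-based PHP versions can reject [\w-.] as an invalid range.
$PATTERN = '/\b(?P<files>[\w.-]+\.txt)\b/mi';
$subject = 'foo.bar.txt plop foo.bar.txtbaz foo.txt';
preg_match_all($PATTERN, $subject, $matches);
// Named group "files" holds the matched file names: foo.bar.txt and foo.txt
var_dump($matches["files"]);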
Xavier Barbosa