
I have the following files, from which I would like to match a specific subset:

TS_1.zip
The one.zip
Linux Mirror.zip
Linux.Mirror.zip
LinuxWindows1.zip
LinuxWindows2.zip
LinuxWindows3.zip
LinuxWindows4.zip
Linux.Windows.zip
TS_1.xls
The one.txt

The regular expression I am using is:

Linux(?=Windows).*\.zip/g

I also attempted to use the following; however, it does not match any of the file names:

Linux(?=\bWindows).*\.zip/g

I would have thought \b matches the word 'Windows'.

nhahtdh
PeanutsMonkey
  • What is the tool that you are using? And what do you want to match? – nhahtdh May 16 '13 at 20:32
  • @nhahtdh - I am using http://regex101.com/ as a tool to see if my pattern matches. I am attempting to match LinuxWindows1-4. – PeanutsMonkey May 16 '13 at 20:34
  • Not the regex tester. I am asking about the tool you will actually run the regex in. The regex tester may have features that your tool lacks. – nhahtdh May 16 '13 at 20:35
  • @nhahtdh - I am not using a specific tool as yet because I would like to see if my expression makes sense. If it does, I would use it on the `bash` shell – PeanutsMonkey May 16 '13 at 20:38
  • Bash regex doesn't have look-ahead. That's why I ask. `bash` regex has far fewer features than what you see at regex101 – nhahtdh May 16 '13 at 20:41

1 Answer


There is no word boundary between the x in Linux and the W in Windows. A word boundary is the position where a word character (a letter, digit, or underscore) meets a non-word character or the start or end of the input. Both x and W are word characters, so there is no transition at that position. The regex engine has no way of knowing that Linux and Windows are two different words.
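To make this concrete, here is a minimal sketch in Python. Python's `re` module is an assumption here, since the question does not settle on a tool, and `boundary_positions` is a hypothetical helper written for this illustration. It lists every position where `\b` matches in two of the file names, then tries both patterns from the question (without the trailing `/g`, which is a flag from the regex tester rather than part of the pattern).

```python
import re


def boundary_positions(text):
    """Return every index in text where \\b (a word boundary) matches."""
    return [m.start() for m in re.finditer(r"\b", text)]


# No boundary is reported between "Linux" and "Windows" (index 5),
# because x and W are both word characters.
print(boundary_positions("LinuxWindows1.zip"))

# Here the dot at index 5 is a non-word character, so there is a
# boundary on each side of it.
print(boundary_positions("Linux.Windows.zip"))

with_boundary = re.compile(r"Linux(?=\bWindows).*\.zip")
without_boundary = re.compile(r"Linux(?=Windows).*\.zip")

# The \b version fails because no boundary exists right after "Linux".
print(with_boundary.search("LinuxWindows1.zip"))

# Dropping \b lets the look-ahead succeed.
print(without_boundary.search("LinuxWindows1.zip"))
```

Note that `Linux(?=\bWindows)` cannot match any input at all: the look-ahead requires Windows to follow Linux immediately, and at that position x and W are always adjacent word characters.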

The regex should be as simple as:

LinuxWindows.*\.zip

But since I don't know the tool or how you read the input, I can't say whether it will behave correctly. Depending on the tool, it might also return a match for OtherTextLinuxWindows1.zip, which may not be what you want.
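The difference between an unanchored search and a whole-name match can be sketched as follows. This again assumes Python's `re` module purely for illustration; other tools expose the same distinction in different ways, for example through `^` and `$` anchors. The list `FILES` is the set of names from the question plus the hypothetical `OtherTextLinuxWindows1.zip`.

```python
import re

# File names from the question, plus one extra name with leading text.
FILES = [
    "TS_1.zip",
    "The one.zip",
    "Linux Mirror.zip",
    "Linux.Mirror.zip",
    "LinuxWindows1.zip",
    "LinuxWindows2.zip",
    "LinuxWindows3.zip",
    "LinuxWindows4.zip",
    "Linux.Windows.zip",
    "TS_1.xls",
    "The one.txt",
    "OtherTextLinuxWindows1.zip",
]

PATTERN = re.compile(r"LinuxWindows.*\.zip")

# search() accepts a match anywhere inside the name.
unanchored = [name for name in FILES if PATTERN.search(name)]

# fullmatch() requires the pattern to cover the whole name.
anchored = [name for name in FILES if PATTERN.fullmatch(name)]

print(unanchored)
print(anchored)
```

With `search`, the extra name is selected along with LinuxWindows1-4; with `fullmatch`, only the four intended names remain.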

nhahtdh
  • It's probably me not understanding word boundaries but I would have thought that `LinuxWindows` is a single word. – PeanutsMonkey May 16 '13 at 20:35
  • Could you give me simple examples of how I can use word boundaries, as I can't make heads or tails of it? – PeanutsMonkey May 16 '13 at 20:41
  • @PeanutsMonkey: You don't need it in this case. That's all I can say. You can search around SO for examples. – nhahtdh May 16 '13 at 20:42
  • I know I don't need it but am attempting to learn how to use it. I did have a look around but they don't make sense to me. Could we have a quick chat? – PeanutsMonkey May 16 '13 at 20:43
  • @PeanutsMonkey: I just went through some of my answers that deal with word boundaries: [Q1](http://stackoverflow.com/questions/14718748/with-word-boundaries-b-in-regex-do-i-need-to-have-it-before-and-after-the-wor) [Q2](http://stackoverflow.com/questions/12032669/parsing-string-using-the-scanner-class) [Q3](http://stackoverflow.com/questions/15401764/regex-preg-match-all-match-all-pattern) [Q4](http://stackoverflow.com/questions/15130309/how-to-use-regex-in-string-contains-method-in-java) I recommend questions 1 and 4, since the examples there are quite clear. – nhahtdh May 16 '13 at 20:50