
I'm working with a bunch (~2000) of .csproj files, and on this development staff there's a historical precedent for embedding xcopy calls in the post-build events to move things around during the build process. In order to get build knowledge into one place, I'm working towards eradicating these xcopy calls in favor of declarative build actions in our automated build process.

With that in mind, I'm trying to come up with a regex I can use to chop out the path arguments supplied to xcopy. The statements come in a couple flavors:

xcopy /F /I /R /E /Y "..\..\..\Microsoft\Enterprise Library\3.1\bin"
xcopy /F /I /R /E /Y ..\Crm\*.* .\
xcopy ..\NUnit ..\..\..\output\debug /I /Y

specifically:

  • unpredictable placement of switches
  • destination path argument not always supplied
  • path arguments sometimes wrapped in quotes

I'm no regex wizard, but this is what I've got so far (the excessive use of parentheses is for match saving in PowerShell):

(.*x?copy.* '"?)([^ /'"]+)('"/.* '"?)([^ /'"]+)('"?.*)

The ([^ /'"]+) sections are the parts I intend to be the path arguments, defined as strings containing no quotes, spaces, or forward slashes, but I have a feeling I'll have to apply two regexes (one for quote-wrapped paths with spaces and one for unquoted paths).

Unfortunately, when I run this regex it seems to give me the same match for both the first and second path arguments. Most frustrating.

How would I change this to correct it?

bwerks
  • What do you want to match exactly? Just the path, or do you want to split every xcopy statement into three separate matches (xcopy, parameters, path)? – Emiliano Poggi May 13 '11 at 17:18
  • I'm mainly concerned with updating the path arguments, since they must be maintained when things move around (they're all expressed as relative paths). – bwerks May 18 '11 at 17:31

2 Answers


I don't think you need two different patterns to match the paths.

The following pattern should match each single statement in all three cases you have provided:

\A(xcopy)\s+([\/A-Z\s]*)\s*((".*?")|([^\s]*))\s*((".*?")|([^\s]*))\s*([\/A-Z\s]*)

I've used alternation (|) so that each path argument can match either as a quoted string or as an unquoted run of non-whitespace characters.

NOTE: I don't have Windows at hand at the moment, so I tested this pattern in Ruby on Linux; the syntax should not differ, or at least it should give you an idea.
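As a rough sketch of how the pattern above might be applied from PowerShell (the variable and property names here are illustrative, not part of the answer): going by the pattern's parentheses, capture group 3 should hold the first path and group 6 the second. The sketch uses `-cmatch` rather than `-match`, since `-match` is case-insensitive by default and would let `[\/A-Z\s]*` consume lowercase path text as well. It assumes PowerShell 3 or later for `[pscustomobject]`.

```powershell
# Pattern copied from the answer; single quotes stop PowerShell from expanding anything in it.
$pattern = '\A(xcopy)\s+([\/A-Z\s]*)\s*((".*?")|([^\s]*))\s*((".*?")|([^\s]*))\s*([\/A-Z\s]*)'

# The three sample statements from the question.
$samples = 'xcopy /F /I /R /E /Y "..\..\..\Microsoft\Enterprise Library\3.1\bin"',
    'xcopy /F /I /R /E /Y ..\Crm\*.* .\',
    'xcopy ..\NUnit ..\..\..\output\debug /I /Y'

$results = foreach ($line in $samples) {
    # -cmatch is the case-sensitive variant of -match; captures land in $Matches.
    if ($line -cmatch $pattern) {
        [pscustomobject]@{
            Source      = $Matches[3]   # first path argument (quotes included, if any)
            Destination = $Matches[6]   # second path argument; empty when none was supplied
        }
    }
}

$results | Format-Table -AutoSize
```

One caveat with this approach: an unquoted path that begins with an uppercase letter would be partly swallowed by the switch group, so lines like that would need a stricter switch pattern.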

Emiliano Poggi

In cases like this, I like to leverage PowerShell's argument parsing system. Use a simple regex to grab the whole xcopy line and then run it through a function.

$samples = 'xcopy /F /I /R /E /Y "..\..\..\Microsoft\Enterprise Library\3.1\bin"',
    'xcopy /F /I /R /E /Y ..\Crm\*.* .\',
    'xcopy ..\NUnit ..\..\..\output\debug /I /Y'

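# Keep only the arguments that are neither switches (/X) nor the command name itself.
function argumentgrinder {
    $args | Where-Object {($_ -notlike "/*") -and ($_ -ne "xcopy")}
}

# Let PowerShell's own parser split each line into arguments, honoring quotes.
$samples | foreach { Invoke-Expression "argumentgrinder $_"}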

You do have to be careful of anything that looks like a PowerShell variable in the paths though ($, @ and parentheses).
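One way to act on that caveat is to screen each line before it reaches Invoke-Expression. The following is only a sketch: `Get-XcopyPaths` is a hypothetical helper name, the character list is an assumption and not exhaustive, and `argumentgrinder` is repeated so the block runs on its own.

```powershell
# Same filter as above: drop switches (/X) and the command name.
function argumentgrinder {
    $args | Where-Object { ($_ -notlike '/*') -and ($_ -ne 'xcopy') }
}

# Hypothetical wrapper: refuse lines that PowerShell might evaluate instead of
# treating as plain text. The character class is an assumed, non-exhaustive list.
function Get-XcopyPaths {
    param([string]$Line)

    if ($Line -match '[$@();|&`]') {
        Write-Warning "Skipping line that contains PowerShell-special characters: $Line"
        return
    }

    Invoke-Expression "argumentgrinder $Line"
}

Get-XcopyPaths 'xcopy /F /I /R /E /Y ..\Crm\*.* .\'
```

Lines that get skipped this way would still need to be handled by hand or with a regex.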

JasonMArcher
  • I kind of like this. So when you pass the string to argumentgrinder, it gets chopped up into the args array for easy consumption, thus relieving me of having to worry about quotes and spaces and ordering...very nice indeed. – bwerks May 18 '11 at 17:34