Please note - parsing XML with batch is a risky business because XML generally ignores white space. Any script you write could probably be broken by simply reformatting the XML into another equivalent valid form. That being said...
I haven't traced the problem through to fully explain your observed behavior, but the unbalanced quote is causing a problem with this line:
(for /f "delims=<AAA></AAA> tokens=2" %%a in ('echo "%%z" ^| Findstr /r "<AAA>"') do (
You can eliminate that problem and get your code to sort of work by stripping out any quotes beforehand:
@echo off
setlocal enabledelayedexpansion
del result.txt
for /f "delims=;" %%f in ('dir /b D:\depart\*.xml') do (
for /f "usebackq delims=;" %%z in ("D:\depart\%%f") do (
set code=%%z
set code=!code:"=!
set code=!code: =!
(for /f "delims=<AAA></AAA> tokens=2" %%a in ('echo "!code!" ^| Findstr /r "<AAA>"') do (
echo %%a
)) >> result.txt
)
)
But you have a potentially major problem. DELIMS does not specify a string; it specifies a list of characters. So your DELIMS=<AAA></AAA> is equivalent to DELIMS=<>/A. If your element value ever has an A or / in it, then your code will fail.
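Here is a small sketch of that failure mode. The sample line is hypothetical (hard-coded rather than read from your files), and its value deliberately contains both an A and a /.

```batch
@echo off
rem Hypothetical sample line; the element value contains both "A" and "/".
rem DELIMS is a set of characters, so "<AAA></AAA>" behaves like "<>/A".
for /f "delims=<AAA></AAA> tokens=1" %%a in ("<AAA>CAB/12</AAA>") do echo Character-set split: %%a

rem Splitting on only the angle brackets keeps the value intact (token 2).
for /f "delims=<> tokens=2" %%a in ("<AAA>CAB/12</AAA>") do echo Angle-bracket split: %%a
```

The first loop should print only C, because the A that follows is treated as a delimiter. The second loop should print the full CAB/12.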
There is a much better way:
First off, you can use FINDSTR to collect all your <AAA>----</AAA> lines from all files in one pass, without any loop:
findstr /r "<AAA>.*</AAA>" "D:\depart\*.xml"
Each matching line will be output as the file path, followed by a colon, followed by the matching line, as in:
D:\depart\file_1.xml:<AAA>C086002-T1111</AAA>
The file path can never contain < or >, so you can use the following to iterate the result, capturing the appropriate token:
for /f "delims=<> tokens=3" %%A in ( ...
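To see why token 3 is the one you want, here is a minimal sketch that splits a single hard-coded line shaped like the FINDSTR output above. The line itself is just the sample from this answer, not real output from your files.

```batch
@echo off
rem Hypothetical FINDSTR output line, hard-coded for illustration.
rem With DELIMS=<> the line splits into:
rem   token 1 = file path plus trailing colon
rem   token 2 = the tag name AAA
rem   token 3 = the element value
for /f "delims=<> tokens=1-3" %%A in ("D:\depart\file_1.xml:<AAA>C086002-T1111</AAA>") do (
  echo token 1: %%A
  echo token 2: %%B
  echo token 3: %%C
)
```

If the XML line is indented, I would expect the indentation to end up at the end of token 1, after the colon, so it should not affect token 3.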
Finally, you can put parentheses around the entire loop, and redirect just once. I'm assuming you want each run to create a new file, so I use > instead of >>.
@echo off
setlocal enabledelayedexpansion
>result.txt (
for /f "delims=<> tokens=3" %%A in (
'findstr /r "<AAA>.*</AAA>" "D:\depart\*.xml"'
) do (
set code=%%A
set code=!code:"=!
set code=!code: =!
echo(!code!
)
)
Assuming that you only need to trim leading or trailing spaces/quotes, then the solution is even simpler. It does require odd syntax to specify a quote as a DELIM character. Note that there are two spaces between the last ^ and %%B. The first escaped space is taken as a DELIM character. The unescaped space terminates the FOR /F options string.
@echo off
>result.txt (
for /f "delims=<> tokens=3" %%A in (
'findstr /r "<AAA>.*</AAA>" "D:\depart\*.xml"'
) do for /f delims^=^"^  %%B in ("%%A") do echo(%%B
)
UPDATE in response to comment
I'm assuming your data value will never contain a colon.
If you want to append the source file name to each line of output, then you simply need to alter the first FOR /F to capture the first token (the source file) as well as the third token (the data value). The first token will contain the full path as well as a trailing colon. The second FOR /F appends the file name to the data value, using the ~nx modifier to get just the name and extension (no drive or path), and a colon is added to the DELIMS option so the trailing colon is trimmed off.
@echo off
>result.txt (
for /f "delims=<> tokens=1,3" %%A in (
'findstr /r "<AAA>.*</AAA>" "D:\depart\*.xml"'
) do for /f delims^=:^"^  %%C in ("%%B;%%~nxA") do echo %%C
)