I've written the following batch file to create multiple files using a FOR loop:

@echo off
cls
REM IF ERRORLEVEL 0 is true for every non-negative exit code, so test NOT ERRORLEVEL 1.
FOR /L %%i IN (1 1 10) DO (
    echo.> file%%i.txt
    IF NOT ERRORLEVEL 1 echo Successfully created file 'file%%i.txt'.
)
dir /b *.txt
FOR %%i IN (*.txt) DO (
    echo.> file%%i.txt
    IF NOT ERRORLEVEL 1 echo Successfully created file 'file%%i.txt'.
)

Here, the first FOR loop creates 10 files (file1.txt ... file10.txt).
The second FOR loop uses those file names to build the names of the next new files (filefile1.txt.txt ... filefile10.txt.txt).

However, one extra file is also created: filefilefile1.txt.txt.txt
What logical issue causes this extra file to be created?

Satyendra
  • What does it print when you do `FOR %%i IN (*.txt) DO ( echo %%i ) `? Just 10 entries or 11? – adarshr Oct 31 '13 at 11:01
  • If I simply do `FOR %%i IN (*.txt) DO (echo %%i)`, of course, I get the list of files just created in first loop; which is same as what I received in response of `dir *.txt`. – Satyendra Oct 31 '13 at 11:05

3 Answers


EDITED - It seems I did not explain this properly and people could not see how it works. My fault. I'll try to explain it better.

The reason is the way the for command works internally.

When the line for var in (files) is reached, the directory is checked to see whether any files match and need to be processed.

To do this, the for command (really cmd) issues a directory query to enumerate the files. This query returns only the first file in the set. If additional files match the file mask in the for command, a flag is set that tells the caller (cmd) that there are more files to process, but the list of remaining files is not retrieved yet.

When the code inside for reaches the end of an iteration and there are files still pending, a query is sent to get the rest of the list of files that match the for file selection.

The system fills a buffer with the list of remaining files. If at that point the list is short enough to fit completely in the buffer, the query is not repeated. If the list is too long to fit, a partial list is retrieved, and once the files in it have been processed the query is sent again to get more files.

The number of files in the buffer depends on the length of the filenames: shorter filenames mean more files per buffer and fewer queries to the filesystem.

This behaviour (retrieving the remaining list of files once the first file has been processed) only happens if the query for files reports that files are pending. Once a query does not return that flag, no more files are retrieved.
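One way to watch this happen is to print each name as the loop visits it. The following is only a rough sketch, not a definitive test: the scratch directory name `for_enum_demo` is a hypothetical choice, and it assumes `%TEMP%` is writable. With the ten files from the question, the explanation above suggests that on NTFS `filefile1.txt.txt` should appear among the visited names, because it is created before the rest of the list is fetched. The result depends on the filesystem and on how many names fit in the buffer, so no expected output is given. On a FAT volume the loop may run for many more iterations; stop it with Ctrl+C if needed.

```batch
@echo off
setlocal
rem Work in a throwaway directory (hypothetical name) so no real files are touched.
set "scratch=%TEMP%\for_enum_demo"
if exist "%scratch%" rd /s /q "%scratch%"
md "%scratch%"
pushd "%scratch%"

rem Same starting set as in the question: file1.txt ... file10.txt
for /L %%i in (1 1 10) do type nul > "file%%i.txt"

rem Print each name as the loop visits it, then create a new matching file.
for %%f in (*.txt) do (
    echo visiting %%f
    type nul > "file%%f.txt"
)

rem Clean up the scratch directory.
popd
rd /s /q "%scratch%"
endlocal
```

Changing the upper bound of the first loop changes how many names must be fetched, which is a simple way to see the buffer effect described above.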

EXCEPTIONS

On NTFS, files are only included in these requeries if they sort alphabetically after the last file processed by the for command.

On FAT, the query includes every newly generated file that matches the for file selection, regardless of its name. And yes, it can get into an infinite loop. (In my tests, the system buffer retrieved only one filename and requeried on each iteration.) You can try:

break > a.txt && cmd /v:on /c "for %f in (*.txt) do break > !random!.txt"

All my tests were made on Windows 7 64-bit, on an NTFS partition and a FAT32 partition (the latter on a USB drive). I have no way to test other configurations. If anyone sees different behaviour, please comment.

For more information, see the documentation for ZwQueryDirectoryFile.

MC ND
  • I seem to read in your explanation that only one file will be multi-processed, but I am fairly sure that more than one file has been multi-processed in my tests in the past. Do I misunderstand your text? – foxidrive Oct 31 '13 at 12:15
  • No, no file is multi-processed. If, during the first iteration of the for command, new files are generated and these files match the set expression in for, then these files will be included, because the full list to process is not retrieved until the first iteration finishes, and at that moment there are new files that match the set expression. – MC ND Oct 31 '13 at 12:22
  • What I understand is that at the **start of the for loop** the files are enumerated, but only the _first file_ and a has-more-files _flag_ are returned. Later, at the **end of the for loop**, the pending files are retrieved; at this moment a set of new files has already been created. Then why doesn't it consider all these newly created files as well? This would finally get into an _infinite loop_ of creating files! – Satyendra Nov 04 '13 at 06:41
  • When the `for` command starts, a directory query is made to see if there are files to be processed, and the first file name is retrieved. The block of instructions inside `for` runs for the first file. Once the processing of the first file has ended, and only then, only once, and only if initially there was more than one file to be processed, a query is made to retrieve the rest of the file list. If, while the first file is being processed, files are generated that match the `for` file selection, those files are included in the list and processed. – MC ND Nov 04 '13 at 07:15
  • AND if working on NTFS, the query to retrieve the rest of the file list (and remember the file list is retrieved only once, only after the first file is processed, only if initially there was more than one file) will only see new files if they are alphabetically greater than the first file processed by `for`. – MC ND Nov 04 '13 at 07:19
  • If you create a large number of .txt files and then run a for-in-do that adds text to each filename, more than one file will be processed twice. You seem to be saying that the file list is set in stone after the first loop, but this can't be so if more than one file is processed more than once. Add @foxidrive to your reply please so I get notified. – foxidrive Nov 04 '13 at 12:07
  • @foxidrive, I had already thought about it and was testing on NTFS and FAT to complete the answer. You were faster than me :-) Answer edited to reflect it. Thank you. – MC ND Nov 04 '13 at 13:14
  • @Satyendra, please see the updated answer. But yes, you are right. An infinite loop is possible. – MC ND Nov 04 '13 at 13:22
  • @MCND Yes, it's clear to me now how `for` works internally. But, as you said, the buffer holds the **remaining** files to be iterated. So I ran the first loop (to create files) 3 times and then 300 times. The number of extra files created was very different in the two cases! Thanks for the description of the filesystem, it's quite detailed. – Satyendra Nov 04 '13 at 14:02

I don't know why, but when you write ... IN (*.txt) ... in the second for loop, it also picks up files that were just created within the body of the loop.

To eliminate that, I would make my filter a bit more specific.

FOR %%i IN (file??.txt) DO (

I ran this and it creates only 20 files as expected.
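For context, the second loop with the narrower mask might look like the sketch below. It assumes file1.txt ... file10.txt already exist in the current directory, and it swaps the original `IF ERRORLEVEL 0` check for `if not errorlevel 1`, which is a change of mine rather than part of the tested script.

```batch
@echo off
rem Assumes file1.txt ... file10.txt already exist in the current directory.
rem The mask file??.txt covers "file" plus at most two more characters,
rem so the longer names created inside the loop fall outside the mask.
for %%i in (file??.txt) do (
    echo.> "file%%i.txt"
    if not errorlevel 1 echo Successfully created file 'file%%i.txt'.
)
```

The same idea works with any mask that the newly created names cannot match.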

adarshr

As adarshr said, the second FOR loop can also find the newly created files.
You can avoid this by using FOR /F with a command, because the output of dir is fetched completely before the body of the loop is executed.

...
FOR /F "delims=" %%i IN ('dir /b *.txt') DO (
    echo.> file%%i.txt
    REM IF ERRORLEVEL 0 is true for every non-negative exit code, so test NOT ERRORLEVEL 1.
    IF NOT ERRORLEVEL 1 echo Successfully created file 'file%%i.txt'.
)
jeb
  • The `/F "delims="` worked well, but the _mystery_ of the extra file is still **unsolved**! – Satyendra Oct 31 '13 at 11:42
  • @Satyendra No, like adarshr said, you create files which can also be found by your search pattern. If you change the extension in the second loop it wouldn't cause any problems – jeb Oct 31 '13 at 11:44