
I have a somewhat complicated problem. I downloaded an archived website from archive.org using HTTrack, and now I have thousands of subfolders and files that I need to merge before I can rebuild the site.

I'm trying to write a batch file to do this, but nothing in my search results comes close to what I'm trying to achieve.

I'm trying to merge the contents of these folders:

D:\Utilities\httrack\SITES\RW\web.archive.org\web\20110202194232\http_\www.site.com\*
D:\Utilities\httrack\SITES\RW\web.archive.org\web\20110202194331im_\http_\www.site.com\*
D:\Utilities\httrack\SITES\RW\web.archive.org\web\20110202194449cs_\http_\www.site.com\*
D:\Utilities\httrack\SITES\RW\web.archive.org\web\20110202194453im_\http_\www.site.com\*
D:\Utilities\httrack\SITES\RW\web.archive.org\web\20110202194505cs_\http_\www.site.com\*
D:\Utilities\httrack\SITES\RW\web.archive.org\web\20110101000000_\http_\www.site.com\*
D:\Utilities\httrack\SITES\RW\web.archive.org\web\20110101072153\http_\www.site.com\*
D:\Utilities\httrack\SITES\RW\web.archive.org\web\20110201061410\http_\www.site.com\*

Into this:

D:\Utilities\httrack\SITES\RW\web.archive.org\web\http_\www.site.com\*

Basically, I'm trying to move each "http_" folder into its grandparent directory ("web"), along with its subfolders and files, as if I were dragging and dropping, clicking "Yes" to merge folders, and then choosing "Move, but keep both files".

I'd also like it to rename any files that share a name, so that nothing gets overwritten.

For example:

web\http_\www.site.com\index.html
web\http_\www.site.com\index (1).html
web\http_\www.site.com\index (2).html

Thanks in advance for your help!!!

  • If my answer below was helpful, please consider marking it as accepted. [See this page](http://meta.stackexchange.com/questions/5234/) for an explanation of why this is important. Also, for future questions, we generally expect the asker to post example code, or at least explain what has been tried -- just some indication of the effort you've put into solving the problem on your own. We're not a code writing service, and don't often provide a final solution (except in this case where the challenge was somewhat interesting). We prefer teaching and helping over "write this code for me." – rojo Jun 17 '15 at 12:20
  • @rojo Sorry about that, I wasn't trying to dump coding work on you. I thought the solution would be a few simple commands. I'd done a lot of research on the problem and found few, if any, relevant results, but it's difficult to show that here. I didn't want to experiment with batch commands I knew nothing about. My knowledge of CMD/Batch/PowerShell is just woefully inadequate for an endeavor like this. Thank you for your solution, and your advice! – EagleGuides Jun 17 '15 at 16:36

1 Answer


Challenge: accepted. Wouldn't it be nice if this functionality were built into robocopy, xcopy, fso.CopyFile, PowerShell's Move-Item, or any other utility or scripting object method?

You probably ought to test this on a copy of the hierarchy. I did some minimal testing and it seemed to work as intended, but it will be destructive if there are unforeseen problems.
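
One way to make such a test copy is robocopy with the /E switch, which copies all subfolders, including empty ones. This is only a sketch: D:\TEST is a hypothetical scratch location, so substitute any folder with enough free space.

```batch
@echo off
rem Copy the whole HTTrack project, including empty folders, to a scratch location.
rem D:\TEST is a hypothetical destination used for illustration only.
robocopy "D:\Utilities\httrack\SITES\RW" "D:\TEST\httrack\SITES\RW" /E
```

After copying, point the root variable in the script at the copy (keeping its trailing backslash) and run the script there first.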

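@echo off
setlocal

rem Parent folder of the timestamped snapshot folders.
rem The trailing backslash is required: the destination path is built by plain concatenation.
set "root=D:\Utilities\httrack\SITES\RW\web.archive.org\web\"

rem Process every snapshot folder whose name starts with 2, such as 20110202194232.
for /d %%I in ("%root%\2*") do (
    rem Print a progress message without a trailing newline.
    set /P "=Moving %%~nxI... "<NUL
    pushd "%%~fI"
    rem Walk every file below the snapshot folder.
    for /r %%J in (*) do (
        set "relative=%%~dpJ"
        setlocal enabledelayedexpansion
        rem Strip the snapshot folder prefix and append what is left to root.
        call :mv "%%~fJ" "%root%!relative:%%~fI\=!"
        endlocal
    )
    popd
    rem Delete the snapshot folder and anything still left in it.
    rd /q /s "%%~fI"
    echo Complete.
)

goto :EOF

rem Move one file into destdir, numbering the name if it is already taken.
:mv <srcfile> <destdir>
setlocal disabledelayedexpansion
if not exist "%~f2" md "%~f2"
set /a seq = 1
set "filename=%~nx1"
:mv_loop
rem Keep trying numbered names until one is free.
if exist "%~f2\%filename%" (
    set "filename=%~n1 (%seq%)%~x1"
    set /a seq += 1
    goto mv_loop
)
move "%~f1" "%~f2\%filename%" >NUL
endlocal & goto :EOF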
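
The destination folder is built by removing the snapshot folder's prefix from each file's directory and appending the remainder to root. Here is a stripped-down sketch of just that step. The shortened D:\web layout and the snapshot variable are hypothetical and used only for illustration.

```batch
@echo off
setlocal enabledelayedexpansion

rem Hypothetical, shortened paths for illustration only.
set "root=D:\web\"
set "snapshot=D:\web\20110202194232"
set "relative=D:\web\20110202194232\http_\www.site.com\"

rem Remove the snapshot prefix from relative, then prepend root.
echo %root%!relative:%snapshot%\=!
```

With these values the script prints D:\web\http_\www.site.com\. Because the two pieces are joined without a separator, leaving the trailing backslash off root would instead give D:\webhttp_\www.site.com\.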
rojo
  • Thank you so much for your hard work! This is extremely helpful. This is something that will benefit me and others well into the future! I can hardly believe how complicated it was. My recent test revealed 2 very minor issues. It names the folder "webhttp_" instead of just "http_". It also puts it above the "web" folder. "D:\Utilities\httrack\SITES\RW\web.archive.org\webhttp_" should be "D:\Utilities\httrack\SITES\RW\web.archive.org\web\http_". Very minor glitch; if it isn't easy to fix, don't worry about it. It still solves my problem, so THANK YOU SO MUCH FOR YOUR TIME! – EagleGuides Jun 17 '15 at 16:10
  • DERP. My bad. During my tests, when I changed "root=D:\Utilities\httrack\SITES\RW\web.archive.org\web\" to "root=D:\TEST\httrack\SITES\RW\web.archive.org\web" I left out the "\" backslash at the end... This is why I shouldn't try to make batch files, lol. Anyway, I just tested it again (WITH THE "\"), and it works perfectly. You did an excellent job. I've learned a lot from this. Thank you again sir! – EagleGuides Jun 18 '15 at 01:32