29

I'm trying to figure out how to copy CAD drawings (".dwg", ".dxf) from a source directory with subfolders to a destination directory and maintaining the original directory and subfolders structure.

  • Original Directory: H:\Tanzania...\Bagamoyo_Single_line.dwg
  • Source Directory: H:\CAD\Tanzania...\Bagamoyo_Single_line.dwg

I found the following answer from @martineau within the following post: Python Factory Function

from fnmatch import fnmatch, filter
from os.path import isdir, join
from shutil import copytree

def include_patterns(*patterns):
    """Factory function that can be used with copytree() ignore parameter.

    Arguments define a sequence of glob-style patterns
    that are used to specify what files to NOT ignore.
    Creates and returns a function that determines this for each directory
    in the file hierarchy rooted at the source directory when used with
    shutil.copytree().
    """
    def _ignore_patterns(path, names):
        keep = set(name for pattern in patterns
                            for name in filter(names, pattern))
        ignore = set(name for name in names
                        if name not in keep and not isdir(join(path, name)))
        return ignore
    return _ignore_patterns

# sample usage

copytree(src_directory, dst_directory,
         ignore=include_patterns('*.dwg', '*.dxf'))

Updated: 18:21. The following code works as expected, except that I'd like to ignore folders that don't contain any include_patterns('.dwg', '.dxf')

kaya3
  • 47,440
  • 4
  • 68
  • 97
Peter Wilson
  • 1,075
  • 3
  • 14
  • 24
  • 1
    That code is already demonstrating how to do it. You pass the patterns to `include_patterns`, and the return is a callback that you pass to `copytree`. `copytree` does the work of passing paths and names to the resulting `_ignore_patterns` function as it traverses the tree. – ShadowRanger Feb 27 '17 at 14:25
  • 1
    Hi @ShadowRanger I now understand how the following works. I need to amend the following only to copy the tree if there is a match based on my include_patterns so that I don't end up with empty directories. – Peter Wilson Feb 27 '17 at 15:35

1 Answers1

56

shutil already contains a function ignore_patterns, so you don't have to provide your own. Straight from the documentation:

from shutil import copytree, ignore_patterns

copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))

This will copy everything except .pyc files and files or directories whose name starts with tmp.

It's a bit tricky (and not strictly necessary) to explain what's going on: ignore_patterns returns a function _ignore_patterns as its return value, this function gets stuffed into copytree as a parameter, and copytree calls this function as needed, so you don't have to know or care how to call this function _ignore_patterns. It just means that you can exclude certain unneeded cruft files (like *.pyc) from being copied. The fact that the name of the function _ignore_patterns starts with an underscore is a hint that this function is an implementation detail you may ignore.

copytree expects that the folder destination doesn't exist yet. It is not a problem that this folder and its subfolders come into existence once copytree starts to work, copytree knows how to handle that.

Now include_patterns is written to do the opposite: ignore everything that's not explicitly included. But it works the same way: you just call it, it returns a function under the hood, and coptytree knows what to do with that function:

copytree(source, destination, ignore=include_patterns('*.dwg', '*.dxf'))
Sam Krygsheld
  • 2,060
  • 1
  • 11
  • 18
Jan
  • 892
  • 6
  • 7
  • Hi @Jan , the following function is to generate a dynamic ignore list based on files that I want to keep i.e. CAD ("*.dwg", "*.dxf"), so all other file types are then ignored. I've got the following working, my last hurdle is to exclude folders that have no files within them based on the include_patterns("*.dwg", "*.dxf"). – Peter Wilson Feb 27 '17 at 15:47
  • Where is the include_patterns method defined? – Alan Kavanagh May 17 '17 at 11:52
  • 1
    @AK47 include_patterns is defined in the OP. – Jan May 18 '17 at 12:14
  • Is there a way to combine both? – Alex Weitz Jul 14 '20 at 03:06
  • 1
    Let's say I have a list (i.e. `mylist=[,*.txt, '*.o']`). How do I pass this list as an argument to `ignore_patterns()`? `ignore_patterns(mylist)` doesn't seem to work here. – KcFnMi Aug 24 '20 at 23:42
  • 1
    @KcFnMi Did you find a solution? – Nickleby Jan 27 '21 at 10:53
  • I don't remember, check https://stackoverflow.com/search?q=user:5082463+[python] – KcFnMi Jan 27 '21 at 12:45
  • 2
    @KcFnMi, you need to unpack the values from the list. use: `ignore_patterns(*mylist)`. i tested it and it worked. – alexzander Feb 19 '21 at 13:01