
There's this note in the docs for `TarFile.extract()`:

Note: The extract() method does not take care of several extraction issues. In most cases you should consider using the extractall() method.

What "extraction issues" is it referring to? Why would I want to use the extractall() method instead of extract() when I only have a single file I want to get from the tar file?

ArtOfWarfare
  • Well, one possible issue would be the existence of multiple copies of a file in a `tar` file. e.g. `tar cf something.tar file1; ; tar rf something.tar file1`. Now the tar file contains two files named "file1", with different contents. If you just do `TarFile.extract()`, I believe it will only give you the first version, whereas `TarFile.extractall()` would give you the most recent one, which is probably more useful. That's just one example, though - there may be other "issues"... – twalberg Apr 03 '15 at 19:43
  • @twalberg: That's an interesting scenario you bring up. I didn't realize multiple files with the exact same name could be inserted into a single tar. Your description of how Python handles it doesn't seem quite right, though. If you pass a name to `TarFile.extract()`, it extracts the last file with that name. If you use `TarFile.extractall()`, it'll extract the last file of each name. The only way I've found that you can get an earlier version from a `TarFile` is with, for example, `tar.extract(tar.getmembers()[0])`. This, to me, actually suggests `extract` is better than `extractall`. – ArtOfWarfare Apr 06 '15 at 14:16
  • Hmmm... Guess I had a bad assumption, then - never looked at the code for `TarFile.extract()`, and I just presumed it would scan the file from the beginning and extract the first file it found that matched the name, and then short-circuit scanning the rest of the file... Still, I guess that might be one of the "issues", because there's no way to explicitly say "I want the first one" or "I want the last one" or even an in-between one, using just a name with `extract()`... Interesting... – twalberg Apr 06 '15 at 16:45
  • @twalberg: Not quite. `extract()` allows you to pass in the name of the file (in which case you can't specify which one if there are multiple) or pass in a TarInfo object, which will uniquely identify each of the same-named files, so you can use `extract()` to get a specific file. `extractall()`, on the other hand, will only extract one file of each name (the last one, it seems.) Which is why in my prior comment I mentioned that it seems to me that `extract` is better than `extractall`. – ArtOfWarfare Apr 06 '15 at 19:30
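
The duplicate-name behavior discussed in the comments above can be checked with a small in-memory sketch. It is only an illustration under a few assumptions: the helper names `build_archive` and `read_versions` and the payload strings are made up for this example, and it uses `extractfile()` rather than `extract()` so that nothing is written to disk. As far as I can tell, both methods resolve a name through `TarFile.getmember()`, which is documented to return the last occurrence of a duplicated member.

```python
import io
import tarfile


def build_archive():
    """Return the bytes of a tar archive holding two members named 'file1'."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # Hypothetical payloads standing in for two versions of the same file.
        for payload in (b"first version\n", b"second version\n"):
            info = tarfile.TarInfo(name="file1")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_versions(archive_bytes, name="file1"):
    """Return (member count, content looked up by name, content of first member)."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        members = tar.getmembers()
        # Passing a name selects the last member with that name.
        by_name = tar.extractfile(name).read()
        # Passing a TarInfo object selects that exact member.
        first = tar.extractfile(members[0]).read()
    return len(members), by_name, first


count, by_name, first = read_versions(build_archive())
print(count, by_name, first)
```

Passing a `TarInfo` from `getmembers()` is what lets you reach an earlier copy; passing the name alone gives you the most recent one.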
