34

What is the proper way to access resources in Python programs?

Basically, in many of my Python modules I end up writing code like this:

  DIRNAME = os.path.split(__file__)[0]

  (...) 

  template_file = os.path.join(DIRNAME, "template.foo")

Which is OK but:

  • It will break if I start using Python zip packages
  • It is boilerplate code

In Java I had a function that did exactly the same thing, but it worked both when the code was spread across a bunch of folders and when it was packaged in a .jar file.

Is there such a function in Python, or is there another pattern I could use?

Brian Lyttle
jb.

4 Answers

21

You'll want to look at using either pkgutil.get_data in the stdlib or pkg_resources from setuptools/distribute. Which one you use probably depends on whether you're already using distribute to package your code as an egg.
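As a rough illustration of the stdlib route, here is a minimal sketch that reads a resource with pkgutil.get_data from a package living inside a zip archive. The package name demo_pkg, the archive demo.zip and the template contents are made up for the example; the archive is created in a temporary directory only so the snippet runs on its own.

```python
import io
import os
import pkgutil
import sys
import tempfile
import zipfile

# Build a throwaway zip holding a hypothetical package "demo_pkg"
# with one data file (names and contents are assumptions for this demo).
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "demo.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("demo_pkg/__init__.py", "")
    zf.writestr("demo_pkg/template.foo", "Hello, {name}!")

# Putting the archive on sys.path makes the package importable via zipimport.
sys.path.insert(0, zip_path)

# get_data returns the resource as bytes; the path is relative to the package.
raw = pkgutil.get_data("demo_pkg", "template.foo")
template = raw.decode("utf-8")

# When a file-like object is needed, wrap the bytes in BytesIO.
file_like = io.BytesIO(raw)
```

The same call should work unchanged when demo_pkg is an ordinary directory on sys.path, which is the point of going through the loader instead of __file__.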

David C
stderr
  • OK, this is a Pythonic answer. get_data looks interesting, but pkg_resources is way too intimidating; I'll definitely look into it when I start using distutils to package my project. – jb. Jun 07 '12 at 17:47
  • I'm looking for a similar solution. `get_data` is great, but I need a file-like object for this file, not the contents of the file directly. Is there an elegant way? – zegkljan Dec 08 '14 at 09:55
  • @zegkljan The most pythonic way is to wrap it with BytesIO (StringIO in Py2): `file_like = BytesIO(get_data(__package__, 'filename.dat'))` – cincodenada Mar 06 '16 at 04:54
  • Nowadays, the proper way to access resources is to use the `importlib.resources` module. See my [answer](https://stackoverflow.com/a/73497763/1513933) below. – Laurent LAPORTE Aug 26 '22 at 07:54
8

Since version 3.7 of Python, the proper way to access a file in resources is to use the importlib.resources library.

One can, for example, use the path function to access a particular file in a Python package:

import importlib.resources

with importlib.resources.path("your.package.templates", "template.foo") as template_file:
    ...

Python 3.9 added the files() API to this package; it is preferred over the legacy API.

One can use the files function to access a particular file in a Python package:

template_res = importlib.resources.files("your.package.templates").joinpath("template.foo")
with importlib.resources.as_file(template_res) as template_file:
    ...

For older versions, I recommend installing and using the importlib-resources library. Its documentation also explains in detail how to migrate an old implementation based on pkg_resources to importlib-resources.
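When only the contents of the resource are needed rather than a real filesystem path, the object returned by files() can be read directly, without as_file. The sketch below assumes Python 3.9 or later; the package name demo_templates and its template are hypothetical and are created in a temporary directory only so the snippet is self-contained.

```python
import importlib
import importlib.resources
import os
import sys
import tempfile

# Create a throwaway package "demo_templates" on disk (hypothetical name,
# used only to make this example runnable on its own).
tmp_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp_dir, "demo_templates")
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "template.foo"), "w", encoding="utf-8") as f:
    f.write("Hello, {name}!")

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# files() returns a Traversable; joinpath() selects the resource and
# read_text() returns its contents without needing a filesystem path.
resource = importlib.resources.files("demo_templates").joinpath("template.foo")
text = resource.read_text(encoding="utf-8")
```

as_file remains the better fit when some other API insists on a path, since it can extract the resource to a temporary file if the package is not on the filesystem.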

Laurent LAPORTE
2

I am trying to understand how we could combine the two aspects:

  1. Loading resources from the native filesystem
  2. Loading resources packaged in zip files

Reading through the quick tutorial on zipimport: http://www.doughellmann.com/PyMOTW/zipimport/

I see the following example:

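import sys
import zipimport

# Make the archive importable and load the package from it.
sys.path.insert(0, 'zipimport_example.zip')
importer = zipimport.zipimporter('zipimport_example.zip')
module = importer.load_module('example_package')

# Path of the module inside the archive.
print(module.__file__)
# Raw contents of a data file stored alongside the package code.
print(module.__loader__.get_data('example_package/README.txt'))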

I think the value of __file__ is "zipimport_example.zip/example_package/__init__.pyc"

I still need to check how it looks from inside the package.

But then we could always do something like this:

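# Assumes "import os" and "import example_package" earlier in the module.
if ".zip" in example_package.__file__:
    # Imported from a zip archive: load using get_data.
    data = example_package.__loader__.get_data('example_package/README.txt')
else:
    # Imported from the filesystem: load by building the correct file path.
    path = os.path.join(os.path.dirname(example_package.__file__), 'README.txt')
    with open(path, 'rb') as f:
        data = f.read()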

[Edit:] I have tried to work out the example a bit better.

If the package gets imported from a zip file, two things happen:

  1. __file__ contains ".zip" in its path.
  2. __loader__ is available in the namespace.

If these two conditions are met then within the package you could do:

print(__loader__.get_data(os.path.join('package_name', 'README.txt')))

Otherwise, the module was loaded normally and you can follow the regular approach to loading the file.
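The two cases can be folded into one helper. The sketch below is only one possible take: instead of testing for ".zip" in __file__, it asks the module's loader for get_data and falls back to open() when no such loader exists. The helper name read_resource and the packages zipped_pkg and plain_pkg are hypothetical and are built in a temporary directory purely for the demonstration.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def read_resource(module, filename):
    """Return the bytes of a file stored next to `module` (hypothetical helper)."""
    path = os.path.join(os.path.dirname(module.__file__), filename)
    loader = getattr(module, "__loader__", None)
    if loader is not None and hasattr(loader, "get_data"):
        # Works for zipimport and for ordinary file-based loaders alike.
        return loader.get_data(path)
    # No usable loader: fall back to reading from the filesystem.
    with open(path, "rb") as f:
        return f.read()


# Demo setup: one package inside a zip archive, one in a plain directory.
tmp_dir = tempfile.mkdtemp()

zip_path = os.path.join(tmp_dir, "bundle.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("zipped_pkg/__init__.py", "")
    zf.writestr("zipped_pkg/README.txt", "read from zip")

plain_root = os.path.join(tmp_dir, "src")
os.makedirs(os.path.join(plain_root, "plain_pkg"))
open(os.path.join(plain_root, "plain_pkg", "__init__.py"), "w").close()
with open(os.path.join(plain_root, "plain_pkg", "README.txt"), "w") as f:
    f.write("read from disk")

sys.path[:0] = [zip_path, plain_root]
importlib.invalidate_caches()

from_zip = read_resource(importlib.import_module("zipped_pkg"), "README.txt")
from_disk = read_resource(importlib.import_module("plain_pkg"), "README.txt")
```

Both calls return bytes, so callers decode the result themselves if they need text.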

pyfunc
  • You should use `os.path.join('example_package', 'README.txt')` if you want to be platform independent. – kratenko Jun 07 '12 at 16:15
  • @jb: Does this solution look better to you? – pyfunc Jun 07 '12 at 17:04
  • Definitely an interesting answer. I'll most probably stick to distutils if just pushing Python code becomes too cumbersome at some point. But I think this could be used in GAE code. – jb. Jun 07 '12 at 17:54
0

I guess the zipimport standard python module could be an answer...

EDIT: well, not the use of the module directly, but using sys.path as shown in the example could be a good way:

  • I have a zip file test.zip with one Python module test and a file test.foo inside
  • to check whether the zipped Python module test can be aware of test.foo, it contains this code:


import os

DIRNAME = os.path.dirname(__file__)
# Check whether test.foo sits next to this module.
if os.path.exists(os.path.join(DIRNAME, 'test.foo')):
    print('OK')
else:
    print('KO')

The test looks OK:

>>> import sys
>>> sys.path.insert(0, r'D:\DATA\FP12210\My Documents\Outils\SVN\05_impl\2_tools\test.zip')
>>> import test
OK
>>> 

So a solution could be to loop over your zip file to retrieve all Python modules and add them to sys.path; ideally, this piece of code would be the first one loaded by your application.

Emmanuel
  • I disagree: zipimport can't import anything other than `.py[co]?` files. So, as I stated in the question, this code will fail if someone tries to run it from an archive. – jb. Jun 07 '12 at 15:52
  • Indeed, but the `sys.path.insert(0, '/tmp/example.zip')` gives a good idea of how to do it... – Emmanuel Jun 07 '12 at 16:03