22

I need to extract messages from .po files. Is there a Python module to do that? I wrote a parser, but it is platform-dependent (it breaks on \r\n vs. \n line endings).

Is there a better way to do this?

Aidan Fitzpatrick
alex

3 Answers

39

In most cases you don't need to parse .po files yourself. Developers give translators a .pot template file; the translators rename it to xx_XX.po and translate the strings. Then you, as the developer, only have to "compile" the .po files to .mo files using GNU's gettext tools (msgfmt) or the msgfmt.py script from Python's Tools/i18n directory. (pygettext, by contrast, extracts strings from source code; it does not compile catalogs.)

But if you want or need to parse the .po files yourself instead of compiling them, I strongly suggest using polib, a well-known Python library for handling .po files. It is used by several large-scale projects, such as Mercurial and Ubuntu's Launchpad translation engine:

PyPI package home: http://pypi.python.org/pypi/polib/

Code repository: https://github.com/izimobil/polib

(Original repository was hosted at Bitbucket, which no longer supports Mercurial: https://bitbucket.org/izi/polib/wiki/Home)

Documentation: http://polib.readthedocs.org

The importable module is a single file under the MIT license, so you can easily bundle it with your own code. Basic usage looks like this:

import polib

# Load the catalog and print each entry's source string and translation
po = polib.pofile('path/to/catalog.po')
for entry in po:
    print(entry.msgid, entry.msgstr)

It can't be easier than that ;)

MestreLion
  • 2
    Looks like `polib` is unmaintained: last release in 2017 and the bitbucket mercurial repo is down. – Boris Verkhovskiy Aug 19 '20 at 08:30
  • 2
    @Boris: that's very unfortunate, it has always been an amazing project. As for the repo, it is down because Bitbucket no longer supports Mercurial repositories, but at least the owner seems to have set up a Git repository on GitHub: https://github.com/izimobil/polib – MestreLion Aug 19 '20 at 14:45
  • 3
    Good news: after a long 3-year hiatus, it seems the project is alive again! The author is merging some pull requests and triaging bug reports. So far just a couple of commits in 2020, but at least it's not _abandoned_. – MestreLion Feb 13 '21 at 21:32
3

Babel includes a .po file parser written in Python:

http://babel.edgewall.org/
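
For illustration, here is a minimal sketch of reading a catalog with Babel's `read_po` function from `babel.messages.pofile`. The catalog content and the helper name `extract_messages` are made up for this example, and it assumes Babel is installed (it is not part of the standard library). The file is read in binary mode, so the parser itself deals with line endings.

```python
import io

# Hypothetical catalog content, made up for this example.
PO_BYTES = (
    b'msgid ""\n'
    b'msgstr ""\n'
    b'"Content-Type: text/plain; charset=UTF-8\\n"\n'
    b'\n'
    b'msgid "Hello"\n'
    b'msgstr "Bonjour"\n'
)


def extract_messages(fileobj):
    """Return (msgid, msgstr) pairs from an open .po file using Babel."""
    # Imported lazily so this module still loads where Babel is missing.
    from babel.messages.pofile import read_po

    catalog = read_po(fileobj)
    # Iterating a Catalog also yields the header entry, whose id is empty,
    # so entries with an empty id are skipped here.
    return [(message.id, message.string) for message in catalog if message.id]


if __name__ == "__main__":
    try:
        pairs = extract_messages(io.BytesIO(PO_BYTES))
    except ImportError:
        print("Babel is not installed (try: pip install Babel)")
    else:
        for msgid, msgstr in pairs:
            print(msgid, "->", msgstr)
```

For a real file you would pass something like `open('path/to/catalog.po', 'rb')` instead of the in-memory buffer.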

The built-in gettext module works only with binary .mo files.
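
A quick way to see this for yourself, as a rough sketch using only the standard library: `gettext.GNUTranslations` checks for the .mo magic number and raises `OSError` when handed .po text. The sample bytes and the helper name `loads_as_mo` are assumptions made up for this example.

```python
import gettext
import io
import struct

# Hypothetical .po content, made up for this example.
PO_BYTES = b'msgid "Hello"\nmsgstr "Bonjour"\n'

# A hand-built, empty .mo file: magic number, revision 0, zero strings,
# then the table offsets (all pointing just past the 28-byte header).
EMPTY_MO_BYTES = struct.pack("<7I", 0x950412DE, 0, 0, 28, 28, 0, 28)


def loads_as_mo(data):
    """Return True if gettext accepts the bytes as a compiled .mo catalog."""
    try:
        gettext.GNUTranslations(io.BytesIO(data))
    except OSError:
        # Raised ("Bad magic number") for anything that is not a .mo file.
        return False
    return True


if __name__ == "__main__":
    print("po text accepted:", loads_as_mo(PO_BYTES))
    print("empty mo accepted:", loads_as_mo(EMPTY_MO_BYTES))
```

So to use the gettext module, the .po file has to be compiled to .mo first.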

yak
-4

Use the built-in gettext module: http://docs.python.org/library/gettext.html

It was the first Google result when I searched for "python gettext". If you are wondering whether this is what you were looking for, then yes, it is.

Tadeck