
xgettext is capable of extracting strings for translation from a variety of source languages.

   -L, --language=NAME
          recognise  the  specified  language (C, C++, ObjectiveC, PO,
          Shell, Python, Lisp, EmacsLisp, librep, Scheme, Smalltalk,
          Java, JavaProperties, C#, awk, YCP, Tcl, Perl, PHP, GCC-source,
          NXStringTable, RST, Glade, Lua, JavaScript, Vala, Desktop)

(your exact list may vary by platform)

It also guesses the language from the file extension, so:

$ xgettext -o out.pot in.php

will use the PHP parser without needing -L PHP.

However, I want to translate files that aren't in any of those languages. Is it possible to pass a list of strings to xgettext directly, or to teach it a new language?

For example, consider some Handlebars templates using a custom helper function __, like so:

<title>{{__ 'My Website'}}</title>

It's possible to extract all the strings from the files using grep:

$ grep -Ero '\{\{__ [^}]+\}\}' views
views/index.hbs:{{__ 'My Website'}}

But is there any way of feeding this information into xgettext to produce a valid pot file?

Note: while I'd appreciate a solution to this specific case, the question is really about the general case of an unknown language.
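
To make the goal concrete, here is a minimal sketch of the kind of script I have in mind: it scans template text for the `__` helper and writes the PO template syntax itself, without involving xgettext. The function names (`extract`, `po_quote`, `to_pot`) and the dict-of-sources input shape are my own assumptions for illustration, and the regex deliberately ignores escaped quotes inside strings and helpers that span several lines.

```python
import re

# Matches {{__ 'text'}} or {{__ "text"}}. The helper name __ is the one used
# in the templates above; escaped quotes inside the string are NOT handled.
HELPER_RE = re.compile(r"""\{\{\s*__\s+(['"])(.+?)\1\s*\}\}""")


def extract(sources):
    """Map each msgid to the "file:line" references where it occurs.

    `sources` is a dict of {filename: template text} (hypothetical input shape).
    """
    found = {}
    for name, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in HELPER_RE.finditer(line):
                found.setdefault(match.group(2), []).append(f"{name}:{lineno}")
    return found


def po_quote(s):
    """Quote a string using PO escaping for backslash, double quote, newline."""
    escaped = s.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def to_pot(found):
    """Render the extracted strings as POT text with a minimal header entry."""
    lines = [
        'msgid ""',
        'msgstr ""',
        '"Content-Type: text/plain; charset=UTF-8\\n"',
        "",
    ]
    for msgid, refs in found.items():
        lines.append("#: " + " ".join(refs))
        lines.append("msgid " + po_quote(msgid))
        lines.append('msgstr ""')
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = {"views/index.hbs": "<title>{{__ 'My Website'}}</title>\n"}
    print(to_pot(extract(demo)))
```

Since PO appears in the `-L` list above, a file generated this way could presumably be passed back through xgettext, or merged with its output using msgcat, but I have not verified that the result matches what a native parser would produce.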

Marcus Downing
  • I think the short answer about teaching it a new language, is you can't. However, I can parse your example using `xgettext -L JavaScript --extract-all` – Tim Jun 26 '18 at 15:03
  • That doesn't work particularly well, because `--extract-all` extracts ALL the quoted strings in the file. For example, for the source `{{__ "How to"}}` it extracts the strings `"#howto-tab"`, `"selected"` and `"How to"`. Without that option, all it produces is errors. – Marcus Downing Jun 28 '18 at 22:10
  • For the time being I have a messy scripted solution that produces tolerably compatible pot files. – Marcus Downing Jun 28 '18 at 22:12