I need a regular expression to extract the translated count, fuzzy count, and total string count from a PO (gettext translation) file.
I'm writing the program in PHP. I have searched everywhere but couldn't find one.
Please help.
gettext PO files are so old and ubiquitous, they're a de facto industry standard with great support by a wide variety of tools. Trying to reinvent a solution using regexen here seems very inappropriate when you could be using one of the many PO file parsers instead. For example oscarotero/Gettext:
$translations = Gettext\Extractors\Po::extract('messages.po');
$total = $translated = $fuzzy = 0;
foreach ($translations as $translation) {
    $total++;
    if ($translation->hasTranslation()) {
        $translated++;
    }
    // Assumption: depending on the library version, flags such as "fuzzy"
    // may be exposed through getFlags() rather than getComments().
    if (in_array('fuzzy', $translation->getComments())) {
        $fuzzy++;
    }
}
(Untested, but should work immediately or with slight changes.)
In fact though, there are tools to do this already: Translate Toolkit or Pology, for those I know of:
$ pocount locale/ko/LC_MESSAGES/
data/locale/ko/LC_MESSAGES/messages.po
type              strings      words (source)    words (translation)
translated:       3 (  0%)        7 (  0%)          28
fuzzy:            0 (  0%)        0 (  0%)         n/a
untranslated:   729 ( 99%)     1065 ( 99%)         n/a
Total:          732            1072                 28
unreviewed:       3 (  0%)        7 (  0%)          28
empty:          729 ( 99%)     1065 ( 99%)           0
$ posieve stats locale/ko/
-              msg  msg/tot  w-or  w/tot-or  w-tr  ch-or  ch-tr
translated       3     0.4%    15      0.9%    26     93    114
fuzzy            0     0.0%     0      0.0%     0      0      0
untranslated   729    99.6%  1708     99.1%     0  17323      0
total          732        -  1723         -    26  17416    114
obsolete         0        -     0         -     0      0      0
Try this regex:
// $po_content holds the contents of the .po file; $fuzzy is a boolean.
$total = array();
$translated = array();
$extra = '';
// When $fuzzy is true, only entries immediately preceded by a "#, fuzzy"
// line are matched, so count($translated) then gives the fuzzy count.
if ($fuzzy) {
    $extra = '#, fuzzy\n';
}
$matched = preg_match_all('/'.$extra.'msgid\s+((?:".*(?<!\\\\)"\s*)+)\s+'.'msgstr\s+((?:".*(?<!\\\\)"\s*)+)/', $po_content, $matches);
for ($i = 0; $i < $matched; $i++) {
    // Strip the surrounding quotes from the captured msgid and msgstr.
    $msgid = substr(rtrim($matches[1][$i]), 1, -1);
    $msgstr = substr(rtrim($matches[2][$i]), 1, -1);
    if (trim($msgid) != "") {
        $total[] = $msgid;
    }
    if (trim($msgstr) != "") {
        // Skip the PO header entry, whose msgstr holds metadata fields.
        if (strpos($msgstr, 'Language-Team') === false && strpos($msgstr, 'MIME-Version') === false) {
            $translated[] = $msgstr;
        }
    }
}
The total count is then count($total) and the translated count is count($translated).
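To see how the snippet above behaves on real input, here is a minimal sketch that wraps the same regex in a function and runs it on a tiny inline PO file. The function name count_po_entries and the sample text are hypothetical, made up for illustration. Unlike the snippet above, this sketch skips the header by its empty msgid rather than by looking for Language-Team or MIME-Version, and it assumes Unix line endings.

```php
<?php
// Hypothetical helper wrapping the regex from the answer above;
// it is not part of any library.
function count_po_entries(string $po_content, bool $fuzzy = false): array
{
    // In fuzzy mode, only match entries directly preceded by "#, fuzzy".
    $extra = $fuzzy ? '#, fuzzy\n' : '';
    $pattern = '/' . $extra
        . 'msgid\s+((?:".*(?<!\\\\)"\s*)+)\s+'
        . 'msgstr\s+((?:".*(?<!\\\\)"\s*)+)/';
    $matched = preg_match_all($pattern, $po_content, $matches);

    $total = 0;
    $translated = 0;
    for ($i = 0; $i < $matched; $i++) {
        // Strip the outer quotes from each captured string.
        $msgid  = substr(rtrim($matches[1][$i]), 1, -1);
        $msgstr = substr(rtrim($matches[2][$i]), 1, -1);
        if (trim($msgid) === '') {
            continue; // the header entry has an empty msgid
        }
        $total++;
        if (trim($msgstr) !== '') {
            $translated++;
        }
    }
    return ['total' => $total, 'translated' => $translated];
}

// Made-up sample: a header, one translated entry, one fuzzy entry,
// and one untranslated entry.
$sample = <<<'PO'
msgid ""
msgstr ""
"Language-Team: Korean\n"

msgid "Hello"
msgstr "Annyeong"

#, fuzzy
msgid "Bye"
msgstr "Jal ga"

msgid "Thanks"
msgstr ""
PO;

$all = count_po_entries($sample);
$fuzzyOnly = count_po_entries($sample, true);
echo "total: {$all['total']}\n";           // total: 3
echo "translated: {$all['translated']}\n"; // translated: 2
echo "fuzzy: {$fuzzyOnly['total']}\n";     // fuzzy: 1
```

Note that in the non-fuzzy run the fuzzy entry is still counted as translated, so subtract the fuzzy count if you want only finished translations. Entries carrying extra flags (for example "#, fuzzy, php-format") or plural forms are not handled by this regex, which is one reason a real PO parser is the safer choice.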