I'm trying to get the syllables of a word using pyhyphen. With the English dictionary, the apostrophe is handled correctly, in my opinion:

import hyphen
h = hyphen.Hyphenator('en_US')
h.syllables(u"Hammond's")

The apostrophe is simply kept inside one syllable:

[u'Ham', u"mond's"]

But if I do the same using the German dictionary

h = hyphen.Hyphenator('de_CH')
h.syllables(u"Hammond's")
h.syllables(u"Bismarck'sche")

the apostrophe is treated as if it were its own syllable:

[u'Ham', u'mond', u"'s"]
[u'Bis', u'marck', u"'", u'sche']

I was wondering whether it is possible to define exceptions (not to break) for certain characters, as is possible in LaTeX?
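To make the LaTeX comparison concrete, here is a minimal sketch of what I mean by an exception list, similar in spirit to `\hyphenation{...}`. It does not use any pyhyphen feature: `syllables_with_exceptions` and the `exceptions` dict are hypothetical names of my own, and `hyphenate` is assumed to be any callable that returns a list of syllables (for example `h.syllables`).

```python
def syllables_with_exceptions(word, hyphenate, exceptions):
    """Return the syllables of `word`, preferring a hand-written exception.

    hyphenate  -- any callable mapping a word to a list of syllables
                  (assumed here; e.g. h.syllables from the code above)
    exceptions -- hypothetical dict mapping a word to its hyphen-marked
                  form, e.g. u"Bis-marck'-sche"
    """
    marked = exceptions.get(word)
    if marked is not None:
        # The exception entry marks every allowed break point with "-".
        return marked.split(u"-")
    # No exception defined: fall back to the dictionary-based hyphenator.
    return hyphenate(word)
```

With `exceptions = {u"Bismarck'sche": u"Bis-marck'-sche"}`, the listed word would bypass the dictionary entirely. This only covers words listed in advance, and words that contain a real hyphen would need a different marker.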

The workaround that came to mind was to look for a leading apostrophe in each syllable and concatenate that syllable with the previous one:

syls = [u'Bis', u'marck', u"'", u'sche']
syls2 = []
for syl in syls:
    if syl.startswith("'"):
        if not syls2:
            syls2.append(syl)
        else:
            syls2[-1]+=syl
    else:
        syls2.append(syl)

[u'Bis', u"marck'", u'sche']

But this is not a nice or general solution. More generally, I'm interested in how to define hyphenation rules for words that are hyphenated incorrectly.
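For reference, the same workaround wrapped in a reusable function, as a sketch only. The name `merge_leading_chars` and the `chars` parameter are my own; by default it also treats the typographic apostrophe (U+2019) as a character that must not start a syllable, which is an assumption about the input text.

```python
def merge_leading_chars(syllables, chars=u"'\u2019"):
    """Glue any syllable that starts with one of `chars` onto the previous one.

    This post-processes the hyphenator's output; it does not change the
    hyphenation rules themselves.
    """
    merged = []
    for syl in syllables:
        if merged and syl and syl[0] in chars:
            # Attach to the previous syllable instead of starting a new one.
            merged[-1] += syl
        else:
            # First syllable, empty string, or ordinary syllable: keep as is.
            merged.append(syl)
    return merged


print(merge_leading_chars([u'Bis', u'marck', u"'", u'sche']))
print(merge_leading_chars([u'Ham', u'mond', u"'s"]))
```

It would be called as `merge_leading_chars(h.syllables(word))`. Note that the result for `Bismarck'sche` still allows a break after the apostrophe, so this only prevents a break directly before it.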

Akzidenzgrotesk
  • I think part of the problem is that those words shouldn't have apostrophes in German, right? They should just be `Hammonds` and `Bismarcksche`. If you are ok with just removing the apostrophe before hyphenation, how about `h.syllables(u"Bismarck'sche".replace("'", ""))`? – leekaiinthesky Apr 04 '15 at 11:22
  • With Hammonds I agree, this is a bad example. – Akzidenzgrotesk Apr 04 '15 at 15:43
  • @leekaiinthesky Bismarck'sche is actually an example taken from [duden](http://www.duden.de/sprachwissen/rechtschreibregeln/worttrennung#K168). A more frequently used example would be the omitted vowel of "es": "Seit gestern regnet's ununterbrochen". – Akzidenzgrotesk Apr 04 '15 at 15:55

0 Answers