I use MWLIB and ReportLab to convert MediaWiki articles to PDF.
I have a very long link that causes the sentence above it to be rendered with very wide gaps between the words. I think the link is treated as one long word that cannot be wrapped, so it is pushed onto its own line and the justified line above it gets stretched.
See the picture here: http://imageshack.us/photo/my-images/543/tzfo.png/
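To illustrate what I think is happening, here is a toy sketch. It is not ReportLab's algorithm: it assumes every character has the same width, uses a simple greedy line breaker, and the URL is a made-up placeholder. It only shows how one unbreakable word can leave very few words on the justified line above it.

```python
def greedy_wrap(words, width):
    """Greedy line breaking: a word moves to the next line if it does not fit."""
    lines, current = [], []
    for word in words:
        candidate = current + [word]
        # Length of the words plus one space between each pair.
        if current and len(" ".join(candidate)) > width:
            lines.append(current)
            current = [word]
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


def gap_width(line, width):
    """Width of each gap when the line is stretched to the full width."""
    gaps = len(line) - 1
    if gaps == 0:
        return 0.0
    chars = sum(len(word) for word in line)
    return (width - chars) / float(gaps)


# Hypothetical text: a short sentence followed by a 79-character link.
text = "See the page at http://example.org/" + "a" * 60
lines = greedy_wrap(text.split(), width=40)

# The last line of a justified paragraph is normally not stretched.
widths = [gap_width(line, 40) for line in lines[:-1]]
```

With a line width of 40 characters, the link cannot share a line with the four short words, so those four words are spread over the whole line and each gap is roughly nine characters wide instead of one.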
Is there any way to force word breaking in ReportLab on words longer than a certain number of characters? I think that would fix it.
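Roughly what I have in mind, as a minimal sketch in plain Python that would run on the text before it reaches ReportLab. The function name break_long_words and the parameters max_len and joiner are my own, not part of ReportLab or mwlib.

```python
import re


def break_long_words(text, max_len=30, joiner=" "):
    """Split every whitespace-delimited word longer than max_len into chunks.

    Hypothetical helper: chunks are at most max_len characters long and are
    glued back together with `joiner`, which gives the line breaker a place
    to wrap.
    """
    def split_word(match):
        word = match.group(0)
        chunks = [word[i:i + max_len] for i in range(0, len(word), max_len)]
        return joiner.join(chunks)

    # Match runs of non-whitespace that are longer than max_len.
    pattern = re.compile(r"\S{%d,}" % (max_len + 1))
    return pattern.sub(split_word, text)


sample = "a " + "x" * 25
broken = break_long_words(sample, max_len=10)
```

The obvious drawback is that a plain space changes the visible link text, so I would prefer a solution inside ReportLab's own wrapping code if one exists.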
PS: Here's some code.
This is the method breakLinesCJK() from reportlab/platypus/paragraph.py. It uses the function wordSplit() from reportlab/lib/textsplit.py:
    def breakLinesCJK(self, width):
        """Initially, the dumbest possible wrapping algorithm.
        Cannot handle font variations."""
        if not isinstance(width,(list,tuple)): maxWidths = [width]
        else: maxWidths = width
        style = self.style
        self.height = 0
        #for bullets, work out width and ensure we wrap the right amount onto line one
        _handleBulletWidth(self.bulletText, style, maxWidths)
        frags = self.frags
        nFrags = len(frags)
        if nFrags==1 and not hasattr(frags[0],'cbDefn'):
            f = frags[0]
            if hasattr(self,'blPara') and getattr(self,'_splitpara',1):
                return f.clone(kind=0, lines=self.blPara.lines)
            #single frag case
            lines = []
            lineno = 0
            if hasattr(f,'text'):
                text = f.text
            else:
                text = ''.join(getattr(f,'words',[]))
                #my own debug output
                print("USE WORDSPLIT ELSE TREPORTLAB EXT = '',JOIN")
            from reportlab.lib.textsplit import wordSplit
            #maxWidths[0] is the first available width; maxWidths[20] raises IndexError
            lines = wordSplit(text, maxWidths[0], f.fontName, f.fontSize)
            #the paragraph drawing routine assumes multiple frags per line, so we need an
            #extra list like this
            # [space, [text]]
            #
            wrappedLines = [(sp, [line]) for (sp, line) in lines]
            return f.clone(kind=0, lines=wrappedLines, ascent=f.fontSize, descent=-0.2*f.fontSize)
        elif nFrags<=0:
            return ParaLines(kind=0, fontSize=style.fontSize, fontName=style.fontName,
                             textColor=style.textColor, lines=[],ascent=style.fontSize,descent=-0.2*style.fontSize)
        #general case nFrags>1 or special
        if hasattr(self,'blPara') and getattr(self,'_splitpara',0):
            return self.blPara
        autoLeading = getattr(self,'autoLeading',getattr(style,'autoLeading',''))
        calcBounds = autoLeading not in ('','off')
        return cjkFragSplit(frags, maxWidths, calcBounds)
The code in textsplit.py is also relevant, but it is too much to copy here. Like paragraph.py, it ships with ReportLab, so anyone who has ReportLab installed should have the file.