
I am converting an email to a PDF using an HTML-to-PDF conversion scheme.

When I convert the email, I see this in the PDF:

[Screenshot: the stray character as rendered in the PDF]

So I looked a little deeper into the email and found the Unicode character U+200E, which is the left-to-right mark:

[Screenshot: the email source showing U+200E]

I am going to strip that character out of the email before the conversion, but is there a better solution?
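
For reference, here is a minimal sketch of the stripping I have in mind. It assumes the mark can show up either as the raw character or as one of its HTML entity forms (`&lrm;`, `&#8206;`, `&#x200E;`), which I have not confirmed for every email. The class and method names are hypothetical, and it relies only on `System.Text.RegularExpressions`, not on anything in the PDF library.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of HtmlRenderer.PdfSharp.
public static class DirectionMarkCleaner
{
    // Matches the raw U+200E character and its common HTML entity spellings.
    // Assumption: no leading zeros in the numeric entities.
    private static readonly Regex LrmPattern =
        new Regex("\u200E|&lrm;|&#8206;|&#[xX]200[eE];", RegexOptions.Compiled);

    // Returns the HTML with every left-to-right mark removed.
    public static string StripLrm(string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }
        return LrmPattern.Replace(html, string.Empty);
    }
}
```

I would call `DirectionMarkCleaner.StripLrm(emailHtml)` on the email body just before handing it to the converter. This deliberately removes only U+200E and leaves every other character alone.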

[EDIT] Thank you for correcting the typo where I misstated the direction.

This email is from the computer of a U.S.-based user configured for English. The date is inserted when he hits Reply on an email, as part of the marker that separates the conversations. Due to the nature of the business, it is highly unlikely that they will receive right-to-left languages. The U+200E character appears only in the date.

This is a programming question because I am converting the HTML email to PDF using C# in an Outlook add-on that we are creating.

We are using HtmlRenderer.PdfSharp to do the conversion; it seems to work very well apart from this annoyance.

bugfreerammohan
Be Kind To New Users
  • U+200E is a "left to right" character, not the reverse (it is correct in the title, but your question says "right to left", which confused me). Since you haven't mentioned what language or library you are using for this, I'm not sure how this is a programming question. You could be saving your e-mail as PDF from your mail reader. What happens to U+200E very likely depends on the PDF converter. It would be more correct not to show any visible output for it. – Erwin Bolwidt Jan 29 '19 at 01:28
  • If you're seeing direction marks such as this, does it mean there is the possibility of encountering right-to-left text? If so, then discarding the direction marks would cause it to be displayed incorrectly. Does the PDF converter claim to handle bidirectional text? Does it offer options about not displaying "non-graphic" codepoints? –  Jan 29 '19 at 01:32
  • @another-dave there is no right-to-left text, only left-to-right. OP made a typo in the body of the question but U+200E is left-to-right, not right-to-left. – Erwin Bolwidt Jan 29 '19 at 01:35
  • I know there was a typo, and I had already edited the question to make it correct (edit is awaiting review). But that is not my point. My point is that if he is dealing with an email system that inserts left-to-right marks, it seems reasonable to infer that the email system supports bidirectional text, in which case there is the possibility that 'tomorrow' there will be some right-to-left text (not just right-to-left marks) which means in turn the conversion to PDF can't just throw things away. –  Jan 29 '19 at 01:49
  • Without knowing what library/system is being used it's hard to tell. It may or may not support bidirectional text. Voting to close as "Unclear what you're asking" for now because this question is not answerable without further details. – Erwin Bolwidt Jan 29 '19 at 02:21
