UTF8 makes an extra line on my site

Question

I am writing an Arabic website and saving the templates as UTF-8 (using Notepad++). This adds an extra new line for each file/template include. Is there any way to fix this problem without having to save the files in ASCII format?

Thank you.

Do you mean an extra line in the source code or an extra line in the rendered page? (Jan 31 '11 at 15:04)

score 6 · Accepted Answer · answered Jan 31 '11 at 15:07

I think you could try opening the files, selecting "UTF-8 without BOM" as the encoding, and saving again. The byte order mark (BOM) that the plain "UTF-8" setting writes at the start of each file might explain the extra lines.
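To illustrate why this can help: a file saved as "UTF-8" with a BOM starts with the three bytes EF BB BF, and if whatever includes the template passes them through unchanged, they end up in the page ahead of the markup on every include. Below is a minimal Python sketch of detecting and stripping that prefix. The helper name `strip_utf8_bom` and the sample markup are made up for illustration, since the asker's actual stack is not stated.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):  # BOM_UTF8 == b"\xef\xbb\xbf"
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical template content, saved the way "UTF-8" (with BOM) would write it.
with_bom = codecs.BOM_UTF8 + "<div>مرحبا</div>".encode("utf-8")
clean = strip_utf8_bom(with_bom)

print(len(with_bom) - len(clean))  # number of bytes removed
print(clean.decode("utf-8"))
```

Re-saving without the BOM in the editor, as suggested above, fixes the same thing at the source and is probably the simpler route.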

Zsub

score 0 · Answer 2 · answered Sep 22 '13 at 14:47

First of all, you need to have the following information at hand:

Your editor's EOL character (EOL stands for "end of line")
Your editor's charset (Dreamweaver, FrontPage, Eclipse, etc.)
Your input charset (client browser)
Your server charset (back-end programming language: PHP, Java, etc.)
Your output charset (the response sent to the client browser)

Usually the EOL character is the "\n" (line feed) character, which means "new line". This works fine on almost any Unix-based system, where it also returns the column pointer to the start of the line. Windows-based systems don't have this second behavior: "\n" advances the line pointer, but the column pointer stays where it was. To fix this, Windows-based systems put an additional "\r" character just before the "\n". "\r" stands for "carriage return", a name inherited from typewriters and teleprinters. Its function is to reset the column pointer to the start of the line.

For example, when something that honors "\r" (such as a terminal) processes the string "You would buy a bike.\rYou shall", it displays "You shall buy a bike.": at the "\r" the column pointer is reset to the start of the line, and the characters that follow overwrite the ones already there.
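The overwrite behavior can be sketched in Python by simulating a single output line with a column pointer. The function name `render_with_carriage_returns` and the sample sentence are illustrative only, and the model is deliberately simplified: it handles "\r" but not "\n" or other control characters.

```python
def render_with_carriage_returns(text: str) -> str:
    """Simulate one terminal line: "\r" moves the column pointer back to 0."""
    line = []  # characters currently visible on the line
    col = 0    # column pointer
    for ch in text:
        if ch == "\r":
            col = 0            # carriage return: back to the start of the line
        else:
            if col < len(line):
                line[col] = ch  # overwrite what is already there
            else:
                line.append(ch)
            col += 1
    return "".join(line)


# "You would" and "You shall" have the same length, so the overwrite is clean.
print(render_with_carriage_returns("You would buy a bike.\rYou shall"))
```

If the text after the "\r" is shorter than the part it replaces, the leftover characters of the old text stay visible.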

So on Windows-based systems you should use "\r\n" (carriage return followed by line feed). This generally behaves fine on Unix-based systems as well, at least for HTML output, where browsers treat both forms as ordinary whitespace.
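If mixed line endings are a concern, one option is to normalize them before saving or sending a template. Here is a small Python sketch; `normalize_newlines` is a hypothetical helper, not something from a specific library.

```python
def normalize_newlines(text: str, eol: str = "\n") -> str:
    """Convert "\r\n" and lone "\r" to a single, consistent line ending."""
    # First collapse everything to "\n", handling "\r\n" before lone "\r"
    # so that a Windows line ending does not become two line feeds.
    unix = text.replace("\r\n", "\n").replace("\r", "\n")
    return unix if eol == "\n" else unix.replace("\n", eol)


mixed = "<div id='tmp'>\r\n   text\r</div>\n"
print(repr(normalize_newlines(mixed)))          # Unix style
print(repr(normalize_newlines(mixed, "\r\n")))  # Windows style
```

The order of the two replacements matters: replacing lone "\r" first would turn every "\r\n" into "\n\n", which is one way doubled lines appear.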

Beware of transfers. Check the charsets at the origin and at the destination. If they differ, you may get doubled or trimmed line feeds, and you can also end up with corrupted characters in your strings.

Original:

<div id='tmp'>
   deja vú
</div>

Doubled:

<div id='tmp'>

   deja vÃº

</div>

Trimmed:

<div id='tmp'>       deja v�    </div>
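The two kinds of character corruption shown above can be reproduced in a few lines of Python, assuming the mismatch is between UTF-8 and Latin-1 (ISO-8859-1). Other charset pairs would produce different garbage, and this sketch covers only the characters, not the doubled or trimmed line feeds.

```python
original = "deja vú"

# Written as UTF-8 but read back as Latin-1: "ú" (bytes C3 BA) becomes two characters.
utf8_bytes = original.encode("utf-8")
misread_as_latin1 = utf8_bytes.decode("latin-1")

# Written as Latin-1 but read back as UTF-8: the lone byte FA is not valid UTF-8,
# so a lenient decoder substitutes U+FFFD, the replacement character.
latin1_bytes = original.encode("latin-1")
misread_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

print(misread_as_latin1)
print(misread_as_utf8)
```

Declaring the same charset at every step (editor, server, and response headers) avoids both cases.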
