I want my .html files to be encoded in UTF-8. I added the charset meta tag to the HTML files and the globalization settings to Web.config, but I still see that my GET requests contain request headers such as:

Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-US,en;q=0.8

How can I change them to add UTF-8 and the "tr" language? Are these headers the reason I see weird characters in the server responses where characters like 'ç' or 'ö' should appear in my HTML files?

Halo

1 Answer

The `Accept-Encoding` header tells the server which compression algorithms the client can handle. For instance, the server might send the response gzipped because the browser announced that it can decompress gzip. It has nothing to do with character encoding.

The character encodings that the client can handle are signaled to the server in the `Accept-Charset` header, for example: `Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3`
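To make the distinction concrete, here is a minimal Python sketch (not ASP.NET code; the sample string is only an illustration). It shows that the character encoding turns text into bytes, while the gzip step negotiated by `Accept-Encoding` only compresses those bytes and leaves the characters untouched.

```python
import gzip

text = "Üye Ol"

# Character encoding: text -> bytes (this is what "UTF-8" refers to).
body = text.encode("utf-8")

# Content coding: bytes -> compressed bytes (this is what Accept-Encoding negotiates).
compressed = gzip.compress(body)

# The client reverses both steps in the opposite order.
restored = gzip.decompress(compressed).decode("utf-8")

print(body)      # the UTF-8 bytes of the text
print(restored)  # the original text again
```

Since compression is lossless, changing `Accept-Encoding` should not affect which characters end up on the page.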

Anyway, if you just want your HTML files to be encoded in UTF-8, all you need to do is save those files in UTF-8 encoding. How you do that depends on your text editor.
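As a rough illustration of why the saved encoding matters, the following Python sketch writes a small page as UTF-8 and then decodes the same bytes two ways. The file name `page.html` and the choice of windows-1252 as the "wrong" encoding are assumptions made for the example.

```python
import os
import tempfile

html = '<meta charset="utf-8"><p>ç ö</p>'

# Save the file explicitly as UTF-8 (hypothetical file name).
path = os.path.join(tempfile.mkdtemp(), "page.html")
with open(path, "w", encoding="utf-8") as f:
    f.write(html)

# Read the raw bytes back, as a server would send them.
with open(path, "rb") as f:
    raw = f.read()

good = raw.decode("utf-8")   # decoded with the encoding the file was saved in
bad = raw.decode("cp1252")   # decoded with a mismatched encoding

print(good)
print(bad)
```

If the bytes are read with a different encoding than the one they were saved in, each non-ASCII character typically shows up as two unrelated characters, which is the usual source of "weird characters".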

Esailija
  • my problem is this: can you open http://renklitablo.com ? When you view the source of the frontpage, for example you'll see this '&#220;ye Ol'. But I want that to be 'Üye Ol' – Halo Dec 12 '12 at 10:58
  • And I checked, they are saved in UTF-8. checked it with notepad – Halo Dec 12 '12 at 10:59
  • The problem isn't with the html files I think. If I manually write 'Ü' inside my .cshtml file, it is displayed correctly. When the html response is returned from the server, that's when the chars are incorrectly encoded. – Halo Dec 12 '12 at 11:07
  • @Halo what do you mean? The physical encoding is always the same. Do you see `"&#220;"`? – Esailija Dec 12 '12 at 11:21
  • no, physical encoding is alright. For example, I use UrlHelper.Action and the link that it produces is seen as '/Tablo/Listele?temaId=9&amp;page=0' in the source. Instead it should be '/Tablo/Listele?temaId=9&page=0'. I'm worried Googlebot cannot access my links because of that – Halo Dec 12 '12 at 11:30
  • @Halo that is valid if it's in html, as in `<a href="/Tablo/Listele?temaId=9&amp;page=0">`. ... – Esailija Dec 12 '12 at 11:33
  • so you say "/Tablo/Listele?temaId=9&amp;page=0"> is valid. google crawl errors repeatedly tell me that it tried to go to "/Tablo/Listele?temaId=9" and got a server error. So I figured, it couldn't read the rest of the link because of that 'amp;' thing. Because I don't have an anchor that goes to "/Tablo/Listele?temaId=9" in my site – Halo Dec 12 '12 at 11:37
  • i see that, it's Google that's giving me headaches. I believe Googlebot should be able to retrieve what every browser in the world successfully does. Those server errors it gives are really disturbing – Halo Dec 12 '12 at 11:47
  • @Halo ok but why did you suddenly derail this to be about SEO? Do you still have any problems regarding your original problem? – Esailija Dec 12 '12 at 11:50
  • I do, but I thought these two originated from the same reason. They do look similar, weird characters coming out from the source. It's really about SEO, in fact. – Halo Dec 12 '12 at 12:00
  • Btw, I tried typing "/Tablo/Listele?temaId=9&page=0" in Fetch as Google, and it didn't fetch; gave an error. This problem, from the looks of it, should belong to a separate question then, I guess – Halo Dec 12 '12 at 12:02
  • @Halo the link doesn't really link to `"/Tablo/Listele?temaId=9&amp;page=0"`, it links to `"/Tablo/Listele?temaId=9&page=0"`. It's in html source, where `&amp;` is turned into `&` before anything. Look at the jsfiddle I linked earlier, the link is linking to `&` even though the html source has `&amp;`... `&` has a special meaning in HTML, so it can be escaped with `&amp;` to mean a literal `&`. – Esailija Dec 12 '12 at 12:04
  • I understand that, then it's become a Googlebot problem for me, then. I think I can live with seeing 'ö's as '&#246;' in the source – Halo Dec 12 '12 at 12:14
  • @Halo why not write `ö` literally in your source? – Esailija Dec 12 '12 at 12:17
  • If I write it literally, it works; I see it as 'ö'. The problematic parts are the ones I return from controller actions. If you look inside the source of my frontpage, you'll see that in some parts ç and ö are displayed correctly. Those are the ones I literally wrote in the .cshtml file. But the keywords for example, they come from the server action, and they look weird – Halo Dec 12 '12 at 12:44
  • @Halo I see `keywords="tablo, resim, tuval, baskı, kanvas, canvas, ressam, sanat, galeri, dekoratif"` – Esailija Dec 12 '12 at 12:46
  • sorry: look here http://renklitablo.com/Tablo/Temalar?type=TuvalBaski **sanatçı** – Halo Dec 12 '12 at 12:46
  • @Halo ok, but as long as you don't have to manually write the `"&#xx;"` stuff, it doesn't matter. Any search engine will parse the html, and it's decoded into `ç` again. – Esailija Dec 12 '12 at 12:56
  • ok, thanks. do you have an idea why Google is warning me that, for example http://renklitablo.com/Tablo/Listele?temaId=3 gives a server error and it should be fixed? It gives the error because it needs an additional page parameter, but I don't address to that particular link from anywhere in my site and I don't know how Google decides to go crawl them. Would you suggest I allow null values for action parameters and try to redirect to somewhere in case of a link with a missing parameter, although I don't link to that ever? – Halo Dec 12 '12 at 13:11
  • @Halo yes it's a huge mystery for me as well just how does google get some links into their index. But it is definitely not the `&` as we have already discussed. – Esailija Dec 12 '12 at 13:16
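The escaping discussed in the comments above can be checked with a short Python sketch using the standard `html` module. The `HrefCollector` class and the sample markup are hypothetical, written only to show that an HTML parser turns `&amp;` back into `&` and numeric character references back into the literal characters before the link or text is used.

```python
from html import escape, unescape
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects href values of <a> tags; the parser unescapes attribute values."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(value for name, value in attrs if name == "href")


# Markup as it might appear in "view source" (sample only).
source = '<a href="/Tablo/Listele?temaId=9&amp;page=0">&#220;ye Ol</a>'

collector = HrefCollector()
collector.feed(source)

link = collector.hrefs[0]              # URL after the parser has decoded &amp;
label = unescape("&#220;ye Ol")        # numeric reference decoded to the character
escaped = escape("/Tablo/Listele?temaId=9&page=0")  # what an HTML encoder emits

print(link)
print(label)
print(escaped)
```

A crawler that parses HTML should see the same decoded URL and text as a browser does, so the escaped form in the source is not by itself an error.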