4

We're creating sitemap XML files and pointing to them in Google Webmaster Tools, which sporadically reports the following error for some of the files:

Compression error

The "How to fix" in Google's documentation doesn't really give any hints at what could be wrong.

The file is generated in .NET and compressed with System.IO.Compression.GZipStream, following MSDN's recommended usage.

It does work when we open the file in 7-zip and just re-save the file, without any changes.

Any hints?

John Conde
Seb Nilsson
  • I would binary-compare the original and the 7-zip resaved file. If they differ, that could give a hint. If they do not, then the error is probably on Google's side. – Petr Abdulin Feb 17 '14 at 10:05
  • They differ. Doesn't get me any closer. Only means 7-zip has some difference in its implementation of GZip. – Seb Nilsson Feb 17 '14 at 10:31
  • Oh, well, that was a pretty stupid suggestion, since they would differ significantly because of the different compression level. Could you possibly post a sample file that fails? – Petr Abdulin Feb 18 '14 at 03:23
  • The difference, in the file I randomly tested, was only a change from 51k to 56k. I'll try to post a sample file, but I'm not sure how that would help. It's not corrupt, since 7-zip can open and read this file. Maybe you could create an SO answer here with your thought-out steps for debugging this problem? – Seb Nilsson Feb 18 '14 at 08:05
  • Why are you compressing it? If your sitemap is hosted on IIS, IIS will send it with GZip correctly along with correct headers. – Akash Kava Feb 23 '14 at 10:49

3 Answers

1

OK, here are my thoughts on the problem. It seems that System.IO.Compression.GZipStream produces a file that is not corrupt but still has minor issues that Google doesn't like.

A straightforward solution-and-check would be to switch to another compression library and see if that helps.
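As a quick way to try that experiment without touching the .NET code, here is a minimal sketch in Python that decompresses an existing file and compresses it again with Python's zlib-backed gzip module. The function name `recompress` is just an illustration, and this only tests whether a different encoder's output is accepted; it does not explain why the original was rejected.

```python
import gzip


def recompress(data: bytes, level: int = 9) -> bytes:
    """Decompress a gzip payload and compress it again with Python's gzip module."""
    # gzip.decompress raises an exception if the stream is truncated
    # or fails its CRC check, so a silent pass means the input was readable.
    xml = gzip.decompress(data)
    # mtime=0 keeps the output reproducible between runs.
    return gzip.compress(xml, compresslevel=level, mtime=0)
```

You would read the failing .gz file as bytes, pass it through `recompress`, write the result to a new file, and submit that one to see whether the error goes away.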

A slightly more complicated solution would be to check the file strictly against the GZIP file format spec. Specifically, I would compare the headers of the two files (the original and the 7-zip one). This way you could possibly find what is wrong with the file and, possibly, fix it.
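The header comparison could look roughly like the sketch below, written in Python against the field layout in RFC 1952. The function names (`read_gzip_header`, `diff_headers`, `check_trailer`) are my own, and `check_trailer` assumes a single-member gzip file with no trailing bytes. It only shows where two files differ; which difference (if any) Google objects to is still something you would have to find out by experiment.

```python
import gzip
import struct
import zlib


def read_gzip_header(data: bytes) -> dict:
    """Parse the fixed 10-byte gzip header (RFC 1952) and its optional fields."""
    if len(data) < 10:
        raise ValueError("too short to be a gzip file")
    id1, id2, method, flags, mtime, xfl, os_code = struct.unpack("<BBBBIBB", data[:10])
    if (id1, id2) != (0x1F, 0x8B):
        raise ValueError("missing gzip magic bytes")
    header = {"method": method, "flags": flags, "mtime": mtime, "xfl": xfl, "os": os_code}
    pos = 10
    if flags & 0x04:  # FEXTRA: 2-byte length followed by that many bytes
        (xlen,) = struct.unpack("<H", data[pos:pos + 2])
        header["extra"] = data[pos + 2:pos + 2 + xlen]
        pos += 2 + xlen
    if flags & 0x08:  # FNAME: zero-terminated original file name
        end = data.index(b"\x00", pos)
        header["name"] = data[pos:end].decode("latin-1")
        pos = end + 1
    if flags & 0x10:  # FCOMMENT: zero-terminated comment
        end = data.index(b"\x00", pos)
        header["comment"] = data[pos:end].decode("latin-1")
        pos = end + 1
    if flags & 0x02:  # FHCRC: 2-byte header checksum
        pos += 2
    header["header_length"] = pos
    return header


def diff_headers(a: bytes, b: bytes) -> dict:
    """Return only the header fields whose values differ between two files."""
    ha, hb = read_gzip_header(a), read_gzip_header(b)
    keys = sorted(set(ha) | set(hb))
    return {k: (ha.get(k), hb.get(k)) for k in keys if ha.get(k) != hb.get(k)}


def check_trailer(data: bytes) -> bool:
    """Compare the stored CRC32 and size (last 8 bytes) with the decompressed data."""
    stored_crc, stored_size = struct.unpack("<II", data[-8:])
    payload = gzip.decompress(data)  # raises on truncated or corrupt input
    return stored_crc == zlib.crc32(payload) and stored_size == len(payload) % 2**32
```

Read both files in binary mode and call `diff_headers(original_bytes, resaved_bytes)`; fields such as `flags`, `name`, `mtime` and `os` are the ones I would expect to differ between encoders.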

Community
Petr Abdulin
0

+1 for @Akash's answer. I've had trouble in IIS (especially IIS 6) when attempting to access compressed content. Let IIS do the compression, just put the uncompressed xml file in a convenient location.

robrich
0

If you have access to the .htaccess file, you can edit it so that all your files are cached and compressed automatically. Put the following in the .htaccess file in your website's root directory:

## EXPIRES CACHING ##
<IfModule mod_expires.c>
ExpiresActive On
ExpiresByType image/jpg "access 1 year"
ExpiresByType image/jpeg "access 1 year"
ExpiresByType image/gif "access 1 year"
ExpiresByType image/png "access 1 year"
ExpiresByType text/css "access 1 month"
ExpiresByType text/html "access 1 month"
ExpiresByType application/pdf "access 1 month"
ExpiresByType text/x-javascript "access 1 month"
ExpiresByType application/x-shockwave-flash "access 1 month"
ExpiresByType image/x-icon "access 1 year"
ExpiresDefault "access 1 month"
</IfModule>


<IfModule mod_headers.c>
  <FilesMatch "\.(js|css|xml|gz)$">
    Header append Vary: Accept-Encoding
  </FilesMatch>
</IfModule>

# compress text, html, javascript, css, xml:
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript

You can check whether your sitemap is being compressed with a tool such as woorank.com, which reports whether your site takes advantage of Gzip. With the configuration above, compression is applied automatically when Google crawls your sitemap or any other file type covered by these rules.

Vikram Parmar