
Often I need to download a webpage and then edit it offline. I have tried a few tools and the main feature they lack is downloading images referenced in the CSS files.

Is there a tool (for Linux) that will download everything so that the webpage will render the same offline (excluding AJAX)?

Philip Kirkbride
hoju
  • This worked for me the best: `wget --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --no-parent http://example.com/` – Rehmat May 06 '16 at 05:50
  • `--html-extension` is deprecated as of v1.12. I recommend this: `wget -U "Opera 11.0" --page-requisites --content-on-error --no-clobber --convert-links --restrict-file-names=windows --no-parent "http://stackoverflow.com"` It's very important to put the URL in double quotes; otherwise, it will get stuck on `Redirecting output to ‘wget-log’.`. – Shayan Sep 04 '19 at 17:35
  • Related: https://superuser.com/questions/55040/save-a-single-web-page-with-background-images-with-wget/136335#136335 – Ciro Santilli OurBigBook.com Sep 18 '19 at 15:13

7 Answers

wget --page-requisites http://example.com/your/page.html

This option causes Wget to download all the files that are necessary to properly display a given HTML page. This includes such things as inlined images, sounds, and referenced stylesheets.

EDIT: meder is right: stock wget does not parse and download CSS images. There is, however, a patch that adds this feature: [1, 2]

UPDATE: The patch mentioned above has been merged into wget 1.12, released 22-Sep-2009:

** Added support for CSS. This includes:
 - Parsing links from CSS files, and from CSS content found in HTML
   style tags and attributes.
 - Supporting conversion of links found within CSS content, when
   --convert-links is specified.
 - Ensuring that CSS files end in the ".css" filename extension,
   when --convert-links is specified.
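
On 1.12 or later, combining the options discussed here and in the comments should give a self-contained offline copy, CSS-referenced images included. A sketch of one such invocation (the URL is a placeholder, and `--adjust-extension` is the 1.12+ replacement for the deprecated `--html-extension`):

wget --page-requisites --convert-links --adjust-extension --no-parent "http://example.com/your/page.html"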
ax.
  • As far as I know, this won't download images referenced in CSS files, which is what the OP intended. I think you would have to write a script that parses the CSS files, or find one someone's made (see the sketch after these comments); I'm curious about this too, though. – meder omuraliev Oct 17 '09 at 06:35
  • You should download the whole images directory recursively – OscarRyz Oct 17 '09 at 08:19
  • Seems that patch has been around since '07, and still not integrated... – hoju Oct 18 '09 at 23:17
  • On current Ubuntu versions it works fine now. (Just for everyone else who finds this post when searching with Google...) – TheHippo Jun 14 '11 at 13:56
  • It seems wget 1.13.4 still has trouble finding CSS files linked by using the `@import` syntax. – Flimm Sep 22 '12 at 10:33
  • wget doesn't download CSS content from style tags (version 1.14 on Ubuntu). – Dmitrii Mikhailov Nov 18 '13 at 12:04
  • wget works like a web sniffer (http://web-sniffer.net/); it does not download all of the images, CSS, and JavaScript. – Abhijit Jagtap Sep 16 '16 at 12:06
  • I tried this for https://hea-www.harvard.edu/~fine/Tech/vi.html but it couldn't download the images. – alhelal Feb 12 '18 at 02:48
  • @ax. Can we accomplish the same thing using `curl`? What's the option for it? – Shayan Sep 03 '19 at 23:46
  • @Shayan No - `curl` can NOT download whole web pages, because it cannot parse HTML: https://ec.haxx.se/usingcurl-downloads.html#client-differences – ax. Sep 04 '19 at 12:35
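
Following up on meder's comment above about scripting this yourself: below is a minimal sketch of such a CSS-parsing downloader in Python 3 (standard library only). The stylesheet URL and the regex are illustrative assumptions, not part of any tool mentioned here; it fetches one CSS file, finds `url(...)` references, and downloads each referenced asset into the current directory.

    import os
    import re
    import urllib.parse
    import urllib.request

    # Illustrative placeholder; point this at the stylesheet you want to scan.
    css_url = "http://example.com/styles/main.css"
    css = urllib.request.urlopen(css_url).read().decode("utf-8", errors="replace")

    # Match url("..."), url('...'), and bare url(...) references in the CSS.
    for ref in re.findall(r'url\(\s*[\'"]?([^\'")]+)[\'"]?\s*\)', css):
        ref = ref.strip()
        if ref.startswith("data:"):
            continue  # inline data: URIs need no download
        asset_url = urllib.parse.urljoin(css_url, ref)  # resolve relative paths
        name = os.path.basename(urllib.parse.urlparse(asset_url).path) or "asset"
        print("downloading", asset_url, "->", name)
        urllib.request.urlretrieve(asset_url, name)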

It's possible to do this through Firefox; see this forum thread:

  1. Right click
  2. View page info
  3. Select media tab
  4. Highlight all files
  5. Save as

Reference - http://www.webdeveloper.com/forum/showthread.php?t=212610

Jonathan
  • This does not help when it comes to saving CSS or JS files – LiveSource Jan 08 '13 at 10:27
  • Doesn't get CSS, which was specified by the OP. It's a cool trick/process though. Would not have thought of it myself. Thanks for posting. – BishopZ May 01 '13 at 18:54
  • It worked for me, saved all the PNGs used via CSS, great thanks. – user9349193413 Jul 25 '13 at 09:09
  • It does download images referenced in CSS files. So if it's only about images and other media, this will do. – SPRBRN Nov 01 '13 at 11:51
  • This worked great for me, and did not require me to grab any new tools. – Dan Jun 05 '14 at 18:50
  • This has the major downside that it doesn't download them as a webpage: it's just a chaotic mess of files. But if that's what you want, then it works. – Graham Jan 08 '19 at 02:19

I ran into the same problem the other day while working for a client. Another tool that works really well is HTTrack. The software is available in a command-line version for both Windows and Linux. For Linux there are prebuilt packages for most of the more common distributions, found here

For my purposes it worked better than wget, with some added features/switches that fix links inside the HTML file.
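
For anyone who wants the command-line version, a minimal sketch of an invocation (the URL and output directory are placeholders; check `httrack --help` for the full option list):

httrack "http://example.com/" -O ./example-mirror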

Everette Mills

wget is a great choice for you. Just for more information: on Windows, there is at this time no official GNU release of wget version 1.12; the current official version is 1.11.

wget version 1.11 cannot download images/fonts referenced in CSS files. Fortunately, you can find a build of 1.14 on the page below, which fixes these problems:

http://opensourcepack.blogspot.com/2010/05/wget-112-for-windows.html

Tran Dang Khoa

The current version of Opera (12) allows you to save a page as 'HTML with images'.

In doing so, Opera also downloads images that are referenced in the CSS files and adapts the image URLs in the CSS accordingly.

Marco

In Firefox:

File->Save Page As->Web Page, Complete

Saves all JavaScript, images, and CSS. Nothing else required :)

LiveSource
  • Unfortunately, this method won't download images referenced within CSS files (in the currently latest FF 21 and lower). – sgnsajgon Jun 17 '13 at 22:32
wget 
Esteban Küber
OscarRyz