32

A forum I frequent was down today, and upon restoration, I discovered that the last two days of forum posting had been rolled back completely.

Needless to say, I'd like to get back what data I can from the forum loss, and I am hoping I have at least some of it stored in the cache files that Chrome created.

I face two problems. First, the cache files have no file type, and I'm unsure how to read them in an intelligent manner (trying to open them in Chrome itself seems to "redownload" them in a .gz format). Second, there are a ton of cache files.

Any suggestions on how to read and sort these files? (A simple string search should fit my needs)

Raven Dreamer

12 Answers

32

EDIT: The answer below no longer works; see here.


In Chrome or Opera, open a new tab and navigate to chrome://view-http-cache/

Click on whichever file you want to view. You should then see a page with a bunch of text and numbers. Copy all the text on that page and paste it into the text box on the Senseful Solutions page (http://www.sensefulsolutions.com/2012/01/viewing-chrome-cache-easy-way.html).

Press "Go". The cached data will appear in the Results section of that page.

aug
A. Zalonis
  • 1
    The file you receive might be an unreadable dump. Send the file through this php script to extract the contents: http://www.sensefulsolutions.com/2012/01/viewing-chrome-cache-easy-way.html – Druska Nov 23 '13 at 23:43
  • 5
    you didn't even mention you are using sensefulsolutions page. – zinking Dec 15 '13 at 13:47
  • 2
    Or just copy the hexdump for a file to the clipboard and then run `pbpaste | xxd -r -p > file.ext`, replacing `pbpaste` with your operating system’s equivalent for this OS X utility. – Mathias Bynens Sep 24 '14 at 22:09
  • 6
    This will not work anymore, because `chrome://view-http-cache` has been removed from recent Chrome versions. For more details see [this](https://superuser.com/questions/1316540/where-has-chrome-cache-been-moved-to). – Slava Bacherikov Aug 17 '18 at 18:05
27

Try Chrome Cache View from NirSoft (free).

yakatz
  • My antivirus program (Trend Micro) is shooting me warnings about that page -- can you validate its safe-ness? – Raven Dreamer May 26 '11 at 04:16
  • 2
    @Raven, I don't know the guy personally, but I have used many of his programs. What specifically does your antivirus say? The same site has what some people call hacking tools (i.e. password recovery) – yakatz May 26 '11 at 04:22
  • @Yakatz - nothing. It won't let me access the site at all because it's "a potential security risk". Guess I'll just have to disable it then. – Raven Dreamer May 26 '11 at 04:33
  • "The latest tests indicate that this site contains malicious software or could defraud visitors." – Raven Dreamer May 26 '11 at 04:35
  • 2
    @Raven, I don't see ratings like that about this site on other sites: http://www.mywot.com/en/scorecard/nirsoft.net. Google SafeBrowsing (http://www.google.com/safebrowsing/diagnostic?site=nirsoft.net) says the site has trojans on it, but those are likely false positives (since many security tools show up as trojans). There are no drive-by downloads, so you are safe anyway. I am sure the site is fine. As I said, I use his tools all the time. – yakatz May 26 '11 at 04:52
  • 1
    Unfortunately, Trend Microscan makes it impossible to override or temporarily turn itself off. Thankfully, I have two computers, and a flash drive. – Raven Dreamer May 26 '11 at 06:56
  • 3
    And also works on Mac under Wine. The folder for the main Chrome profile will be something like `H:\Library\Caches\Google\Chrome\Default\Cache` assuming that `H:` is mapped to your home folder. – ccpizza Sep 13 '17 at 00:58
  • Isn't there a built-in browser tool to use? – jengeb Jan 29 '19 at 20:39
  • ChromeCacheView worked for me on macOS Catalina using [PlayOnMac](https://www.playonmac.com). For details and updates see [here](https://apple.stackexchange.com/questions/373851/how-to-get-wine-working-on-catalina). – Stefan Schmidt Mar 20 '23 at 01:58
9

EDIT: The answer below no longer works; see here.


Chrome's cache viewer shows each cached file as a hex dump. OS X comes with xxd installed, which is a command-line tool for converting hex dumps. I managed to recover a JPG from Chrome's HTTP cache on OS X using these steps:

  1. Go to chrome://cache
  2. Find the file you want to recover and click on its link.
  3. Copy the 4th section to your clipboard. This is the content of the file.
  4. Follow the steps on this gist to pipe your clipboard into the python script which in turn pipes to xxd to rebuild the file from the hex dump: https://gist.github.com/andychase/6513075

Your final command should look like:

pbpaste | python chrome_xxd.py | xxd -r - image.jpg

If you're unsure which section of Chrome's cache output is the content hex dump, this image is a good guide: http://www.sparxeng.com/blog/wp-content/uploads/2013/03/chrome_cache_html_report.png

Image source: http://www.sparxeng.com/blog/software/recovering-images-from-google-chrome-browser-cache

More info on XXD: http://linuxcommand.org/man_pages/xxd1.html

Thanks to Mathias Bynens above for sending me in the right direction.

slm
k0nG
  • 2
    `chrome://cache` was removed in later versions of Chrome so this will no longer work. – slm Apr 24 '19 at 02:50
9

EDIT: The answer below no longer works; see here.


If the file you are trying to recover has Content-Encoding: gzip in the header section, and you are using Linux (or, as in my case, you have Cygwin installed), you can do the following:

  1. visit chrome://view-http-cache/ and click the page you want to recover
  2. copy the last (fourth) section of the page verbatim to a text file (say: a.txt)
  3. xxd -r a.txt | gzip -d

Note that other answers suggest passing the -p option to xxd. I had trouble with that, presumably because the fourth section of the cache page is not in the "postscript plain hexdump style" but in the default style.

It also does not seem necessary to replace double spaces with a single space, as chrome_xxd.py does (if it is necessary, you can use sed 's/  / /g' for that).
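
For anyone without xxd at hand, the two steps above (reverse the hex dump, then gunzip) can be roughly sketched in Python. This is only an illustration under assumptions: the function names are made up here, and it assumes each dump line looks like `offset: hex bytes  ASCII column`, with at least two spaces before the ASCII column, as in xxd's default style.

```python
import gzip
import re


def hexdump_to_bytes(dump: str) -> bytes:
    """Rebuild raw bytes from an xxd-style dump (assumed line layout:
    'offset: hex bytes  ascii')."""
    out = bytearray()
    for line in dump.splitlines():
        if ":" not in line:
            continue  # not a dump line
        _, rest = line.split(":", 1)
        # Keep only the hex column; the ASCII column follows 2+ spaces.
        hex_part = re.split(r"\s{2,}", rest.strip(), maxsplit=1)[0]
        try:
            out += bytes.fromhex(hex_part)
        except ValueError:
            continue  # skip lines that are not hex (e.g. stray header text)
    return bytes(out)


def recover_gzip_from_hexdump(dump: str) -> bytes:
    """Equivalent in spirit to `xxd -r a.txt | gzip -d`."""
    return gzip.decompress(hexdump_to_bytes(dump))
```

To use it, read the saved text file (a.txt in the steps above) as a string, pass it to `recover_gzip_from_hexdump`, and write the returned bytes to a file opened in binary mode.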

slm
qbolec
  • Worked great for me, none of the other methods did, thanks! – Mahn Aug 15 '16 at 15:59
  • Without even having to save to a file: Select the part below the header, and use `xsel | xxd -r | zcat | less` (omit `| less` if you don't want a pager). – Rob W Aug 29 '17 at 21:27
  • 1
    `chrome://view-http-cache/` was removed in newer versions of Chrome. – slm Apr 24 '19 at 02:51
8

Note: The flag show-saved-copy has been removed and the below answer will not work


You can read cached files using Chrome alone.

Chrome has a feature called Show Saved Copy Button:

Show Saved Copy Button Mac, Windows, Linux, Chrome OS, Android

When a page fails to load, if a stale copy of the page exists in the browser cache, a button will be presented to allow the user to load that stale copy. The primary enabling choice puts the button in the most salient position on the error page; the secondary enabling choice puts it secondary to the reload button. #show-saved-copy

First, disconnect from the Internet to make sure that the browser doesn't overwrite the cache entry. Then navigate to chrome://flags/#show-saved-copy and set the flag value to Enable: Primary. After you restart the browser, the Show Saved Copy button will be enabled. Now enter the cached file's URI into the browser's address bar and hit Enter. Chrome will display the "There is no Internet connection" page along with a Show saved copy button.

After you hit the button, the browser will display the cached file.

kynan
Leonid Vasilev
6

The Google Chrome cache directory $HOME/.cache/google-chrome/Default/Cache on Linux contains one file per cache entry named <16 char hex>_0 in "simple entry format":

  • 20 Byte SimpleFileHeader
  • key (i.e. the URI)
  • payload (the raw file content i.e. the PDF in our case)
  • SimpleFileEOF record
  • HTTP headers
  • SHA256 of the key (optional)
  • SimpleFileEOF record

If you know the URI of the file you're looking for, it should be easy to find. If not, a substring such as the domain name should help narrow it down. Search for the URI in your cache like this:

fgrep -Rl '<URI>' $HOME/.cache/google-chrome/Default/Cache

Note: If you're not using the default Chrome profile, replace Default with the profile name, e.g. Profile 1.
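
If fgrep is not available (for example on Windows), the same search could be sketched in Python. This is a minimal sketch, not part of the original answer: the function name `find_cache_files` is made up for illustration, and it simply checks whether the raw bytes of each file contain the search string, which is what the question asks for.

```python
from pathlib import Path


def find_cache_files(cache_dir, needle: bytes):
    """Yield paths of files under cache_dir whose raw bytes contain needle."""
    for path in sorted(Path(cache_dir).rglob("*")):
        if not path.is_file():
            continue
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable or locked file; skip it
        if needle in data:
            yield path
```

For example, `list(find_cache_files(Path.home() / ".cache/google-chrome/Default/Cache", b"example.com"))` would list every cache entry mentioning that (placeholder) domain. Note that a compressed payload will not match on its text content, only on the uncompressed key and headers.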

kynan
5

I've made a short, crude script that extracts JPG and PNG files:

#!/usr/bin/php
<?php
$dir = "/home/user/.cache/chromium/Default/Cache/"; // Chrome or Chromium cache folder.
$ppl = "/home/user/Desktop/temporary/";             // Place for extracted files.

// Cut the payload out of a cache entry: from $start up to the cached HTTP
// headers, or to the end of the file if no header marker is found.
// A few trailing bytes of cache metadata may remain; image viewers usually ignore them.
function extract_payload($cont, $start)
{
    $start = max($start, 0);
    $end = strpos($cont, "HTTP/1.1 200 OK", $start);
    // substr() takes a length, not an end offset.
    return ($end === false) ? substr($cont, $start) : substr($cont, $start, $end - $start);
}

foreach (scandir($dir) as $filename) {
    if (!is_file($dir . $filename)) {
        continue;
    }
    $cont = file_get_contents($dir . $filename);
    if (($pos = strpos($cont, "JFIF")) !== false) {
        // "JFIF" sits 6 bytes into a JPEG file.
        $type = "JPEG"; $ext = ".jpg"; $start = $pos - 6;
    } elseif (($pos = strpos($cont, "\211PNG")) !== false) {
        // The PNG signature starts with byte 0x89 followed by "PNG".
        $type = "PNG"; $ext = ".png"; $start = $pos;
    } else {
        echo $filename . "  UNKNOWN \n";
        continue;
    }
    echo $filename . "  " . $type . " \n";
    $wholename = $ppl . $filename . $ext;
    file_put_contents($wholename, extract_payload($cont, $start));
    echo "Saving :" . $wholename . " \n";
}
?>
4

I had some luck with this open-source Python project, seemingly inactive: https://github.com/JRBANCEL/Chromagnon

I ran:

python2 Chromagnon/chromagnonCache.py path/to/Chrome/Cache -o browsable_cache/

And I got a locally browsable extract of the cache for all my open tabs.

Lucas Cimon
2

It was removed on purpose and it won't be coming back.

Both chrome://cache and chrome://view-http-cache have been removed starting with Chrome 66. They work in version 65.

Workaround

You can check chrome://chrome-urls/ for a complete list of internal Chrome URLs.

The only workaround that comes to mind is to open Menu > More tools > Developer tools and select the Network tab.

The reason why it was removed is this bug:

The discussion:

slm
Shabeer K
0

The JPEXS Free Flash Decompiler has Java code to do this in its source tree for both Chrome and Firefox (no support for Firefox's more recent cache2 though).

hemisphire
0

EDIT: The answer below no longer works; see here.


Google Chrome cache file format description.

To list the cache files, open one of these URLs (copy and paste into your browser's address bar):

  • chrome://cache/
  • chrome://view-http-cache/

Cache folder on Linux: ~/.cache/google-chrome/Default/Cache

To check whether a cache file contains gzip-encoded data, look for the gzip magic bytes (1f 8b 08):

$ head f84358af102b1064_0 | hexdump -C | grep --before-context=100 --after-context=5 "1f 8b 08"

Extract the contents of a Chrome cache file with a PHP one-liner (skipping the gzip header and the CRC32 and ISIZE blocks):

$ php -r "echo gzinflate(substr(strchr(file_get_contents('f84358af102b1064_0'), \"\x1f\x8b\x08\"), 10, -8));"
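
A rough Python equivalent of the one-liner above, offered as a sketch rather than as the author's method: the helper name `extract_gzip_payload` is made up for illustration, and it assumes the first occurrence of the bytes 1f 8b 08 marks the start of the gzip stream (which can be wrong if those bytes happen to appear earlier in the file). Instead of cutting fixed offsets, it lets zlib parse the gzip header and stop at the end of the stream, so whatever the cache stores after the payload is ignored.

```python
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"


def extract_gzip_payload(raw: bytes) -> bytes:
    """Decompress the first gzip stream found in raw cache-file bytes."""
    start = raw.find(GZIP_MAGIC)
    if start == -1:
        raise ValueError("no gzip stream found")
    # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(zlib.MAX_WBITS | 16)
    # Bytes after the end of the gzip stream end up in unused_data.
    return decompressor.decompress(raw[start:])
```

Usage would be along the lines of `extract_gzip_payload(open('f84358af102b1064_0', 'rb').read())`, writing the result to a file opened in binary mode.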
slm
Rinat
0

Note: The below answer is out of date since the Chrome disk cache format has changed.


Joachim Metz provides some documentation of the Chrome cache file format with references to further information.

For my use case, I only needed a list of cached URLs and their respective timestamps. I wrote a Python script to get these by parsing the data_* files under C:\Users\me\AppData\Local\Google\Chrome\User Data\Default\Cache\:

import datetime

# Chrome timestamps count microseconds since 1601-01-01.
jan1601 = datetime.datetime(1601, 1, 1, 0, 0, 0)

with open('data_1', 'rb') as datafile:
    data = datafile.read()

ptr = data.find(b'http')
while ptr != -1:
    # Found the string 'http'. Hopefully this is a Cache Entry
    endUrl = data.find(b'\x00', ptr)
    if endUrl == -1:
        break  # no terminating NUL byte, so no complete URL remains
    try:
        url = data[ptr : endUrl].decode('utf-8')

        # Extract the corresponding timestamp (8 bytes, 72 bytes before the URL)
        timeBytes = data[ptr - 72 : ptr - 64]
        timeInt = int.from_bytes(timeBytes, byteorder='little')
        timeStamp = jan1601 + datetime.timedelta(microseconds=timeInt)
        print('{} {}'.format(str(timeStamp)[:19], url))
    except (UnicodeDecodeError, OverflowError):
        pass  # not a valid URL or timestamp; keep scanning
    ptr = data.find(b'http', ptr + 1)
kynan
krubo