
I'm trying to decode the jsonlz4 file that contains my saved tabs from Firefox (saved in %appdata%\Mozilla\Firefox\Profiles\xxxxxx.default-release\sessionstore-backups\recovery.jsonlz4), in the hope of eventually parsing the JSON and extracting the URLs of my tabs, and maybe other data from my session.

I was hoping that the lz4-pure-java library could decompress it to JSON.

I'm trying to use the "method 2" example from the lz4-java GitHub page, and a comment here says that the file should be standard lz4 if we skip the 12-byte header.

Here's my code:

    package com.jsonparser;

    import net.jpountz.lz4.LZ4Factory;
    import net.jpountz.lz4.LZ4SafeDecompressor;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.Arrays;

    public class jsonLZ4 {
        public static void main(String[] args) throws IOException {

            String infile = "C:\\Users\\username\\AppData\\Roaming\\Mozilla\\Firefox\\Profiles\\xxxxxx.default-release\\sessionstore-backups\\recovery.jsonlz4";

            byte[] datain = Files.readAllBytes(Paths.get(infile));

            // need to skip the first 12 bytes for Firefox format

            byte[] data = Arrays.copyOfRange(datain,12,datain.length);

            LZ4Factory factory = LZ4Factory.fastestInstance();

            byte[] compressed = data;
            int compressedLength = compressed.length;
            byte[] restored = new byte[compressed.length*2]; // not sure how to set length properly without knowing decompressed size

            // - method 2: when the compressed length is known (a little slower)
            // the destination buffer needs to be over-sized

            LZ4SafeDecompressor decompressor2 = factory.safeDecompressor();
            int decompressedLength2 = decompressor2.decompress(compressed, 0, compressedLength, restored, 0);

            String s = new String(restored, StandardCharsets.UTF_8);
            System.out.println("decompressed data is: " + s);
        }
    }

Unfortunately I'm getting a decompression error:

    Exception in thread "main" net.jpountz.lz4.LZ4Exception: Malformed input at 1803197
        at net.jpountz.lz4.LZ4JavaUnsafeSafeDecompressor.decompress(LZ4JavaUnsafeSafeDecompressor.java:62)
        at net.jpountz.lz4.LZ4SafeDecompressor.decompress(LZ4SafeDecompressor.java:77)
        at com.jsonparser.jsonLZ4.main(jsonLZ4.java:31)

    Process finished with exit code 1

Does anyone know how I can successfully decompress this file, preferably using only Java?

Thanks.

localhost

2 Answers


The jsonlz4 format is its own format. It's based on lz4, but adds its own header logic. Consequently, it's not decodable by "normal" lz4 decoders.

If you are looking for ready-to-use decoder source code, the lz4 home page lists one project dedicated to this goal (though it's written in C):

jsonlz4 decoder, custom Mozilla LZ4 format, by Avi Halachmi : https://github.com/avih/dejsonlz4

For a different implementation, for example in Java, you'll be on your own. Either write your own header decoder after reading the format's documentation and then pass the remaining payload to LZ4, or use JNI to invoke the C functions from Java.
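
The "header decoder" step could look roughly like the sketch below. It assumes the layout quoted in the comments under this answer (an 8-byte magic number followed by a 4-byte little-endian uncompressed size) and additionally assumes that the magic bytes are `mozLz40` plus a NUL byte, which this thread does not state. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of lz4-java or Firefox.
class MozLz4Header {
    // Assumption: the 8-byte magic is "mozLz40" followed by a NUL byte.
    static final byte[] MAGIC = "mozLz40\0".getBytes(StandardCharsets.US_ASCII);
    // 8-byte magic + 4-byte uncompressed size; the LZ4 payload starts here.
    static final int HEADER_LENGTH = MAGIC.length + 4;

    /** Returns the uncompressed size stored after the magic, or throws if the magic does not match. */
    static long uncompressedSize(byte[] file) {
        if (file.length < HEADER_LENGTH
                || !Arrays.equals(Arrays.copyOfRange(file, 0, MAGIC.length), MAGIC)) {
            throw new IllegalArgumentException("not a mozLz4 file");
        }
        // The size field is an unsigned 32-bit little-endian integer.
        int raw = ByteBuffer.wrap(file, MAGIC.length, 4)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getInt();
        return Integer.toUnsignedLong(raw);
    }

    public static void main(String[] args) {
        // Synthetic header: magic, then 10000 as little-endian bytes (0x2710).
        byte[] sample = {'m', 'o', 'z', 'L', 'z', '4', '0', 0, 0x10, 0x27, 0, 0};
        System.out.println(uncompressedSize(sample)); // prints 10000
    }
}
```

With the size known, the destination buffer for the LZ4 call can be allocated exactly instead of guessed. A guess such as `compressed.length*2` may be too small for highly compressible JSON, and an undersized output buffer is one plausible cause of a "Malformed input" error.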

Cyan
  • I'm looking for a java decoder. Are you sure it's different to standard lz4? I've seen multiple people mention that it is standard lz4 except for the extra header, as I mentioned in the question. – localhost Aug 12 '21 at 12:49
  • eg https://www.reddit.com/r/firefox/comments/3offju/jsonlz4_file/cvx4a9n/ - "This file format is in fact just plain LZ4 data with a custom header (magic number [8 bytes] and uncompressed file size [4 bytes, little endian])." – localhost Aug 12 '21 at 12:59
  • We may spell it differently, but I believe we say essentially the same thing. – Cyan Aug 14 '21 at 04:30

The Firefox header is 8 bytes, not 12 bytes.

As said, simply skip the 8-byte header, and decompress the compressed data stream.

A small example in Perl:

    #!/usr/bin/env perl
    use strict;
    use warnings FATAL => "all";
    use autodie;
    
    use English;
    use Fcntl qw( :seek );
    use File::Spec;
    use File::stat;
    use JSON::PP;
    
    use Compress::LZ4;
    
    if (not @ARGV)
    {
        print("$PROGRAM_NAME <file.jsonlz4>\n");
        exit(1);
    }
    my $path = $ARGV[0];
    
    my $st = stat($path);
    
    open(my $fh, '<:raw', $path);
    
    # skip header
    seek($fh, 8, SEEK_SET);
    
    my $leido = read($fh, my $bytes_jsonlz4, $st->size - 8);
    die unless $leido == ($st->size - 8);
    
    close($fh);
    
    my $bytes = lz4_decompress($bytes_jsonlz4);
    
    my $perl_data = JSON::PP->new->utf8()->decode($bytes);
    my $json = JSON::PP->new->utf8->pretty->indent;

    print($json->encode($perl_data));
    
    1;
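
For readers who want the same thing in plain Java, here is a rough sketch with no external library. As far as I understand, `lz4_decompress` in Compress::LZ4 expects a 4-byte little-endian length in front of the raw LZ4 block, which is why skipping only 8 bytes works in the Perl script. In the sketch those 4 bytes are read explicitly and the raw block is taken from offset 12. The class and method names are hypothetical, the decoder follows the LZ4 block format with only minimal validation, and it is meant as an illustration rather than a hardened implementation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical minimal decoder for illustration; not part of any library.
class MozLz4Decoder {
    /** Decodes one raw LZ4 block starting at srcOff into a new array of exactly destLen bytes. */
    static byte[] decodeBlock(byte[] src, int srcOff, int destLen) {
        byte[] dest = new byte[destLen];
        int s = srcOff, d = 0;
        while (s < src.length) {
            int token = src[s++] & 0xFF;

            // Literal run: length in the high nibble, extended while the extra byte is 255.
            int literalLen = token >>> 4;
            if (literalLen == 15) {
                int b;
                do { b = src[s++] & 0xFF; literalLen += b; } while (b == 255);
            }
            System.arraycopy(src, s, dest, d, literalLen);
            s += literalLen;
            d += literalLen;
            if (s >= src.length) break; // the final sequence carries literals only

            // Match: 2-byte little-endian offset back into the output produced so far.
            int offset = (src[s++] & 0xFF) | ((src[s++] & 0xFF) << 8);
            int matchLen = (token & 0x0F) + 4; // minimum match length is 4
            if ((token & 0x0F) == 15) {
                int b;
                do { b = src[s++] & 0xFF; matchLen += b; } while (b == 255);
            }
            if (offset == 0 || offset > d) {
                throw new IllegalArgumentException("bad match offset at input position " + s);
            }
            // Copy byte by byte because source and destination ranges may overlap.
            for (int i = 0; i < matchLen; i++, d++) {
                dest[d] = dest[d - offset];
            }
        }
        if (d != destLen) throw new IllegalArgumentException("decoded size does not match header");
        return dest;
    }

    /** Assumed layout: bytes 0-7 magic, bytes 8-11 uncompressed size (little-endian), then the block. */
    static byte[] decodeMozLz4(byte[] file) {
        int size = (file[8] & 0xFF) | ((file[9] & 0xFF) << 8)
                 | ((file[10] & 0xFF) << 16) | ((file[11] & 0xFF) << 24);
        return decodeBlock(file, 12, size);
    }

    public static void main(String[] args) throws IOException {
        // Usage: java MozLz4Decoder.java <file.jsonlz4>
        byte[] json = decodeMozLz4(Files.readAllBytes(Paths.get(args[0])));
        System.out.println(new String(json, StandardCharsets.UTF_8));
    }
}
```

With the lz4-java library, the equivalent would presumably be to allocate `new byte[size]` from the same 4-byte field and pass the bytes from offset 12 onward to the decompressor.
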
Dharman
  • Have you tried this on a Firefox `.jsonlz4` file, such as `recovery.jsonlz4`? Standard lz4 decoders fail because of the non-standard header. E.g. see https://superuser.com/a/1363751/130337, in particular the last point about lz4 blocks vs frames. I would like to be able to extract using a standard lz4 decompressor in Java. – localhost Feb 22 '22 at 10:20
  • @localhost, yes, I've run the script on my profile's `.jsonlz4` files (recovery.jsonlz4, previous.jsonlz4, and bookmarkbackups). The script calls a function that deals with **raw** LZ4 data. See [link](https://metacpan.org/pod/Compress::LZ4#COMPATIBILITY). Have you tried `byte[] data = Arrays.copyOfRange(datain,8,datain.length);`? – The Linux Kitten Feb 22 '22 at 22:06