
We need to decompress data produced by a Java system using the DEFLATE algorithm. We have no control over that system.

While we don't know the exact variant, we are able to decompress data sent to us using the following Java code:

    public static String inflateBase64(String base64)
    {
        try (Reader reader = new InputStreamReader(
                new InflaterInputStream(
                        new ByteArrayInputStream(
                                Base64.getDecoder().decode(base64)))))
        {
            StringWriter sw = new StringWriter();
            char[] chars = new char[1024];
            for (int len; (len = reader.read(chars)) > 0; )
                sw.write(chars, 0, len);
            return sw.toString();
        }
        catch (IOException e)
        {
            System.err.println(e.getMessage());
            return "";
        }
    }

Unfortunately, our ecosystem is C#-based. At the moment we shell out to the Java program using the Process class, but this is clearly sub-optimal for performance, so we'd like to port the above code to C# if at all possible.

Some sample input and output:

>java -cp . Deflate -c "Pack my box with five dozen liquor jugs."
eJwLSEzOVsitVEjKr1AozyzJUEjLLEtVSMmvSs1TyMksLM0vUsgqTS/WAwAm/w6Y
>java -cp . Deflate -d eJwLSEzOVsitVEjKr1AozyzJUEjLLEtVSMmvSs1TyMksLM0vUsgqTS/WAwAm/w6Y
Pack my box with five dozen liquor jugs.
>

We're told the Java system conforms to RFC 1951, so we've looked at quite a few libraries, but none of them seems to decompress the data correctly (if at all). One example is DotNetZip:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Ionic.Zlib;

namespace Decomp
{
    class Program
    {
        static void Main(string[] args)
        {                        
            // Deflate

            String start = "Pack my box with five dozen liquor jugs.";

            var x = DeflateStream.CompressString(start);
            var res1 = Convert.ToBase64String(x, 0, x.Length);

            // Inflate

            //String source = "eJwLSEzOVsitVEjKr1AozyzJUEjLLEtVSMmvSs1TyMksLM0vUsgqTS/WAwAm/w6Y"; // *** FAILS ***
            String source = "C0hMzlbIrVRIyq9QKM8syVBIyyxLVUjJr0rNU8jJLCzNL1LIKk0v1gMA";

            var part1 = Convert.FromBase64String(source);

            var res2 = DeflateStream.UncompressString(part1);
        }
    }
}

DotNetZip implements RFC 1951 according to its documentation, but it does not decompress the Java-produced string correctly (presumably due to subtle algorithm differences between implementations).

From a development point of view we could do with understanding the exact variant we need to write. Is there any header information or online tools we could use to provide an initial steer? It feels like we're shooting in the dark a little bit here.

Robbie Dee
  • The -cp is creating a tar zip file. So you need an unzip utility – jdweng Jan 14 '19 at 14:37
  • @jdweng **-cp** just specifies where the class sits in the file system I believe... – Robbie Dee Jan 14 '19 at 14:49
  • What about implementing RFC 1951? Not an option? – Fildor Jan 14 '19 at 15:03
  • @Fildor We could but we have no guarantee of success. Many libraries implement RFC 1951 but the crux of the problem seems to be knowing the specific algorithm variant. I've edited the question to illustrate the problem. – Robbie Dee Jan 14 '19 at 15:15
  • See following : https://www.cbronline.com/what-is/what-is-java-cp-4926798/ – jdweng Jan 14 '19 at 15:16
  • @jdweng The execution of the Java piece isn't any issue - this is about porting the working code to C#. – Robbie Dee Jan 16 '19 at 13:23
  • I was just posting the method that Java was using, which was to "unzip" a file, which is different from a normal "uncompress". – jdweng Jan 16 '19 at 14:11

1 Answer


SharpZipLib can read and write this format. The NuGet package is at https://www.nuget.org/packages/ICSharpCode.SharpZipLib.dll/

using ICSharpCode.SharpZipLib.Zip.Compression.Streams;
using System;
using System.IO;
using System.Text;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            string input = "Pack my box with five dozen liquor jugs.";

            string encoded = Encode(input);
            string decoded = Decode(encoded);

            Console.WriteLine($"Input: {input}");
            Console.WriteLine($"Encoded: {encoded}");
            Console.WriteLine($"Decoded: {decoded}");

            Console.ReadKey(true);
        }

        static string Encode(string text)
        {
            byte[] bytes = Encoding.UTF8.GetBytes(text);

            using (MemoryStream inms = new MemoryStream(bytes))
            {
                using (MemoryStream outms = new MemoryStream())
                {
                    using (DeflaterOutputStream dos = new DeflaterOutputStream(outms))
                    {
                        inms.CopyTo(dos);

                        dos.Finish();

                        byte[] encoded = outms.ToArray();

                        return Convert.ToBase64String(encoded);
                    }
                }
            }
        }

        static string Decode(string base64)
        {
            byte[] bytes = Convert.FromBase64String(base64);

            using (MemoryStream ms = new MemoryStream(bytes))
            {
                using (InflaterInputStream iis = new InflaterInputStream(ms))
                {
                    using (StreamReader sr = new StreamReader(iis))
                    {
                        return sr.ReadToEnd();
                    }
                }
            }
        }
    }
}
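
On the question of header information: the Java sample starts with "eJw", which Base64-decodes to the bytes 0x78 0x9C. That is a zlib header (RFC 1950) wrapped around the RFC 1951 DEFLATE data, followed by a 4-byte Adler-32 trailer. The string that DotNetZip's DeflateStream accepts starts with 0x0B instead, so it appears to be raw DEFLATE with no wrapper. This would explain the mismatch, and it suggests that a zlib-aware class (SharpZipLib's InflaterInputStream as above, or probably DotNetZip's ZlibStream rather than DeflateStream) is what the port needs. The following is a minimal sketch in Java, using only java.util.zip, that checks for the header and inflates both samples. The class and method names (ZlibHeaderCheck, looksLikeZlib, inflate) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

public class ZlibHeaderCheck {

    // RFC 1950 header test: the low nibble of the first byte (CMF) is 8 for
    // DEFLATE, and the first two bytes read as a big-endian 16-bit number
    // must be a multiple of 31. Raw DEFLATE can pass by chance, so treat
    // this as a hint, not a proof.
    static boolean looksLikeZlib(byte[] data) {
        if (data.length < 2) {
            return false;
        }
        int cmf = data[0] & 0xFF;
        int flg = data[1] & 0xFF;
        return (cmf & 0x0F) == 8 && ((cmf << 8) | flg) % 31 == 0;
    }

    // nowrap = false expects the zlib wrapper; nowrap = true expects raw DEFLATE.
    static String inflate(byte[] data, boolean nowrap) {
        Inflater inflater = new Inflater(nowrap);
        try {
            inflater.setInput(data);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated input or preset dictionary: stop rather than spin
                }
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not valid for nowrap=" + nowrap, e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] fromJava = Base64.getDecoder().decode(
                "eJwLSEzOVsitVEjKr1AozyzJUEjLLEtVSMmvSs1TyMksLM0vUsgqTS/WAwAm/w6Y");
        byte[] fromDotNetZip = Base64.getDecoder().decode(
                "C0hMzlbIrVRIyq9QKM8syVBIyyxLVUjJr0rNU8jJLCzNL1LIKk0v1gMA");

        System.out.println("Java sample has zlib header:      " + looksLikeZlib(fromJava));
        System.out.println("DotNetZip sample has zlib header: " + looksLikeZlib(fromDotNetZip));
        System.out.println(inflate(fromJava, false));     // zlib-wrapped
        System.out.println(inflate(fromDotNetZip, true)); // raw DEFLATE
    }
}
```

The same two-byte check is easy to repeat in C# on the result of Convert.FromBase64String before choosing which stream class to hand the bytes to.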
Woldemar89