-2

I have this giant string. I was wondering whether it can be compressed and, if so, what some good ways to do that are.

"01011311100111111112110131131011111110111011113111101101001110110110100110001001111003011011101111311102110011030111001311110113110111110111111111111111311103010001113110013100100101110000010111111111001000111111100001100030111111131113113101101001100111111100110100131001102101101110030300300011011111001111100010110011201111111011110011101011000011100013110101111003000131111012011131000000113111111311111001100111011111000101111101313111010000001131103011210111101001110010100113111311000111001100011110001000001111110001111111001010001011111100111000131000"

This is a sample and there are thousands more lines. Any suggestions?

  • 1
    Not entirely sure if it's something you would want, but 0000011111110000111 could become [0:5][1:7][0:4][1:3] (or similar). It takes some additional functionality, but especially for large strings like that, this could make it a lot shorter. – Stultuske Sep 01 '17 at 11:41
  • 1
    Compression is a well-studied subject and there are plenty of libraries and tools available which do this. Asking on Stack Overflow is not a good substitute for doing research yourself. – Bernhard Barker Sep 01 '17 at 11:43
  • Based on that data, I would say it's about the encoding. At 8 bits per char you have 256 possible values when you only need 0-9. Maybe group something like 10 chars, convert them to a number, and store it in a long or similar. Then make a list/array of your longs. – Tom Stein Sep 01 '17 at 11:45
  • Possibly related to https://stackoverflow.com/questions/40417632/dna-compression-using-bitset-java. – yegodm Sep 01 '17 at 13:01
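
The two ideas from the comments above (run-length encoding and a denser encoding per symbol) can be sketched roughly as follows. This is only an illustration, not anything the commenters posted: the class name `DigitStringSketch` and its methods are hypothetical, and the packing part assumes every character is one of 0, 1, 2 or 3, which appears to hold for the sample shown but is not confirmed for the remaining lines. Note that the bracketed run-length format spends five characters on a run of length one, so for data with mostly short runs it may come out longer than the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names are for illustration only.
final class DigitStringSketch {
    // Matches one run in the "[symbol:count]" format from the comment.
    private static final Pattern RUN = Pattern.compile("\\[(.):(\\d+)\\]");

    // Run-length encoding: "00000111" becomes "[0:5][1:3]".
    static String rleEncode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char symbol = input.charAt(i);
            int runEnd = i;
            while (runEnd < input.length() && input.charAt(runEnd) == symbol) {
                runEnd++;
            }
            out.append('[').append(symbol).append(':').append(runEnd - i).append(']');
            i = runEnd;
        }
        return out.toString();
    }

    // Reverses rleEncode by expanding each "[symbol:count]" group.
    static String rleDecode(String encoded) {
        StringBuilder out = new StringBuilder();
        Matcher m = RUN.matcher(encoded);
        while (m.find()) {
            char symbol = m.group(1).charAt(0);
            int count = Integer.parseInt(m.group(2));
            for (int k = 0; k < count; k++) {
                out.append(symbol);
            }
        }
        return out.toString();
    }

    // Packs four symbols into each byte, 2 bits per symbol.
    // ASSUMPTION: the input contains only the characters '0'..'3'.
    static byte[] pack(String digits) {
        byte[] out = new byte[(digits.length() + 3) / 4];
        for (int i = 0; i < digits.length(); i++) {
            int value = digits.charAt(i) - '0';
            if (value < 0 || value > 3) {
                throw new IllegalArgumentException("symbol outside 0-3: " + digits.charAt(i));
            }
            out[i / 4] |= value << ((i % 4) * 2);
        }
        return out;
    }

    // Reverses pack; the original length must be stored separately,
    // because the last byte may be only partly filled.
    static String unpack(byte[] packed, int length) {
        StringBuilder out = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int value = (packed[i / 4] >> ((i % 4) * 2)) & 0b11;
            out.append((char) ('0' + value));
        }
        return out.toString();
    }
}
```

Under that assumption, `DigitStringSketch.pack(line)` stores each line in about a quarter of the bytes of its one-byte-per-character form, and `DigitStringSketch.unpack(bytes, line.length())` restores it.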

1 Answer

0

I suggest trying out-of-the-box solutions before thinking about implementing your own compression algorithm. Here you might try java.util.zip.GZIPInputStream and java.util.zip.GZIPOutputStream and see whether they compress the data sufficiently. Only if you are unhappy with the result should you think about schemes of your own.
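
A minimal sketch of that approach, compressing a string in memory and reading it back, might look like this. The class name `GzipSketch` and its method names are hypothetical, and UTF-8 is assumed as the character encoding; only the `java.util.zip` and `java.io` classes are standard library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; names are for illustration only.
final class GzipSketch {
    // Compresses the text into a GZIP byte array (UTF-8 is an assumption).
    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so it must happen
        // before the buffer is read.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decompresses a GZIP byte array back into the original text.
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Comparing `GzipSketch.compress(line).length` with `line.length()` on real data shows whether the result is small enough. GZIP adds a fixed header and trailer to every stream, so compressing many lines together is likely to do better than compressing each line separately.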

Lothar