My assignment is to compress a DNA sequence. The first encoding uses A = 00, C = 01, G = 10, T = 11. I have to read the sequence in from a file and convert it to this encoding. I know I have to use the BitSet class in Java, but I'm having trouble implementing it. How do I ensure that my encoding is used, rather than the letters being converted to the binary values of their character codes?
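For illustration, here is a minimal sketch of how the mapping could be applied with `java.util.BitSet`. The class name `TwoBitEncoder`, the helper `codeOf`, and the bit layout (base number i stored in bits 2i and 2i+1, high bit first) are assumptions made for this example, not part of the assignment. The key point is that each character is translated through an explicit lookup, so the character's own binary value is never written.

```java
import java.util.BitSet;

class TwoBitEncoder {
    // Hypothetical helper: returns the 2-bit code for a base,
    // or -1 for anything that should be skipped (N, newlines, ...).
    static int codeOf(char c) {
        switch (Character.toUpperCase(c)) {
            case 'A': return 0; // 00
            case 'C': return 1; // 01
            case 'G': return 2; // 10
            case 'T': return 3; // 11
            default:  return -1;
        }
    }

    // Encodes a sequence; base number i occupies bits 2i (high) and 2i+1 (low).
    static BitSet encode(String sequence) {
        BitSet bits = new BitSet(2 * sequence.length());
        int base = 0; // number of bases written so far
        for (int i = 0; i < sequence.length(); i++) {
            int code = codeOf(sequence.charAt(i));
            if (code < 0) continue;                       // ignore N and other characters
            if ((code & 2) != 0) bits.set(2 * base);      // high bit
            if ((code & 1) != 0) bits.set(2 * base + 1);  // low bit
            base++;
        }
        return bits;
    }

    public static void main(String[] args) {
        // a=00, c=01, N skipped, g=10, t=11  ->  bits 3, 4, 6, 7 are set
        System.out.println(encode("acNgt")); // prints {3, 4, 6, 7}
    }
}
```

Only the 1 bits need to be set explicitly, because a new `BitSet` starts with every bit cleared.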
This is the prompt: Develop space-efficient Java code for two kinds of compressed encodings of this file of data. (N's are to be ignored.) Convert lowercase to uppercase chars. Do the following and answer the questions: Credit will be awarded to both time- and space-efficient mechanisms. If your code takes too long to run, you need to rethink the design.
Encoding 1. Using two bits A:00, C:01, G:10, T:11.
(a) How many total bits are needed to represent the genome sequence? (b) How many of the total bits are 1's in the encoded sequence?
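Questions (a) and (b) could be answered along the lines of the sketch below. It assumes the same 2-bit layout as the encoding above; the class name `GenomeBitStats` and its methods are hypothetical, and a `StringReader` stands in for the real input file so the example runs on its own. Note that `BitSet.length()` only reports the position of the highest set bit plus one, so a sequence ending in A (00) would be undercounted; the sketch therefore keeps its own count of bases. `BitSet.cardinality()` returns the number of bits set to 1.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.BitSet;

class GenomeBitStats {
    private final BitSet bits = new BitSet();
    private int baseCount = 0; // bases encoded so far (N's are not counted)

    // Hypothetical helper: 2-bit code for a base, or -1 to skip the character.
    private static int codeOf(int ch) {
        switch (Character.toUpperCase((char) ch)) {
            case 'A': return 0; // 00
            case 'C': return 1; // 01
            case 'G': return 2; // 10
            case 'T': return 3; // 11
            default:  return -1;
        }
    }

    // Streams the input one character at a time, so the whole file
    // is never held in memory as a String.
    void read(Reader in) {
        try (BufferedReader reader = new BufferedReader(in)) {
            int ch;
            while ((ch = reader.read()) != -1) {
                int code = codeOf(ch);
                if (code < 0) continue;                            // N, newline, etc.
                if ((code & 2) != 0) bits.set(2 * baseCount);      // high bit
                if ((code & 1) != 0) bits.set(2 * baseCount + 1);  // low bit
                baseCount++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // (a) Every encoded base takes exactly two bits.
    long totalBits() { return 2L * baseCount; }

    // (b) Number of bits that are 1 in the encoded sequence.
    int oneBits() { return bits.cardinality(); }

    public static void main(String[] args) {
        GenomeBitStats stats = new GenomeBitStats();
        // For a real file, a FileReader on the file's path could be passed instead.
        stats.read(new StringReader("ACGTNacgt\n"));
        System.out.println("total bits = " + stats.totalBits()
                + ", one bits = " + stats.oneBits()); // total bits = 16, one bits = 8
    }
}
```

A single `BitSet` is indexed by `int`, so it holds at most about 2^31 bits (roughly one billion bases). Whether that is enough depends on the size of the genome file, which the prompt does not state.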
I know the logic I have to use, but the actual implementation of the BitSet class and the encoding is where I'm having issues.