I'm implementing the LZ77 compression algorithm, and I'm having trouble storing unsigned chars in a std::string. To compress any file, I take its binary representation and read it as chars (because 1 char is equal to 1 byte, AFAIK) into a std::string. Everything works perfectly fine with char. But after some googling I learned that char is not always 1 byte, so I decided to swap it for unsigned char. And here things start to get tricky:
- When compressing a plain .txt file, everything works as expected: the files before compression and after decompression are identical (I assume they should be, since we are essentially working with text before and after the byte conversion)
- However, when compressing a .bmp, the decompressed file is 3 bytes shorter than the input file (I lose these 3 bytes when trying to save unsigned chars to a std::string)
So, my question is: is there a way to properly store unsigned chars in a std::string?
I tried typedef std::basic_string<unsigned char> ustring and swapped all related functions for their basic_ alternatives to work with unsigned char, but I still lose 3 bytes.
UPDATE: I found out that the 3 bytes (symbols) are lost not because of std::string, but because of std::istream_iterator, which I use instead of std::istreambuf_iterator to create the string of unsigned chars (since std::istreambuf_iterator's template argument is char, not unsigned char).
So, are there any solutions to this particular problem?
Example:
std::vector<char> tempbuf(std::istreambuf_iterator<char>(file), {}); // reads 112782 symbols
std::vector<char> tempbuf(std::istream_iterator<char>(file), {}); // reads 112779 symbols
Sample code:
void LZ77::readFileUnpacked(std::string& path)
{
    std::ifstream file(path, std::ios::in | std::ios::binary);
    if (file.is_open())
    {
        // Works just fine with char, but loses 3 bytes with unsigned
        std::string tempstring = std::string(std::istreambuf_iterator<char>(file), {});
        file.close();
    }
    else
        throw std::ios_base::failure("Failed to open the file");
}