Why Read Garbage Value From File

Question

I just want to read a file and then update some of its values, but when I read it using CFile, I get garbage in sFileContent.

Here is my code:

CString sWebAppsFile= _T("C:\\newFile.txt");
CString sFileContent;

CFile file;
int len;

if(file.Open(sWebAppsFile, CFile::modeRead))
{
    len = (int) file.GetLength();
    file.Read(sFileContent.GetBuffer(len), len);
    sFileContent.ReleaseBuffer();
    file.Close();
}

Please suggest a solution.

What does the file content actually look like? Is it using the same encoding that `TCHAR` is expecting (8bit `char` for ANSI or 16bit `wchar_t` for Unicode)? `GetLength()` returns the file size in bytes, and `Read()` reads raw bytes, but `CString` is expecting `TCHAR`-encoded characters instead. If you try to read 8bit data into a 16bit `CString`, or 16bit data into an 8bit `CString`, you are going to see "garbage". If you really are trying to read raw bytes, why are you using `CString` at all? Consider using `CStringA` instead, or even a more suitable container, like `std::vector`. — Remy Lebeau, Jan 08 '16 at 06:46
While debugging, it looks like 效汬⁯潗汲쵤췍췍췍췍췍췍﷽﷽ꮫꮫꮫꮫﻮﻮ — Keen2Learn, Jan 08 '16 at 06:48
That is what happens when you misinterpret 8bit data as if it were 16bit Unicode characters. That data is actually 11 8bit characters `Hello World` followed by 29 bytes of binary data. You should NOT be reading that into a `CString` when `TCHAR` maps to 16bit `wchar_t`. — Remy Lebeau, Jan 08 '16 at 06:50
There's `CStringA`, use that. Also, don't use C-style casts! Prefer not using casts at all but correct types. If you must convert, use C++ casts and consider checking the conversion. — Ulrich Eckhardt, Jan 08 '16 at 07:19
What is the character encoding of your source file? If you don't know the answer, please read [The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)](http://www.joelonsoftware.com/articles/Unicode.html). Key phrase: *"There Ain't No Such Thing As Plain Text."* — IInspectable, Jan 08 '16 at 14:49
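The effect described in the comments above can be reproduced without MFC. The following is only a minimal sketch in standard C++; `bytesAsUtf16LE` is a hypothetical helper name, and it assumes a little-endian reader such as Windows. It pairs up 8-bit bytes the way a 16-bit `CString` would see them.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: view a run of 8-bit bytes the way a little-endian
// UTF-16 reader would, two bytes per code unit.
// A trailing odd byte has no partner and is dropped here.
std::vector<std::uint16_t> bytesAsUtf16LE(const std::string& bytes)
{
    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        const auto lo = static_cast<unsigned char>(bytes[i]);
        const auto hi = static_cast<unsigned char>(bytes[i + 1]);
        // First byte becomes the low half, second byte the high half.
        units.push_back(static_cast<std::uint16_t>(lo | (hi << 8)));
    }
    return units;
}
```

Calling `bytesAsUtf16LE("Hello World")` yields the code units 0x6548, 0x6C6C, 0x206F, 0x6F57, 0x6C72, which are exactly the first five characters (效, 汬, U+206F, 潗, 汲) of the debugger output quoted above. In the original code the final `d` is paired with whatever byte follows it in the uninitialized buffer.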

Himanshu · Answer 1 · 2016-01-09T07:21:54.990

0

Use this code

CFile file;
CString sWebAppsFile= _T("C:\\newFile.txt");
CString sFileContent;

if(file.Open(sWebAppsFile, CFile::modeRead))
{
    ULONGLONG dwLength = file.GetLength();
    BYTE *buffer = (BYTE *) malloc(static_cast<size_t>(dwLength) + 1); // one extra byte for the terminating NUL
    UINT nRead = file.Read(buffer, static_cast<UINT>(dwLength));       // read up to dwLength bytes; returns the count actually read
    buffer[nRead] = '\0';         // terminate right after the bytes that were read, so no garbage follows
    sFileContent = (CString)buffer;        // transfer data to CString (easy to use)
    //AfxMessageBox(sFileContent); 
    free(buffer);                 // free memory
    file.Close();                 // close File
}

Or you can use CStdioFile

CString sWebAppsFile= _T("C:\\newFile.txt");
CStdioFile file (sWebAppsFile, CStdioFile::modeRead); // Open file in read mode
CString buffer, sFileContent(_T(""));

while (file.ReadString(buffer))         //Read File line by line
    sFileContent += buffer +_T("\n");    //Add line to sFileContent with new line character
//AfxMessageBox(sFileContent );
file.Close();                            // close File
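If the goal is the raw file content rather than text, a standard-library variant may be simpler. This is only a sketch that swaps `std::ifstream` and `std::vector<char>` in for `CFile` and `malloc`; it is not part of the code above, and `readAllBytes` is a hypothetical helper name.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: read an entire file as raw bytes, with no newline
// translation and no assumption about the character encoding.
// Returns an empty vector if the file cannot be opened.
std::vector<char> readAllBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return {};
    }
    // Copy every byte from the stream buffer, embedded NUL bytes included.
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```

The vector carries its own length, so no terminating NUL is needed and embedded NUL bytes survive. Decoding those bytes into text is a separate step that depends on the file's encoding.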

Convert BYTE* to CString

BYTE *buffer;
CString sStr((char*)buffer);
// or for unicode:
CString str((const wchar_t*)buffer);

edited Jan 09 '16 at 07:21

answered Jan 08 '16 at 09:53


`sFileContent = buffer; // transfer data to CString (easy to use)` - This is a bug. If `CString` is a `CStringW` (the default for any recent version of Visual Studio), it doesn't just *transfer* data. It *converts* character data using the calling thread's locale as the source character encoding. The calling thread's locale is unrelated to the character encoding of the file contents. And it completely misses embedded `NUL` characters, truncating the string. – IInspectable Jan 08 '16 at 14:47
@IInspectable, I have included the code to convert BYTE* to CString. – Himanshu Jan 09 '16 at 07:22
The conversion code is pretty broken. It may or may not convert to UTF-16, depending on compiler settings. It may or may not use the correct locale for the conversion, depending on the thread's current locale. And it completely ignores my previous comment: There is absolutely no relationship between the file content's encoding and the executing thread's current locale. – IInspectable Jan 10 '16 at 04:12
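To illustrate the point made in these comments, here is a minimal sketch of a conversion where the caller names the source encoding and passes an explicit length, instead of relying on the thread locale and a terminating NUL. It assumes, purely for illustration, that the bytes are ISO-8859-1 (Latin-1), where every byte maps to the code point of the same value; `latin1ToUtf16` is a hypothetical helper name. A file in UTF-8 or another code page would need a real decoder, for example `MultiByteToWideChar` with an explicit code page on Windows.

```cpp
#include <string>

// Hypothetical helper: decode bytes that are KNOWN to be ISO-8859-1 into
// UTF-16. The encoding is fixed by the caller, not by the thread locale,
// and the length comes from the string, so embedded NUL bytes are kept.
std::u16string latin1ToUtf16(const std::string& bytes)
{
    std::u16string out;
    out.reserve(bytes.size());
    for (char c : bytes) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        out.push_back(static_cast<char16_t>(static_cast<unsigned char>(c)));
    }
    return out;
}
```

With an input of `std::string("A\0B", 3)` the result has three code units, whereas a conversion that stops at the first NUL would produce only one.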
