In the question "How to replace/ignore invalid Unicode/UTF8 characters � from C stdio.h getline()?" I presented a possible solution to the problem, but I did not manage to get it working correctly.
This is the full example:
    FILE* cfilestream = fopen( "/filepath.txt", "r" );
    size_t linebuffersize = 131072;   // getline() takes a size_t* for the buffer size
    char* readline = (char*) malloc( linebuffersize );
    char* fixedreadline = (char*) malloc( linebuffersize );

    int index;
    ssize_t charsread;                // getline() returns ssize_t, -1 on EOF or error
    int invalidcharsoffset;

    while( true )
    {
        if( ( charsread = getline( &readline, &linebuffersize, cfilestream ) ) != -1 )
        {
            invalidcharsoffset = 0;

            // Copy every char except the invalid one into fixedreadline
            for( index = 0; index < charsread; ++index )
            {
                if( readline[index] != '�' ) {
                    fixedreadline[index-invalidcharsoffset] = readline[index];
                }
                else {
                    ++invalidcharsoffset;
                }
            }

            // Terminate the copy so that it can be printed as a C string
            fixedreadline[charsread-invalidcharsoffset] = '\0';
            std::cerr << "fixedreadline=" << fixedreadline << std::endl;
        }
        else {
            break;
        }
    }
When I compile it, I get the following warning:
$ x86_64-linux-gnu-gcc -g -O0 -Wall -ggdb -std=c++11
source/fastfile.cpp:512:44: warning: multi-character character constant [-Wmultichar]
if( readline[index] != '�' ) {
^~~~~
And when I run the program, it does not remove the � character from the input string Føö�Bår.