In the question "How to replace/ignore invalid Unicode/UTF8 characters � from C stdio.h getline()?" I presented a possible solution to the problem, but I did not manage to get it working correctly.

This is the full example:

FILE* cfilestream = fopen( "/filepath.txt", "r" );
int linebuffersize = 131072;
char* readline = (char*) malloc( linebuffersize );
char* fixedreadline = (char*) malloc( linebuffersize );

int index;
int charsread;
int invalidcharsoffset;

while( true )
{
    if( ( charsread = getline( &readline, &linebuffersize, cfilestream ) ) != -1 )
    {
        invalidcharsoffset = 0;
        for( index = 0; index < charsread; ++index )
        {
            if( readline[index] != '�' ) {
                fixedreadline[index-invalidcharsoffset] = readline[index];
            } 
            else {
                ++invalidcharsoffset;
            }
        }
        std::cerr << "fixedreadline=" << fixedreadline << std::endl;
    }
    else {
        break;
    }
}

When I compile it, I get the following warning:

  $ x86_64-linux-gnu-gcc -g -O0 -Wall -ggdb -std=c++11 
  source/fastfile.cpp:512:44: warning: multi-character character constant [-Wmultichar]
                       if( readline[index] != '�' ) {
                                              ^~~~~

When I run the program, it does not remove the � character from the input string Føö�Bår.
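For context on the warning: in a UTF-8 encoded source file, '�' (U+FFFD) is stored as the three bytes EF BF BD, so the literal is a multi-character constant of type int and a single char from readline never compares equal to it. Below is a minimal sketch of a different approach, under the assumption that the � shown on screen comes from bytes in the file that are not well-formed UTF-8 (the terminal substitutes � when displaying them). The names utf8_sequence_length and strip_invalid_utf8 are hypothetical helpers introduced only for this illustration; they are not part of the original program.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the length (1 to 4) of the well-formed UTF-8
// sequence starting at s[i], or 0 if the bytes at that position are invalid.
static std::size_t utf8_sequence_length(const std::string& s, std::size_t i)
{
    const unsigned char b0 = static_cast<unsigned char>(s[i]);
    std::size_t len = 0;
    if (b0 < 0x80) return 1;                      // plain ASCII
    else if (b0 >= 0xC2 && b0 <= 0xDF) len = 2;   // two-byte lead
    else if (b0 >= 0xE0 && b0 <= 0xEF) len = 3;   // three-byte lead
    else if (b0 >= 0xF0 && b0 <= 0xF4) len = 4;   // four-byte lead
    else return 0;                                // stray continuation or invalid lead

    if (i + len > s.size()) return 0;             // sequence is truncated

    // Every following byte must have the form 10xxxxxx.
    for (std::size_t k = 1; k < len; ++k) {
        const unsigned char b = static_cast<unsigned char>(s[i + k]);
        if ((b & 0xC0) != 0x80) return 0;
    }

    // Reject overlong forms, UTF-16 surrogates and values above U+10FFFF.
    const unsigned char b1 = static_cast<unsigned char>(s[i + 1]);
    if (b0 == 0xE0 && b1 < 0xA0) return 0;
    if (b0 == 0xED && b1 > 0x9F) return 0;
    if (b0 == 0xF0 && b1 < 0x90) return 0;
    if (b0 == 0xF4 && b1 > 0x8F) return 0;
    return len;
}

// Hypothetical helper: copies only well-formed UTF-8 sequences to the result,
// dropping invalid bytes one at a time.
std::string strip_invalid_utf8(const std::string& input)
{
    std::string output;
    output.reserve(input.size());
    std::size_t i = 0;
    while (i < input.size()) {
        const std::size_t len = utf8_sequence_length(input, i);
        if (len == 0) {
            ++i;                                  // skip one invalid byte
            continue;
        }
        output.append(input, i, len);
        i += len;
    }
    return output;
}
```

With this sketch, the loop body could call something like strip_invalid_utf8( std::string( readline, charsread ) ) instead of comparing against '�'. Note that a U+FFFD already stored in the file as EF BF BD is valid UTF-8 and is kept by this sketch; removing it as well would need an extra check for that three-byte sequence.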

Nicol Bolas
Evandro Coan
  • Do you know what character it is that you're trying to remove? That is, its byte value? – Nicol Bolas Jun 14 '19 at 20:48
  • I did not understand the question. The character I would like to remove is `�`. If this is not a character, I would like to remove whatever is causing this `�` character to show up in my `char*` string. – Evandro Coan Jun 14 '19 at 20:55
  • 2
    Question answered here: https://stackoverflow.com/questions/56604724/how-to-replace-ignore-invalid-unicode-utf8-characters-from-c-stdio-h-getline, where it was originally asked. – rici Jun 14 '19 at 21:11
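
The byte value asked about in the first comment can be found by dumping a line read from the file in hexadecimal. This is only a sketch; to_hex_bytes is a hypothetical helper name, not something from the original code.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats every byte of `line` as two uppercase hex
// digits, separated by single spaces.
std::string to_hex_bytes(const std::string& line)
{
    std::string out;
    char buf[4];  // two hex digits, one space, terminating null
    for (unsigned char byte : line) {
        std::snprintf(buf, sizeof buf, "%02X ", static_cast<unsigned>(byte));
        out += buf;
    }
    if (!out.empty()) {
        out.pop_back();  // remove the trailing space
    }
    return out;
}
```

Printing to_hex_bytes( std::string( readline, charsread ) ) to std::cerr would show which byte values appear where the � is displayed. Any byte of 80 or above that does not form part of a well-formed UTF-8 sequence is a candidate for the character being asked about.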

0 Answers