
I am creating a program that takes a text file and a word. It reads the text file and compares the word to the current word being read. I built it to ignore a trailing "'s": for example, if I am searching for "NASA" and come across the word "NASA's", it should just read it as "NASA". Unfortunately, I can't catch the apostrophe by using "if (ch == '\'')", "ch" being the current char. The reason may be that the apostrophes in the text file look like this: ’. I don't know how to account for that.

Here's what I have to catch an apostrophe:

else if (ch == '\'')
{
#if 1
    printf("Comparing Apostrophe\n");
#endif
    continue;
}

Please note that it is part of a bigger program, which I am not allowed to show you. This explains the "else if" at the beginning.

Thanks.

IC2D
  • To check what kind of apostrophes you have in the text, you can copy the "NASA's" part to a new text file and write a small program that shows you ASCII codes of every letter. – PiotrK Oct 07 '13 at 17:08
  • @PiotrK Why *write* a program to do that? Just use a regular hex editor. I'm a fan of HxD, but any hex editor would work. – neminem Oct 07 '13 at 17:13
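
If a hex editor is not at hand, the small program suggested in the first comment could look roughly like the sketch below. The function name print_codes is hypothetical, and the sketch assumes the bytes of the copied "NASA's" sample have already been read into a buffer (for example with fread on a file opened in "rb" mode).

```c
#include <stdio.h>
#include <stddef.h>

/* Print the decimal and hex code of every byte in buf.
   Returns how many bytes fall outside the 7-bit ASCII range.
   print_codes is a hypothetical helper name, not part of the question's program. */
size_t print_codes(const unsigned char *buf, size_t len)
{
    size_t non_ascii = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned int c = buf[i];

        if (c >= 32 && c < 127)
            printf("%3u 0x%02X '%c'\n", c, c, (char)c);  /* printable ASCII */
        else
            printf("%3u 0x%02X\n", c, c);                /* control or non-ASCII byte */

        if (c > 127)
            non_ascii++;
    }
    return non_ascii;
}
```

A plain apostrophe shows up as 39. A Windows-1252 right single quotation mark shows up as the single byte 146, while the same character stored as UTF-8 shows up as the three bytes 226, 128, 153.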

3 Answers


If your apostrophes look like ’, then you need to check for the ’ character explicitly; comparing against '\'' won't match it. It is a different character: the "right single quotation mark", which is byte 146 (decimal) in the Windows-1252 code page and U+2019 in Unicode. It is not part of 7-bit ASCII. The text probably came from MS Office, which loves to convert single and double quotes into so-called "smart" quotes (which, if you're doing anything with the text outside of Office, are anything but). And as has been pointed out, the "left single quotation mark" (byte 145 in Windows-1252, U+2018 in Unicode) is the other half of that same set, so there might be some of those in your text as well. I.e.:

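/* Matches only if ch is a char and both this source file and the input
   text use a single-byte encoding such as Windows-1252. */
else if (ch == '\'' || ch == '’' || ch == '‘')
{
    /* do stuff */
}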
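
If the encoding of the source file or of the input is uncertain, comparing against numeric byte values avoids putting the quote characters into the source at all. The following is only a sketch under stated assumptions: apostrophe_len and its utf8 flag are hypothetical names, the caller is assumed to know whether the input is UTF-8 or Windows-1252, and the input bytes are assumed to be available in a buffer.

```c
#include <stddef.h>

/* Returns the number of bytes at the start of s (length n) that encode an
   apostrophe-like character, or 0 if s does not start with one.
   utf8 != 0: treat the input as UTF-8; utf8 == 0: treat it as Windows-1252.
   apostrophe_len is a hypothetical helper, not code from the question. */
size_t apostrophe_len(const unsigned char *s, size_t n, int utf8)
{
    if (n == 0)
        return 0;

    if (s[0] == '\'')                 /* plain ASCII apostrophe, code 39 */
        return 1;

    if (utf8) {
        /* U+2018 and U+2019 are encoded as E2 80 98 and E2 80 99 */
        if (n >= 3 && s[0] == 0xE2 && s[1] == 0x80 &&
            (s[2] == 0x98 || s[2] == 0x99))
            return 3;
        return 0;
    }

    /* Windows-1252: 145 (0x91) is the left quote, 146 (0x92) the right quote */
    return (s[0] == 0x91 || s[0] == 0x92) ? 1 : 0;
}
```

The two encodings are kept apart on purpose: in UTF-8 text the bytes 0x91 and 0x92 can occur as part of other multi-byte characters, so testing for them there would give false matches.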
neminem
  • +1. I would also check for a left single quote (decimal 145) just because sometimes they're used (inappropriately). – lurker Oct 07 '13 at 17:17

For wide characters (wchar_t), add the L prefix to each character literal:

else if (ch == L'\'' || ch == L'’' || ch == L'‘')
{
    [do stuff]
}
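
A variant of the wide-character check that does not depend on how the compiler decodes the source file is sketched below. It compares against the Unicode code points directly, which assumes that wchar_t values are Unicode code points on the platform (true for the common cases, but an assumption nonetheless). The name is_wide_apostrophe is hypothetical.

```c
#include <wchar.h>

/* Returns nonzero if ch is an ASCII apostrophe or a left/right single
   quotation mark. Assumes wide characters hold Unicode code points.
   is_wide_apostrophe is a hypothetical helper name. */
int is_wide_apostrophe(wint_t ch)
{
    return ch == L'\''      /* U+0027 apostrophe */
        || ch == 0x2018     /* left single quotation mark */
        || ch == 0x2019;    /* right single quotation mark */
}
```

For this to see whole characters, the file would have to be read with the wide-character functions (for example fgetwc) after calling setlocale(LC_ALL, "") in a locale whose encoding matches the file. Whether that holds depends on the environment.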

Instead of using the escape sequence, you can use the ASCII code for the apostrophe, which is 39: else if (ch == 39)

Igor Popov