
In Caesar (CS50), one of the steps says that I need to convert an ASCII character to its alphabetical index. What does that mean? I saw a video that said I "need to find the relationship between a number's ASCII value and its actual index in the alphabet", but I haven't really understood what exactly the relationship is or how I might implement it.

  • Please elaborate in your answer because I'm new to this.
string plaintext = get_string("plaintext;");
Tom Carrick

3 Answers


As you may or may not know, ASCII characters are 7-bit codes, usually stored in 8-bit chars, and character constants such as 'a' actually have type int in C.

Using this knowledge, you can perform arithmetic on characters as if they were regular numbers. Take the following example:

printf("%d\n", 'a');

This prints the int value of 'a', which is 97.

Now this:

printf("%d\n", 'g' - 'a');

This will print 6 which is the result of 103 - 97.

Now your string:

const char* plaintext  = "plaintext";   

for(size_t i = 0; i < strlen(plaintext); i++){
    printf("%c - %d\n",plaintext[i], plaintext[i] - 'a' + 1);
}

The result:

p - 16
l - 12
a - 1
i - 9
n - 14
t - 20
e - 5
x - 24
t - 20

As you can see, the printed results are the positions of the letters in the alphabet, 1...26. I added 1 to the result because plain subtraction gives a zero-based index, 0...25, which matches how C indexes arrays.

So the bottom line is that you can use this character arithmetic to find the indexes of characters. This also applies to capital letters (subtract 'A' instead of 'a'), but you can't mix the two cases in one subtraction.

Note that there are other character encodings that do not allow this kind of arithmetic because the alphabetic characters are not in sequential order, for example EBCDIC.
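
Since the two cases can't be mixed in one subtraction, one way to handle both is to check the case first. The following is only a minimal sketch, assuming an ASCII-compatible character set; `alphabet_index` is a hypothetical helper name, not something from CS50 or the standard library.

```c
#include <ctype.h>

/* Hypothetical helper: returns the zero-based alphabetical index (0..25)
   of a letter of either case, or -1 if c is not a letter.
   Assumes an ASCII-compatible character set, where 'A'..'Z' and 'a'..'z'
   are each contiguous. */
int alphabet_index(char c)
{
    /* The <ctype.h> functions expect a value representable as unsigned char. */
    unsigned char uc = (unsigned char)c;

    if (isupper(uc))
    {
        return uc - 'A';   /* 'A' -> 0, 'Z' -> 25 */
    }
    if (islower(uc))
    {
        return uc - 'a';   /* 'a' -> 0, 'z' -> 25 */
    }
    return -1;             /* digits, spaces, punctuation, ... */
}
```

With this, `alphabet_index('g')` and `alphabet_index('G')` both give 6, and a non-letter such as '!' gives -1 so the caller can leave it unchanged.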

anastaciu

It means that a single char variable is nothing but an integer containing an ASCII code, such as 65 for 'A'. It might be more convenient for an algorithm to work with the interval 0 to 25 than 65 to 90.

Generally, if you know that a char is an upper-case letter, you can do a naive conversion to alphabetical index by subtracting the letter 'A' from it. Naive, because strictly speaking the letters in the symbol (ASCII) table need not be located adjacently. For a beginner-level program, it should be ok though:

char str[] = "ABC";
for(int i=0; i<3; i++)
  printf("%d ", str[i] - 'A');  // prints 0 1 2
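
To illustrate why the interval 0 to 25 is convenient, here is a rough sketch of a Caesar-style shift built on that index. It assumes an ASCII-compatible character set, an upper-case input letter, and a non-negative key; `shift_upper` is a hypothetical name used only for this example.

```c
/* Hypothetical helper: shifts an upper-case letter forward by `key`
   positions, wrapping around from 'Z' to 'A'.
   Assumptions: c is in 'A'..'Z', key >= 0, and the upper-case letters
   are contiguous in the character set (true for ASCII). */
char shift_upper(char c, int key)
{
    int index = c - 'A';                    /* letter -> 0..25 */
    int shifted = (index + key % 26) % 26;  /* rotate within 0..25 */
    return (char)('A' + shifted);           /* 0..25 -> letter */
}
```

Because the index starts at 0, the modulo operator does the wrap-around directly: `shift_upper('Z', 1)` gives 'A'. With the raw codes 65 to 90 you would have to handle the wrap separately.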

Whereas a 100% portable converter function might look something like this:

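int ascii_to_int (char ch)
{
  // static: the table is built once, not on every call
  static const char LOOKUP_TABLE [128] = 
  {
    ['A'] = 0,
    ['B'] = 1,
    ...
  };
  
  // the cast avoids a negative index where plain char is signed;
  // callers must still pass a character whose code is below 128
  return LOOKUP_TABLE[(unsigned char)ch];
}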
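
If typing out every table entry is unappealing, a variation is to fill the table at run time from an alphabet string. This is only a sketch of that idea, not a completion of the table above: `build_index_table` is a hypothetical name, the table is sized `UCHAR_MAX + 1` so any char value is a valid index, and the alphabet is assumed to be shorter than 128 characters so each position fits in a signed char.

```c
#include <limits.h>

/* Hypothetical helper: fills `table` so that table[(unsigned char)ch] is the
   zero-based position of ch in `alphabet`, or -1 if ch does not occur in it.
   Assumption: `alphabet` has fewer than 128 characters, so every position
   fits in a signed char. */
void build_index_table(signed char table[UCHAR_MAX + 1], const char *alphabet)
{
    /* Mark every possible character as "not in the alphabet" first. */
    for (int i = 0; i <= UCHAR_MAX; i++)
    {
        table[i] = -1;
    }

    /* Record the position of each alphabet character. No arithmetic on
       character codes is involved, so the encoding does not matter. */
    for (int i = 0; alphabet[i] != '\0'; i++)
    {
        table[(unsigned char)alphabet[i]] = (signed char)i;
    }
}
```

After one call such as `build_index_table(table, "ABCDEFGHIJKLMNOPQRSTUVWXYZ")`, each lookup is a single array access, and non-letters are distinguishable from 'A' because they map to -1 rather than 0.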
Lundin
  • Re “strictly speaking the letters in the symbol (ASCII) table need not be located adjacently”: The letters in the ASCII table must be located adjacently. The letters in a character set used by a C implementation need not be located adjacently. – Eric Postpischil Jun 27 '20 at 14:23

Here is an example. It is portable because it does not depend on the char encoding.

#include <stdio.h>
#include <string.h>

const char *alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ ";

int getIndex(const char *alphabet, int c)
{
    int result = -1;
    const char *res;

    res = strchr(alphabet, c);

    if(res)
    {
        result = res - alphabet;
    }
    return result;
}

int main(void)
{
    char *str = "Hello World!!!";

    while(*str)
    {
        printf("Index of %c is %d\n", *str, getIndex(alphabet, *str));
        str++;
    }
}

https://godbolt.org/z/rw2PK9

0___________
  • You might want to drop that `strchr` method in favour of a random-access look-up table. That means a 128-byte table instead of a 52-byte table, but much faster execution. – Lundin Jun 26 '20 at 10:42
  • It is for illustration purposes. It is much easier to create and change the alphabet to play with. I don't think that OP's main problem is the performance :) – 0___________ Jun 26 '20 at 10:44
  • Indeed, but neither is more advanced matters like (EBCDIC) portability, so they most likely get away with `toupper(ch) - 'A'`. – Lundin Jun 26 '20 at 10:47