I'm writing a function in C (not C++, this is going to run on an older computer), which should take an input char* and add spaces to it, based on letter capitalization and numbers, then return the result. I cannot use strings and their functions I'm afraid, due to platform limitations.

For example, the input "TestingThisPieceOfText" should be returned as "Testing This Piece Of Text".

I have some (rather crude for now) code that works for simple cases like this, but I would like to add some exceptions to the rule, and that's where I need help:

  • If multiple capital letters are in a sequence, they should not be separated by spaces. For example, "APB" should stay as-is.
  • If there are numbers in the input string, there should be a space before (and after), but not between them. For example, "A10TankKiller2Disk" should be returned as "A10 Tank Killer 2 Disk".
  • Special case for "Mc", to cover cases where such names might be sent in. For example, "ScroogeMcDuckIsFilthyRich" should be returned as "Scrooge McDuck Is Filthy Rich".

Here's the function as it currently stands (like I said, a little crude for now):

char* add_spaces_to_string(const char* input)
{
    char input_string[100];
    strcpy(input_string, input);

    char* output = (char*)malloc(sizeof input_string * 2);

    const char capitals[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char numbers[] = "1234567890";
    const char mc[] = "Mc";

    // Special case for the first character, we don't touch it
    output[0] = input_string[0];

    unsigned int output_index = 1;
    unsigned int capital_found = 0;
    unsigned int number_found = 0;
    for (unsigned int input_index = 1; input_string[input_index] != '\0'; input_index++)
    {
        for (int capitals_index = 0; capitals[capitals_index] != '\0'; capitals_index++)
        {
            if (capitals[capitals_index] == input_string[input_index] 
                && capital_found < input_index - 1
                && number_found < input_index - 1)
            {
                capital_found = input_index;
                //printf("Found a capital character (%c), in position %u. Adding a space.\n", input_string[i], i);
                output[output_index] = ' ';
                output_index++;
                output[output_index] = input_string[input_index];
            }
        }

        for (int numbers_index = 0; numbers[numbers_index] != '\0'; numbers_index++)
        {
            if (numbers[numbers_index] == input_string[input_index] 
                && capital_found < input_index - 1
                && number_found < input_index - 1)
            {
                number_found = input_index;
                output[output_index] = ' ';
                output_index++;
                output[output_index] = input_string[input_index];
            }
        }
        output[output_index] = input_string[input_index];
        output_index++;
    }
    output[output_index] = '\0';

    return output;
}

With the above, simple examples like

"AnotherPieceOfTextWithoutSpaces" 

are correctly converted to

"Another Piece Of Text Without Spaces"

but more complex ones, like

"A10TankKiller2Disk" 

are not - it returns

"A1 0Tank Killer 2Disk" 

in that case.

So the question is: why am I getting spaces in positions where I don't want them, and not getting them where I do want them (based on the rules I mentioned above)?

Any pointers to the right direction would be greatly appreciated! :)

MiDWaN
  • Can you use Flex&Bison? – Juan Aug 16 '18 at 08:55
  • I don't understand why "A10TankKiller2Disk" should be returned as "A10 Tank Killer 2 Disk" rather than "A 10 Tank Killer 2 Disk". Your specification concerning the handling of numbers looks incomplete. But anyway, the question is too broad. – Jabberwocky Aug 16 '18 at 09:01
  • Hint: I think you should rewrite the function from scratch and introduce the two helper functions (which you need to write) `int IsCapitalLetter(char c)` and `int IsDigit(char c)`. – Jabberwocky Aug 16 '18 at 09:04
  • The first character of the input string is always skipped (see related line in the sample code), that's why A10 Tank Killer 2 Disk should look like that ("A" will be the first character, and "1" will be the first number, so no spaces will be added). I'm looking at writing helper functions at the moment actually, thanks – MiDWaN Aug 16 '18 at 09:07
  • @MiDWaN If the sequence is "aaABbb", what is the result? And "Mc10MM"? "AAMcMM"? If you're having real trouble writing your function, maybe start looking into "finite state machines". Even if you don't use one, it can help you find a clean approach. – Tom's Aug 16 '18 at 10:33
  • @Tom I got it working by rewriting the detection logic (using C functions for that), check my answer below ;) – MiDWaN Aug 16 '18 at 11:31
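
The two hints in the comments above (helper functions and a finite state machine) can be combined into a rough sketch like the one below. It is only an illustration under some assumptions: the helpers IsCapitalLetter and IsDigit use the names suggested in the comment but are hand-written here and assume an ASCII-compatible character set, space_words and char_class are made-up names, and the first two characters are never preceded by a space, which matches the "A10" example in the question. The "Mc" rule is not handled.

```c
#include <stddef.h>

/* Hypothetical helpers; they assume an ASCII-compatible character set. */
static int IsCapitalLetter(char c) { return c >= 'A' && c <= 'Z'; }
static int IsDigit(char c) { return c >= '0' && c <= '9'; }

/* The state is simply the class of the previous character. */
enum char_class { CLASS_OTHER, CLASS_CAPITAL, CLASS_DIGIT };

static enum char_class classify(char c)
{
    if (IsCapitalLetter(c))
        return CLASS_CAPITAL;
    if (IsDigit(c))
        return CLASS_DIGIT;
    return CLASS_OTHER;
}

/* Copies `in` to `out`, inserting a space wherever a run of capitals
   or a run of digits begins. `out_size` is the capacity of `out`.
   Returns 0 on success, -1 if `out` is too small. */
int space_words(const char *in, char *out, size_t out_size)
{
    enum char_class state = CLASS_OTHER;
    size_t o = 0;
    size_t i;

    for (i = 0; in[i] != '\0'; i++) {
        enum char_class next = classify(in[i]);
        /* Assumption: indexes 0 and 1 never get a space, so "A10" stays "A10". */
        int space = (i >= 2 && next != CLASS_OTHER && next != state);

        /* Room is needed for the optional space, this character and the terminator. */
        if (o + (size_t)space + 1 >= out_size)
            return -1;
        if (space)
            out[o++] = ' ';
        out[o++] = in[i];
        state = next;
    }
    if (out_size == 0)
        return -1;
    out[o] = '\0';
    return 0;
}
```

Because the caller supplies the buffer, nothing is allocated here; a buffer of twice the input length plus one byte is always large enough.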

1 Answer

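Edit: The detection of capitals and digits was more complex than necessary. It also inserted a space in the wrong position when a digit followed another digit, and missed the space when a capital letter followed a digit. The cause was that capital_found and number_found were only updated when a space was actually inserted, and each check tested both of them. For "A10TankKiller2Disk", the '1' was therefore never recorded as a digit, so the '0' was treated as the start of a new number.

I rewrote the function using two standard functions that the platform supports, isdigit() and isupper() (both declared in <ctype.h>). This seems to make it work for now: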

#include <ctype.h>   /* isdigit, isupper */
#include <stdlib.h>  /* malloc */
#include <string.h>  /* strlen */

char* add_spaces_to_string(const char* input)
{
    // Size the buffer from the input length: worst case is one space
    // per character, plus the terminator
    char* output = (char*)malloc(strlen(input) * 2 + 1);
    if (output == NULL)
        return NULL;

    // Special case for the first character, we don't touch it
    output[0] = input[0];
    if (input[0] == '\0')
        return output;

    unsigned int output_index = 1;
    unsigned int input_index = 1;
    unsigned int capital_found = 0;  // index of the most recent capital seen by the loop
    unsigned int number_found = 0;   // index of the most recent digit seen by the loop
    while (input[input_index])
    {
        // The <ctype.h> functions need a value in the unsigned char range
        unsigned char c = (unsigned char)input[input_index];

        if (isdigit(c))
        {
            // Add a space unless the previous character was also a digit
            // (the test is never true at index 1)
            if (number_found < input_index - 1)
            {
                output[output_index] = ' ';
                output_index++;
            }
            number_found = input_index;
        }
        else if (isupper(c))
        {
            // Add a space unless the previous character was also a capital
            // (the test is never true at index 1)
            if (capital_found < input_index - 1)
            {
                output[output_index] = ' ';
                output_index++;
            }
            capital_found = input_index;
        }

        output[output_index] = input[input_index];
        output_index++;
        input_index++;
    }
    output[output_index] = '\0';

    return output;
}

I still need to add a special rule for the "Mc" case, but that's a minor issue for me and I'll add it later.
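
For the "Mc" rule, here is a minimal sketch rather than a finished implementation. It assumes that looking back two characters is enough: a capital that comes directly after "Mc" gets no space. The names needs_space_before and add_spaces_with_mc are made up for this example. The other rules follow the function above: no space before the first two characters, and a space at the start of each run of capitals or digits.

```c
#include <ctype.h>   /* isdigit, isupper */
#include <stdlib.h>  /* malloc */
#include <string.h>  /* strlen */

/* Hypothetical helper: returns 1 if a space belongs before s[i]. */
static int needs_space_before(const char *s, size_t i)
{
    unsigned char cur, prev;

    /* The first two characters are never preceded by a space. */
    if (i < 2)
        return 0;

    cur = (unsigned char)s[i];
    prev = (unsigned char)s[i - 1];

    /* A digit starts a new word unless it continues a number. */
    if (isdigit(cur))
        return !isdigit(prev);

    if (isupper(cur)) {
        /* Consecutive capitals such as "APB" stay together. */
        if (isupper(prev))
            return 0;
        /* Assumed "Mc" rule: a capital right after "Mc" stays attached. */
        if (prev == 'c' && s[i - 2] == 'M')
            return 0;
        return 1;
    }
    return 0;
}

/* Returns a newly allocated string, or NULL if malloc() fails. */
char *add_spaces_with_mc(const char *input)
{
    size_t len = strlen(input);
    size_t out = 0;
    size_t i;
    /* Worst case: one space per character, plus the terminator. */
    char *output = malloc(len * 2 + 1);

    if (output == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        if (needs_space_before(input, i))
            output[out++] = ' ';
        output[out++] = input[i];
    }
    output[out] = '\0';
    return output;
}
```

With this, "ScroogeMcDuckIsFilthyRich" comes out as "Scrooge McDuck Is Filthy Rich". The result is allocated with malloc(), so the caller has to free() it.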

MiDWaN