1

I can't figure out this error.

Problem > Write a function to find the longest common prefix string amongst an array of strings. If there is no common prefix, return an empty string "".

char * longestCommonPrefix(char ** strs, int strsSize){
    int i = 1;
    int j = 0;
    int k = 1;
    int n = 0;
    
    if (strsSize == 0) return ("");
    if (strsSize == 1) return (strs[0]); 
    if (strsSize > 1 && strcmp(strs[0],"") == 0) return ("");
    
    char *s = calloc(strlen(strs[0]), sizeof(char));
    char *temp = calloc(strlen(strs[0]), sizeof(char));
    if (!s || !temp) return (0);
    
    // while (strs[0][j] != '\0') {
    //     temp[j] = strs[0][j];
    //     j++;
    // }
    strcpy(temp, strs[0]);
    
    while (k < strsSize) {
        j = 0;
        n = 0;
        memset(s, 0, strlen(s));
        while (strs[i][j]){
            if (temp[j] == strs[i][j]) {
                s[n] = strs[i][j];
                n++;
            }
            else if (temp[j] != strs[i][j]){
                break;
            }
            j++;
        }
        strcpy(temp, s);
        k++;
        i++;
    }
    return (s);
}

It works without any problem on my visual studio but when I submit this code to Leetcode, it always has that error message. I assume error occurs on either line 7 or 8 because it dose not show same error when I hide those two line of codes. Also, when I copy strs[0] string to temp using while loop, it works fine on both Leetcode and visual studio but it does not work when I use "scrcpy" function on Leetcode but again, it works on my visual studio.

Another minor problem. The reason that I use calloc to allocate the memory is because It shows same error message "Address sani~" when using malloc. It does not matter on visual studio. Both calloc and malloc work ok.

I've added an extra memory for NULL terminator.

char *s = malloc((strlen(strs[0]) + 1) * sizeof(char));

It seems alright in my head but It has same memory issue.

I've tried many different ways but I still have issue with the way I use malloc and clear a string.

int len = strlen(strs[0]);
char *temp = malloc((len+1) * sizeof(char)) 
//it still gives me a memory error. 

When I clear my string with memset, it returns a correct output but it does not return the right answer when I clear the string simply using NULL terminator. Is there a difference?

s[j] = '\0';
memset(s, 0, strlen(s));
Hans
  • 49
  • 1
  • 7
  • 2
    You're not allocating enough bytes for the string _and_ its NUL-terminator byte. As a result, let's say your string is `"hello"` ... well, you allocate 5 bytes and store this pointer in `temp`, then you call `strcpy` which copies 6 bytes into that memory, which overflows it. The address sanitizer rightly detects this overflow. You should get in the habit of using `strcpy_s`, which forces you to understand how much memory you actually have to work with. – paddy Feb 10 '22 at 06:08
  • @paddy Thank you! It's immediately fixed when I change it to "char *s = calloc(strlen(strs[0])+1, sizeof(char))". But when I try "char *s = malloc((strlen(strs[0]) + 1)*sizeof(char))", it seems alright in my head but still get the same error. +1 for a NULL terminator! – Hans Feb 10 '22 at 07:00
  • Welcome to StackOverflow! Please take the [tour] to learn how this site works. -- `malloc()` does not initialize the memory. So the space can contain anything, and the later `strlen(s)` returns a generally random number, probably greater than you allocated. -- BTW, please [edit] your question to add new information. This is not a forum. – the busybee Feb 10 '22 at 07:49
  • @thebusybee Hi! I don't understand why strlen(s) would return a random number because it returns the length of a string. I run it to see what it returns and it returned the length. It's my first time posting a question stackoverflow and I don't know how I should edit my questions by adding new information? – Hans Feb 10 '22 at 07:57
  • There is no valid string where `s` points to. A string is defined as a nul-terminated character sequence. Without you adding the terminator byte it is only a "random" sequence of characters. If the next location of a 0 byte in memory is 200MB away, `strlen` will happily walk all the way through your memory. – Gerhardh Feb 10 '22 at 07:58
  • See these grey words below your question? One of them is "Edit", and they are all links. -- There are plenty of pages in the [help]. – the busybee Feb 10 '22 at 08:00
  • `strlen` does not know anything about how you allocated memory -- it simply starts looking at the first address, and counts how many bytes occur until it encounters the terminator. The difference between `calloc` and `malloc` is that one initializes all the bytes to zero, and the other does no initialization. So, if you allocate `s`, never initialize it with anything and then try to clear it via `memset` using `strlen(s)`, bad stuff happens. You actually don't need to use `memset` at all here. Simply `s[n] = '\0'` just after your inner while-loop is enough to complete the string. – paddy Feb 10 '22 at 08:03
  • One other point is that `else if (temp[j] != strs[i][j])` is duplicating the logic of the original `if` and should instead just be `else` to assist readability. Actually, using the branching and breaking here is too much. Your whole loop could be reduced to `while (strs[i][j] && temp[j] == strs[i][j]) s[n++] = strs[i][j++];` – paddy Feb 10 '22 at 08:09
  • @paddy Hi Paddy and I appreciate your help a lot. You're right. if-else and break statement was unnecessary. When I clear a string this way s[0] = '\0', I think there's still garbage value which makes function returns unexpected output. So I just use memset function and specify bytes size 'memset(s, 0, strlen(temp))'. Still while calloc is working fine, malloc is not working well. Do you think It's because malloc does not initialize? I'm still working on it. Thank you – Hans Feb 10 '22 at 10:09
  • Your code relies on the entire string buffer being zeroed, because you're not terminating it after the loop like I recommended. Because you're using the string length in that memset call, you're writing 1 byte too few, so you essentially have the same problem as you did at the beginning. All you do there is zero all the values that you're about to overwrite in the loop that follows, but leaving the most important byte uninitialized (if you use malloc). – paddy Feb 10 '22 at 11:04
  • @paddy Hi, paddy. I'm still working on the same issue. If you don't mind, would you be okay to post your answer instead of comment section, so that I can give likes? I think it will also help me post a new question! :) – Hans Feb 11 '22 at 04:59

1 Answers1

2

The main difference between malloc and calloc is that malloc only reserves memory but doesn't modify its contents, whereas calloc reserves memory and zero-initializes every byte.

So, what's happening inside your loop where you begin copying string prefixes into s is you did this:

memset(s, 0, strlen(s));

Now, you thought that you just cleared the string, but at this point it's the first time you've done anything with s at all. So if s was allocated with malloc then calling strlen(s) could do anything. Very likely the uninitialized memory contents will not be a NUL-terminated string, and strlen will happily search past the end of that block of memory and into other parts of memory that your program did not request.

This is known as Undefined Behavior.

So, what can you do about it? Well, in the comments you tried this:

s[0] = '\0';

Great! But now every other byte of the string is not initialized. You're about to loop over strs[i] and copy stuff into s. The problem is that afterwards you never copied in a NUL-terminator, so once again you have an un-terminated string. And, to be specific, the place where that terminator would have gone is still uninitialized.

As I already said very early on in comments, all you need to do is write that terminator after copying. You don't really need two variables j and n because they're going to be the same value anyway. Let's just use j. It makes the code simpler.

j = 0;
while (strs[i][j] && temp[j] == strs[i][j])
{
    s[j] = strs[i][j];
    j++;
}
s[j] = '\0';

That's all. You have a terminated string. You didn't need to pre-initialize it, because you were about to overwrite that memory anyway. The important part is the line after the loop.


Now, let's think a bit about what you're really trying to achieve here.

The task is to search all strings in the array and find the longest common prefix. And looking at your code, not only is it doing too much work, it's actually not even achieving the required task. Hint: you never used the variable k.

So why not take a step back and think about it differently? Do you actually need to copy prefixes over and over? No, you don't. All you need to do is see how many characters they have in common. And, you don't even need to check all the characters -- you only need to check up to whatever maximum count you've already found.

// Find longest common prefix
int commonPrefixLength = strlen(strs[0]);
for (int i = 1; i < strsSize; i++)
{
    int p = 0;
    while (p < commonPrefixLength && strs[i][p] == strs[0][p]) ++p;
    commonPrefixLength = p;
}

// Copy prefix to a new string
char *commonPrefix = malloc(commonPrefixLength + 1);
memcpy(commonPrefix, strs[0], commonPrefixLength);
commonPrefix[commonPrefixLength] = 0;

That's all you need to do. You don't even need to check for the NUL terminator when searching the strings because you've already bounded the search by the longest prefix and it's guaranteed that if you encounter a NUL-terminator then it won't be equal to the character you're testing in strs[0]. Nice.

However we have another issue, and that is your function is not consistent about what it returns. You wrote some special cases that might return strs[0] or a string literal "". Otherwise it allocates new memory and returns that. This is a problem, because the caller does not know whether it is supposed to free this memory or not.

So you should decide. The options are:

  1. Function either allocates or returns NULL -- caller always responsible for freeing memory.
  2. Function never allocates -- returns NULL or modifies str[0] and returns that

If you choose 2, then instead of allocating and copying stuff after the loop, you'd just terminate the string. However, you can only do this if input array was not string literals:

strs[0][commonPrefixLength] = '\0';

So the safest choice is probably option 1. But here are both...

Option 1:

char * longestCommonPrefix(char ** strs, int strsSize)
{
    if (strsSize == 0)
        return NULL;

    int commonPrefixLength = strlen(strs[0]);
    for (int i = 1; i < strsSize; i++)
    {
        int p = 0;
        while (p < commonPrefixLength && strs[i][p] == strs[0][p]) ++p;
        commonPrefixLength = p;
    }

    char *commonPrefix = malloc(commonPrefixLength + 1);
    if (commonPrefix != NULL)
    {
        memcpy(commonPrefix, strs[0], commonPrefixLength);
        commonPrefix[commonPrefixLength] = 0;
    }
    return commonPrefix;
}

Option 2:

char * longestCommonPrefix(char ** strs, int strsSize)
{
    if (strsSize == 0)
        return NULL;

    int commonPrefixLength = strlen(strs[0]);
    for (int i = 1; i < strsSize; i++)
    {
        int p = 0;
        while (p < commonPrefixLength && strs[i][p] == strs[0][p]) ++p;
        commonPrefixLength = p;
    }

    strs[0][commonPrefixLength] = '\0';
    return strs[0];
}

Live demo: https://godbolt.org/z/bebs7zz6j

paddy
  • 60,864
  • 6
  • 61
  • 103
  • WOW. Paddy you rock. After terminating the string, It magically works. I knew that a string should be ended with null terminator but apparently when it comes to solve a real problem, the knowledge that I thought I had was gone. You're right. There are unnecessary variables that make code look only clutter. I didn't have to declare and allocate two separate strings and copy etc. I feel like I can finally breathe after having stuffy a nose for 2 days. Thank you for your teaching :) – Hans Feb 11 '22 at 06:54
  • You're welcome. If you read back through the comments from yesterday, you'll realize that I told you how to terminate the string in my second comment, but I guess you didn't quite grasp what "after the inner while-loop" means. If this answer helped you, it would be courteous to upvote and accept it, especially since you requested I write the answer in the first place. – paddy Feb 11 '22 at 07:59
  • yeah I did not understand "After the inner while loop" part. I can't upvote right now because it says that I need at least 15 reputation to do so. I will definitely come back to upvote it once I get 4 more reputation. Thank you for not only answering my question but also showing me a better and more efficient way of code. – Hans Feb 11 '22 at 21:15