I am recreating the entire standard C library and I'm working on an implementation of strlen that I would like to be the basis of all my other str functions.
My current implementation is as follows:
    int ft_strlen(char const *str)
    {
        int length;

        length = 0;
        while (str[length] != '\0' || str[length + 1] == '\0')
            length++;
        return length;
    }
My question arises when I pass a string like:
    char str[6] = "hi!";
As expected, the memory reads:
['h']['i']['!']['\0']['\0']['\0']
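This layout can be checked without leaving the array, since C zero-fills the remaining elements when a string initializer is shorter than the array. A minimal sketch; `ft_count_nuls` is a hypothetical helper introduced only for illustration, and it assumes the caller passes the true size of the buffer:

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the original code): counts the '\0'
 * bytes in buf[0 .. cap - 1]. It never reads at or past index cap.
 */
size_t ft_count_nuls(const char *buf, size_t cap)
{
    size_t count = 0;

    for (size_t i = 0; i < cap; i++) {
        if (buf[i] == '\0')
            count++;
    }
    return count;
}
```

Called as `ft_count_nuls(str, sizeof str)` in the scope where `str` is declared, it reports how many of the array's bytes are null terminators.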
If you look at my implementation, you can see that I expect a return value of 6, as opposed to 3 (my previous approach), so that strlen can account for any extra allocated memory.
The catch is that I have to read 1 byte past the initialized memory for my last loop condition to fail at the final null terminator, which is the behavior I WANT. However, this is generally considered bad practice, and by some an automatic error.
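To make the out-of-bounds read concrete, here is a sketch that replays the same loop condition on a buffer of known size but checks each index before touching it. `ft_first_oob_index` is a hypothetical helper, not part of my implementation, and it assumes the caller knows the buffer's capacity:

```c
#include <stddef.h>

/*
 * Hypothetical helper: replays the loop condition
 *     buf[length] != '\0' || buf[length + 1] == '\0'
 * but checks every index before reading it. Returns the first index
 * >= cap that the loop would have to read, or -1 if the loop ends
 * while still inside the buffer.
 */
long ft_first_oob_index(const char *buf, size_t cap)
{
    size_t length = 0;

    for (;;) {
        if (length >= cap)
            return (long)length;           /* buf[length] is out of bounds */
        if (buf[length] == '\0') {
            /* Left operand of || is false, so buf[length + 1] is read too. */
            if (length + 1 >= cap)
                return (long)(length + 1); /* buf[length + 1] is out of bounds */
            if (buf[length + 1] != '\0')
                return -1;                 /* loop ends inside the buffer */
        }
        length++;
    }
}
```

For `char str[6] = "hi!";`, `ft_first_oob_index(str, sizeof str)` returns 6, which is the index of the byte immediately after the array.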
Is reading outside of your initialized memory a bad idea even when you very specifically intend to read a junk value (to ensure it DOES NOT contain '\0')?
If so, why?
I understand that:
"buffer overruns are a favorite avenue for attacking secure programs"
Still, I can't see the problem if I'm simply trying to ensure I've hit the end of initialized values...
Also, I realize this problem can be avoided. I have already sidestepped it with a value set to 1 and then only reading initialized values. That's not the point; this is more of a fundamental question about C, runtime behavior and best practices ;)
[EDITS:]
Comment to previous post:
OK. Fair enough - but as to the question "Is it always a bad idea (danger from intentional manipulation or runtime stability) to read after initialized values" - do you have an answer? Please read the accepted answer for an example of the nature of the question. I really don't need this code fixed, nor do I need a better understanding of data types, POSIX specs or common standards. My question is related to WHY such standards may exist - why it may be important to never read past initialized memory (if such reasons exist)? What is the potential fallout of reading past initialized values IN GENERAL?
Please all - I'm trying to better understand aspects of how systems operate and I have a VERY SPECIFIC question.