2

I'm learning C by reading K&R (ANSI edition), supplemented with 21st Century C. I'd say I'm already pretty confident with most of the fundamentals of pointers. That means I know you have to be very careful passing pointers out of a function that weren't either passed into it in the first place or malloced. So this example has me stumped. It's from §5.6, page 109:

#define MAXLEN 1000 /* max length of any input line */
int getline(char *, int);
char *alloc(int);

/* readlines: read input lines */
int readlines(char *lineptr[], int maxlines)
{
   int len, nlines;
   char *p, line[MAXLEN];

   nlines = 0;
   while ((len = getline(line, MAXLEN)) > 0)
      if (nlines >= maxlines || (p = alloc(len)) == NULL)
         return -1;
      else {
         line[len-1]='\0'; /* delete newline */
         strcpy(p, line);
         lineptr[nlines++] = p;
      }
   return nlines;
}

strcpy() was previously defined as:

/* strcpy: copy t to s; pointer version 3 */
void strcpy(char *s, char *t)
{
   while (*s++ = *t++)
      ;
}

I don't understand how the memory pointed to by p can remain in scope once the function returns. Here's what I do understand. The pointer p is declared within the function - so it's in automatic memory. As for what it's pointing to, it's not explicitly assigned any memory. Then when strcpy is called, values are copied from line to wherever p points to.

I've also reached the point in 21st Century C which discusses the dangers of pointers, and gives examples that attempting to pass an array declared in automatic memory out of a function is definitely not OK. (p. 109: "A pointer to a block of memory that has already been automatically freed is worse than useless.") So the only way I can understand the above code being valid is that by declaring p not as an array where a memory block is explicitly allocated in automatic memory, but rather as a pointer, its memory allocation is somehow implicitly handled in some other way. Is that correct? How does the memory allocation work?

Note: By reading 21st Century C I'm attempting to neutralise whatever bad practices K&R might introduce me to. I understand that it's not the best standard to go by, and probably I will never end up writing code in the way demonstrated in this example. But I would still like to understand it.

Igid
  • Tangentially to your question's concerns — Be aware that modern POSIX defines [`getline()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/getline.html) which has a different interface altogether from the version in K&R. You can easily rename it; you'll need to rename it if you wish to avoid compilation problems in general on modern systems. It is a problem of the rest of the world changing around the book. – Jonathan Leffler Feb 06 '18 at 15:28
  • Note the `p = alloc(len)`. – user202729 Feb 06 '18 at 15:30
  • @JonathanLeffler Thanks. It's true I'm not trying to compile these examples, and I don't expect I'll ever be writing my own `getline()` function, if I learn anything from _21st Century C_ – Igid Feb 06 '18 at 15:31
  • (some people think that assignment within other statements can be dangerous, so Python can't) – user202729 Feb 06 '18 at 15:31
  • Also note that `||` has short-circuiting behavior. – user202729 Feb 06 '18 at 15:31
  • @user202729 Indeed. It took me writing up this entire question to spot it! – Igid Feb 06 '18 at 15:38
  • @user202729 Yeah, some people noted that it was dangerous practice in the early 1980s. Too late, the book was already in print. The problem was delightfully solved in 1989, with the Turbo C compiler giving a diagnostic for such assignments. In general, the style has long since been mostly abandoned. – Lundin Feb 06 '18 at 15:40

3 Answers

5

Indeed, the pointer p itself resides in automatic memory. However, it is assigned a memory block during the evaluation of the condition of the following if statement:

if (nlines >= maxlines || (p = alloc(len)) == NULL)

More specifically, if the first term of the condition is false (since the or logical operator is a short-circuit one), the second term is evaluated which performs a memory allocation and checks if that succeeded.
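Written out as separate steps, the condition behaves roughly like the sketch below. `reserve_line` is a hypothetical helper invented for illustration, and the standard malloc stands in for K&R's alloc.

```c
#include <stdlib.h>

/* reserve_line: hypothetical helper that performs the same test as the
   K&R condition, one step at a time. Returns the new block, or NULL if
   the pointer array is full or the allocation fails.
   Assumption: malloc is used in place of K&R's alloc. */
char *reserve_line(int nlines, int maxlines, int len)
{
    char *p;

    if (nlines >= maxlines)   /* left operand of || is true ...        */
        return NULL;          /* ... so no allocation is attempted     */

    p = malloc(len);          /* right operand: the assignment to p    */
    if (p == NULL)            /* followed by the comparison with NULL  */
        return NULL;

    return p;                 /* p now holds the address of len bytes  */
}
```

In readlines the two failure cases both lead to `return -1`, which is why the book can fold them into a single condition.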

As far as I remember from K&R, the alloc function is a simplified version of malloc, defined earlier in the book. The point of memory allocation functions such as the standard malloc, exemplified here by alloc, is to let you reserve memory on the heap. The heap is a memory region available during the execution of the program that is not automatically managed, i.e. the compiler does not do any operations on it unless you explicitly tell it to. Thus, you can return the pointer p or pass it by value. The memory of the variable itself is not of much use; its value is what matters. It points to an allocated memory region, which is going to stay there for as long as you want (until you explicitly free it). Just make sure you don't lose the address of this region before you free it.
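For what it's worth, my recollection is that the alloc in K&R §5.4 does not call malloc at all: it hands out consecutive pieces of one large static array. A sketch along those lines (written from memory, so treat the details as an approximation) shows why the storage still outlives readlines: the array has static storage duration, so it exists for the whole run of the program.

```c
#define ALLOCSIZE 10000                /* size of available space */

static char allocbuf[ALLOCSIZE];       /* storage handed out by alloc */
static char *allocp = allocbuf;        /* next free position */

/* alloc: return a pointer to n characters, or 0 if there is no room */
char *alloc(int n)
{
    if (allocbuf + ALLOCSIZE - allocp >= n) {  /* it fits */
        allocp += n;
        return allocp - n;                     /* old value of allocp */
    } else                                     /* not enough room */
        return 0;
}
```

Either way, the block that p points to is not part of readlines' automatic storage, so returning or storing its address is fine.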

Paul92
  • Thank you, yes that's exactly right. I confess... I saw the answer pretty much at the exact moment I was clicking 'Post Question'. But I thought it would be useful for this to exist as reference for someone in the same pickle. **K&R seem to _looove_ conditionals with side-effects.** This was my first object lesson in what an utter headache it can be to bury important assignments like that. – Igid Feb 06 '18 at 15:37
  • Indeed, it's not the most readable code. On the other hand, it develops your skills to read code like this, and trust me, most people don't write the most readable code. – Paul92 Feb 06 '18 at 15:38
  • Imagine if K&R would have [practised what they preached](http://www.azquotes.com/picture-quotes/quote-debugging-is-twice-as-hard-as-writing-the-code-in-the-first-place-therefore-if-you-write-brian-kernighan-66-91-06.jpg). This lovely quote soundly dismisses almost every single code example in _The C Programming Language_. – Lundin Feb 06 '18 at 15:54
1

p is a local variable whose value is the address of a memory area allocated on the heap. The heap is not local memory; it is not part of any stack frame. Heap-allocated objects are never automatically collected in classical C, so they persist across calls and returns until they are manually freed. Do not confuse heap-allocated objects with local arrays that reside on the stack; pointers to those should not be passed out of the function.
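A minimal sketch of the contrast, using the standard malloc rather than K&R's alloc. Both function names are made up for illustration, and the broken variant is kept inside a comment so that nothing here relies on undefined behaviour.

```c
#include <stdlib.h>
#include <string.h>

/* NOT OK (shown as a comment only):
 *
 *   char *broken_copy(const char *s)
 *   {
 *       char buf[100];      // automatic storage
 *       strcpy(buf, s);
 *       return buf;         // buf no longer exists after the return
 *   }
 */

/* OK: the block comes from malloc, so it stays valid after the function
   returns and until the caller passes it to free(). */
char *heap_copy(const char *s)
{
    char *p = malloc(strlen(s) + 1);   /* +1 for the terminating '\0' */

    if (p != NULL)
        strcpy(p, s);
    return p;                          /* NULL if the allocation failed */
}
```

The local variable p disappears when heap_copy returns, just as it does in readlines; only its value, the address of the allocated block, is handed back.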

bipll
  • Note that the C standard does not define things in terms of heaps or stacks. When explaining C, more accurate terms are “allocated storage” and “automatic storage.” – Eric Postpischil Feb 06 '18 at 15:32
  • Sure, thank you. To me, these are well-known colloquial terms that do describe the situation somehow appropriately. – bipll Feb 06 '18 at 15:35
0

Side notes, should anyone attempt to compile the aforementioned example: in

while ((len = getline(line, MAXLEN)) > 0)
  • the function getline() is indeed the one defined in § 1.9 of the same book, and not the one included in modern iterations of stdio.h

  • the > 0 here should in fact be > 1 , as the K&R version of getline() returns a length of 1 for the empty string, that is the character array { '\0' }

Also,

if (nlines >= maxlines || (p = alloc(len)) == NULL)

works with the standard malloc, too.
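A sketch of how the pieces might look when adapted for a current compiler. The reader is renamed `get_line` (a name chosen here) so it cannot clash with the POSIX `getline()`, malloc replaces alloc, and both functions take a `FILE *` argument, which is an addition made for easier testing; the book reads standard input through getchar(). The loop condition is kept as printed in the book.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXLEN 1000 /* max length of any input line */

/* get_line: read a line from fp into s; return its length including the
   newline, or 0 at end of input. The FILE * parameter is an assumption. */
int get_line(FILE *fp, char *s, int lim)
{
    int c = EOF, i;

    for (i = 0; i < lim - 1 && (c = getc(fp)) != EOF && c != '\n'; ++i)
        s[i] = c;
    if (c == '\n')
        s[i++] = c;
    s[i] = '\0';
    return i;
}

/* readlines: read input lines, storing each one in its own malloc block */
int readlines(FILE *fp, char *lineptr[], int maxlines)
{
    int len, nlines = 0;
    char *p, line[MAXLEN];

    while ((len = get_line(fp, line, MAXLEN)) > 0) {
        if (nlines >= maxlines || (p = malloc(len)) == NULL)
            return -1;
        line[len - 1] = '\0';   /* delete newline, as in the book */
        strcpy(p, line);
        lineptr[nlines++] = p;  /* the block outlives this function */
    }
    return nlines;
}
```

Each stored pointer should eventually be passed to free() by the caller, something the alloc-based original never has to do.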

Éric Viala