
This is a pretty loosely fitting title I'll admit.

I'm doing an exercise where we are tasked with creating a program that removes trailing whitespace and deletes blank lines. One of the solutions posted online received praise, but when I run the code it doesn't seem to do anything. Here is the solution I'm referring to, with a brief excerpt that I'm confused by:

The program specification is ambiguous: does "entirely blank lines" mean lines that contain no characters other than newline, or does it include lines composed of blanks and tabs followed by newline? The latter interpretation is taken here.

So he's saying that the code should remove blank lines, but I don't think it does, at least not when I run it. I also don't know how you would enter a newline as input: when you press Enter it doesn't skip a line, it takes your input and continues to run the program. He goes on to mention "blanks and tabs followed by a newline", but again, what is a newline when Enter just runs the program? And if you try to leave lines blank by pressing Tab or Space a bunch, the program doesn't delete those lines; it prints out the same blank line. So I really don't know what his program is doing or what he's talking about.

#include <stdio.h>
#include <stdlib.h>

#define MAXQUEUE 1001

int advance(int pointer)
{
  if (pointer < MAXQUEUE - 1)
    return pointer + 1;
  else
    return 0;
}

int main(void)
{
  char blank[MAXQUEUE];
  int head, tail;
  int nonspace;
  int retval;
  int c;

  retval = nonspace = head = tail = 0;
  while ((c = getchar()) != EOF) {
    if (c == '\n') {
      head = tail = 0;
      if (nonspace)
        putchar('\n');
      nonspace = 0;
    }
    else if (c == ' ' || c == '\t') {
      if (advance(head) == tail) {
        putchar(blank[tail]);
        tail = advance(tail);
        nonspace = 1;
        retval = EXIT_FAILURE;
      }

      blank[head] = c;
      head = advance(head);
    }
    else {
      while (head != tail) {
        putchar(blank[tail]);
        tail = advance(tail);
      }
      putchar(c);
      nonspace = 1;
    }
  }
  return retval;
}

Apart from running the code, I've also stepped through it with the debugger, and I find it kind of confusing. But I don't want to get into that until I can figure out what it's trying to accomplish, or why it's being praised when it seemingly doesn't accomplish anything.

Adrian Mole
  • 3
    *I don't know how you would enter a new line as input*. You may be getting confused by the terminal's echo. It will be clearer for you if you put your test input into a file and then redirect the file as stdin into the program. For example, on Linux: `./my_executable < my_test_file.txt`. Here's another example using an online compiler; look at the stdin and stdout sections at the bottom: https://ideone.com/9hy42K. – kaylum Feb 20 '21 at 03:11
  • 1
    This `advance()` function is really perplexing. – tadman Feb 20 '21 at 03:26
  • 2
    I'm not sure what's to praise in the posted code. My first impression is that it's a fine example of code that looks superficially reasonable, but is actually too clever, and difficult to read because of poorly-chosen names and no comments. In production environments when you see code like this you sarcastically say, "job security". (It's not *that* bad, but it took longer to understand than it should have.) – Steve Summit Feb 20 '21 at 11:20
  • 1
    How to understand it: in any text-processing task like this, you have a choice between reading and processing input a line at a time or a character at a time. This problem would be easier to solve by reading a line at a time, but then you have the problem of mishandling input with lines longer than some fixed-size line buffer. (There are ways around that, too, but they're more work.) Reading a character at a time is good, because it lets you handle arbitrarily-long lines, *but* it gives you tunnel vision; you can't look ahead or behind in the input to make any decisions. – Steve Summit Feb 20 '21 at 11:25
  • 2
    The problem is that when you read a space or tab in isolation, you don't know if it will be followed by a printing character. You can't output it, but you can't safely discard it, so you have to temporarily save it in a buffer or queue. When you find a printing character, you print any queued-up whitespace characters, but if you find a newline, that's when you discard the queued-up characters (and perhaps the newline itself). – Steve Summit Feb 20 '21 at 11:28
  • 1
    The confusing part is that this program uses, for no good reason, a [*circular buffer*](https://en.wikipedia.org/wiki/Circular_buffer) to store the queued-up whitespace characters. Circular buffers have their uses, but they can be confusing and difficult to get right. This program uses an oddly-named auxiliary function to help with one part of the buffer management. – Steve Summit Feb 20 '21 at 11:30
  • 1
    But anyway, the program's algorithm is: *Read characters until EOF. For each character: If it's a whitespace character, enqueue it. If it's a newline, and if we saw any characters on this line, output it. Finally, if it's any other character, first output any enqueued whitespace, then output the character, and set a flag recording that we saw a nonblank character on this line.* – Steve Summit Feb 20 '21 at 11:36
  • 1
    If you're on a system with a Unix-like command line, you can use `cat -e` on the input or output file, or pipe the program's output into `cat -e`, to mark the line endings visibly, to make it easy to see if the program is doing its job correctly. – Steve Summit Feb 20 '21 at 11:40
  • 1
    I think @SteveSummit is being too kind. This code is bone-jarringly absurd. There is absolutely no benefit to using a circular buffer with a fixed size instead of `fgets`. – William Pursell Feb 20 '21 at 12:42
  • 1
    @TannerChrishop If kaylum's first comment didn't help, if you're still having trouble understanding how the program is reading and processing characters (and how your experience might be confounded by the OS's keystroke editing), you might want to read [these course notes](https://www.eskimo.com/~scs/cclass/notes/sx6b.html), specifically the paragraph beginning "Finally, don't be disappointed". – Steve Summit Feb 20 '21 at 13:39
  • @SteveSummit thank you, this link to the lecture notes is amazing. I'm going to read through the whole course because I'm finding the book quite difficult, or at least the exercises. – TannerChrishop Feb 20 '21 at 18:47
  • @kaylum Very useful command, thanks. – TannerChrishop Feb 23 '21 at 00:37
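
As a rough illustration of the algorithm summarized in the comments above, here is a minimal sketch that keeps the pending blanks and tabs in a plain array instead of a circular buffer. It is not the posted solution: the names `strip_stream`, `pending`, and `MAXPENDING` are made up for this example, and when the array fills up it writes the whole run out at once, whereas the posted code emits the oldest character one at a time.

```c
#include <assert.h>
#include <stdio.h>

#define MAXPENDING 1000 /* hypothetical cap on buffered blanks/tabs */

/* Copy `in` to `out`, dropping trailing blanks/tabs and
 * lines that contain nothing but blanks and tabs. */
void strip_stream(FILE *in, FILE *out)
{
    char pending[MAXPENDING]; /* blanks/tabs seen since the last printing char */
    size_t npending = 0;
    int printed = 0;          /* has anything been written for this line? */
    int c;

    assert(in != NULL && out != NULL);

    while ((c = fgetc(in)) != EOF) {
        if (c == '\n') {
            npending = 0;          /* queued whitespace was trailing: discard */
            if (printed)
                fputc('\n', out);  /* keep the newline only for non-blank lines */
            printed = 0;
        } else if (c == ' ' || c == '\t') {
            if (npending == MAXPENDING) {
                /* Buffer full: give up on this run and emit it unchanged. */
                fwrite(pending, 1, npending, out);
                npending = 0;
                printed = 1;
            }
            pending[npending++] = (char)c;
        } else {
            /* A printing character: the queued whitespace was interior. */
            fwrite(pending, 1, npending, out);
            npending = 0;
            fputc(c, out);
            printed = 1;
        }
    }
}
```

Calling `strip_stream(stdin, stdout)` from `main` and running the program with input redirected from a file (for example `./my_executable < my_test_file.txt`, as suggested above) should make the effect easier to see than typing at the terminal.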
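
For comparison, the line-at-a-time approach with `fgets` mentioned in the comments might look like the sketch below. `rtrim`, `strip_lines`, and `MAXLINE` are hypothetical names chosen for this example. The sketch has the limitation noted above: a line longer than `MAXLINE - 1` characters is read in pieces, so each piece is trimmed and terminated separately. It also always ends an output line with a newline, even if the last input line had none.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXLINE 1024 /* hypothetical line-length limit */

/* Remove trailing blanks, tabs and the newline from s in place.
 * Returns the new length of s. */
size_t rtrim(char *s)
{
    size_t n;

    assert(s != NULL);
    n = strlen(s);
    while (n > 0 && (s[n - 1] == ' ' || s[n - 1] == '\t' || s[n - 1] == '\n'))
        n--;
    s[n] = '\0';
    return n;
}

/* Copy `in` to `out` one line at a time, skipping lines
 * that are empty after trimming. */
void strip_lines(FILE *in, FILE *out)
{
    char line[MAXLINE];

    while (fgets(line, sizeof line, in) != NULL) {
        if (rtrim(line) > 0) {
            fputs(line, out);
            fputc('\n', out);
        }
    }
}
```

As with any filter of this kind, `strip_lines(stdin, stdout)` is easiest to check with redirected input rather than interactive typing.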

0 Answers