I have an in-memory array created with int arraydata[100] and I want to sort it using the Linux sort command. Ideally I would like a simple one-line solution that looks like system("ls -lh >/dev/null 2>&1"); where I pipe the array to the appropriate fd and run sort on it, but the examples I have seen use named files like file.txt.

The question of how to run Linux sort from C was discussed in How to call UNIX sort command on data in pipe, but in that question he was forking a new process, which I do not want to do.

Can I run sort with a simple one-line program using system()? If not, what would be the lowest-overhead way to accomplish running sort on my array?

John Bollinger
RTC222
  • 4
    Do you mean [`qsort()`](https://en.cppreference.com/w/c/algorithm/qsort) or do you mean `sort` as in the sort *command*? If you're doing C you can read directories without having to open external commands. – tadman Aug 22 '23 at 18:23
  • No, I want to use the Linux sort command. – RTC222 Aug 22 '23 at 18:31
  • 2
    What is Linux `sort()`? There is a `sort` command, in which case there are no `()`; it's not a function but a command. – KamilCuk Aug 22 '23 at 18:32
  • 1
    When you say `sort()` you mean, specifically, a function. Saying `sort` as in command is a whole different thing. In the POSIX world there is a very strong convention in use here to differentiate the two things, as otherwise there's endless confusion as many commands have function counterparts with the same name. – tadman Aug 22 '23 at 18:32
  • 4
    Really? Why in the world would you sort in-memory data via the external `sort` command? – John Bollinger Aug 22 '23 at 18:32
  • I want to use sort. Sorry for my misuse of (). See https://www.man7.org/linux/man-pages/man1/sort.1.html – RTC222 Aug 22 '23 at 18:33
  • 1
    Do you want to sort the array __or__ do you want to _output_ a sorted array? The first is _a lot_ more complex with the `sort` command than the latter. – KamilCuk Aug 22 '23 at 18:34
  • 3
    To run your data through the `sort` command, you absolutely *will* fork. If you use `system()` to run the command then the fork will just be inside that. – John Bollinger Aug 22 '23 at 18:34
  • 7
    Why do you want to use the `sort` command and not the `qsort` routine? This is an important part of the question because, if the answer is, say, that you have a school assignment requiring `sort`, then the assignment is to teach you about features like pipes and commands, and the assignment specifics should be stated, and the answer needs to speak to those issues. On the other hand, if you do not have such a specific assignment, then the answer is it is an absurdly wrong approach to seek to use the `sort` command to sort in-memory data, and you ought to use `qsort` or, preferably, `qsort_r`. – Eric Postpischil Aug 22 '23 at 18:35
  • Is it a mistake because there is too much overhead involved? – RTC222 Aug 22 '23 at 18:36
  • 4
    Using `sort` has a massive amount of overhead compared to calling `qsort_r`. It is like the difference between driving to the store for flour instead of using what is in your pantry. – Eric Postpischil Aug 22 '23 at 18:36
  • 2
    There is an incredible amount of overhead involved in spawning and reading input and output of an external program compared to simple memory operations required to sort. – KamilCuk Aug 22 '23 at 18:37
  • @Eric Postpischil - thanks very much for that. I will look into using qsort_r. When I want to sort a large file, then it looks like sort would be appropriate, but not for a small amount of data. – RTC222 Aug 22 '23 at 18:38
  • 2
    And aside from *performance* overhead, sending your data out for an external `sort` run is a lot more complicated code-wise. Maybe not Rube Goldberg territory, but heading in that direction. – John Bollinger Aug 22 '23 at 18:38
  • One more question. Is there a file size threshold over which sort would be better / faster than qsort_r? – RTC222 Aug 22 '23 at 18:40
  • 2
    `Is there a file size` What file? You have `int arraydata[100]`; it is in memory, and there are no files. There are many sorting algorithms; whether _your program_ will be faster than some _other_ program called `sort` just depends on the programs' contents. What if the `sort` command just calls `qsort_r`? The question is too broad. – KamilCuk Aug 22 '23 at 18:41
  • 3
    @RTC222, if you already have the data in memory as an array, then no, the `sort` command is strictly inferior to `qsort()` and `qsort_r()` for sorting it. – John Bollinger Aug 22 '23 at 18:41
  • No, I mean if I did have a file to sort, which is much more data. This question is not entirely academic -- it's just that in the current situation I have in-memory data. But when I do have a large file, which would be better, sort or qsort? – RTC222 Aug 22 '23 at 18:42
  • 3
    It's quite possible that `sort` is implemented with `qsort` under the hood. The use of `qsort` is essential learning anyway. – Weather Vane Aug 22 '23 at 18:43
  • 1
    If you already have the data in a file, and you do not intend to load it into an array in a C program, then `qsort` is not really an option in the first place. – John Bollinger Aug 22 '23 at 18:44
  • 4
    @RTC222 It's all about using the right tool for the job. If you have an array of numbers to sort, `qsort()` is the right choice for all the reasons given. To use `sort`, you'll essentially be writing those numbers into a file (a pipe is a kind of file) and then starting another process and then reading the sorted numbers from another file. If the numbers are *already* in a file, then `sort` is likely the better choice because it already does all that reading and writing. If you want good advice, you really need to edit the question to provide some clarity around your *actual* goal. – Caleb Aug 22 '23 at 18:49
  • 2
    Your link explains what `sort` does: it will "sort lines of text files". If you want to sort lines of text files, then it's the perfect tool for the job. If you want to do something else, it's not the right tool for the job. Is your array a text file whose lines you want to sort? If not, then you want a different tool. – David Schwartz Aug 22 '23 at 19:20

2 Answers

2

what would be the lowest-overhead way to accomplish calling sort [command] on my array?

To answer your question: the following program uses the "standard" pipe + fork + dup2 + exec sequence to spawn a sort process and communicate with it. The parent then wraps its ends of the pipes with fdopen so that it can use fprintf/fscanf on them, and finally waits for the child to terminate. The program is far from perfect for several reasons, including its use of assert as cheap error handling.

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

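void sort_them_with_sort_command(size_t arrlen, int arr[arrlen]) {
  int in[2];   // parent writes to in[1]; sort reads in[0] as its stdin
  int err = pipe(in);
  assert(err == 0);
  int out[2];  // sort writes to out[1] as its stdout; parent reads out[0]
  err = pipe(out);
  assert(err == 0);
  pid_t pid = fork();
  assert(pid >= 0);
  if (pid == 0) {
    // Child: connect the pipes to stdin/stdout, then replace itself with sort.
    close(in[1]);
    dup2(in[0], STDIN_FILENO);
    close(in[0]);
    close(out[0]);
    dup2(out[1], STDOUT_FILENO);
    close(out[1]);
    // The argument vector passed to execvp() must end with a null pointer.
    const char *const cmd[] = {"sort", "-n", NULL};
    execvp(cmd[0], (char *const *)cmd);
    _exit(127);  // reached only if execvp() failed
  }
  // Parent: keep only the write end of "in" and the read end of "out".
  close(in[0]);
  FILE *inf = fdopen(in[1], "w");
  assert(inf != NULL);
  close(out[1]);
  FILE *outf = fdopen(out[0], "r");
  assert(outf != NULL);
  for (size_t i = 0; i < arrlen; ++i) {
    err = fprintf(inf, "%d\n", arr[i]);
    assert(err > 0);
  }
  // fclose() also closes the underlying descriptor, so no extra close() here.
  // Closing the write end gives sort the EOF it needs before it emits output.
  fclose(inf);
  for (size_t i = 0; i < arrlen; ++i) {
    err = fscanf(outf, " %d", &arr[i]);
    assert(err == 1);
  }
  fclose(outf);
  waitpid(pid, NULL, 0);
}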

int main() {
  int arraydata[10];
  const size_t arraydatalen = sizeof(arraydata) / sizeof(*arraydata);
  for (size_t i = 0; i < arraydatalen; i++) {
    arraydata[i] = rand();
  }
  printf("Before sorting:\n");
  for (size_t i = 0; i < arraydatalen; i++) {
    printf("%d\n", arraydata[i]);
  }
  sort_them_with_sort_command(arraydatalen, arraydata);
  printf("After sorting:\n");
  for (size_t i = 0; i < arraydatalen; i++) {
    printf("%d\n", arraydata[i]);
  }
  return 0;
}

Program outputs:

Before sorting:
1804289383
846930886
1681692777
1714636915
1957747793
424238335
719885386
1649760492
596516649
1189641421
After sorting:
424238335
596516649
719885386
846930886
1189641421
1649760492
1681692777
1714636915
1804289383
1957747793

Can I run sort with a simple one-line program using system()?

Because system() runs the command in another process and provides no means of communicating with it, you would have to set up the communication between the processes yourself. You could, for example, open a pipe, have the shell command write to one end, and read from the other end in a separate thread of your C program. Either way, this is overkill for sorting.
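For illustration only, here is a minimal sketch of one way to get data to and from a shell command with fewer moving parts. It swaps in a different technique from the one described above: the values are written to a temporary file and the output of `sort -n` is read back through popen(), which still forks a shell internally. The function name `sort_via_popen` and the `/tmp/sortdemoXXXXXX` template are hypothetical, and the sketch assumes a POSIX system with `sort` on the PATH.

```c
#define _POSIX_C_SOURCE 200809L /* for popen(), pclose(), mkstemp(), fdopen() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: sorts arr ascending by running `sort -n` on a temp file.
   Returns 0 on success, -1 on failure (arr may then be partially overwritten). */
int sort_via_popen(size_t arrlen, int arr[])
{
    char path[] = "/tmp/sortdemoXXXXXX";   /* assumed writable temp location */
    int fd = mkstemp(path);
    if (fd < 0) return -1;

    /* Write one value per line, the text format that sort expects. */
    FILE *tmp = fdopen(fd, "w");
    if (tmp == NULL) { close(fd); unlink(path); return -1; }
    for (size_t i = 0; i < arrlen; ++i)
        fprintf(tmp, "%d\n", arr[i]);
    if (fclose(tmp) != 0) { unlink(path); return -1; }

    /* Build the command line; the mkstemp() name contains no shell metacharacters. */
    char cmd[64];
    int n = snprintf(cmd, sizeof cmd, "sort -n %s", path);
    assert(n > 0 && (size_t)n < sizeof cmd);

    /* popen() runs the command through the shell and hands back its stdout. */
    FILE *sorted = popen(cmd, "r");
    if (sorted == NULL) { unlink(path); return -1; }

    size_t got = 0;
    while (got < arrlen && fscanf(sorted, "%d", &arr[got]) == 1)
        ++got;

    int status = pclose(sorted);           /* waits for the shell to finish */
    unlink(path);
    return (got == arrlen && status == 0) ? 0 : -1;
}
```

A caller would use it as `sort_via_popen(100, arraydata)`. Even in this shorter form the data is converted to text, written to disk, handed to a new process, and parsed back, which is the overhead the comments on the question warn about.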

KamilCuk
  • 2
    Note that in general, using two pipes to connect to another process like this requires care, lest it deadlock when the pipe buffers fill up. – Chris Dodd Aug 22 '23 at 22:22
1

Supplemental to @KamilCuk's answer, here's the qsort alternative to their sort_them_with_sort_command() function:

#include <stdlib.h>

int compare(const void *x, const void *y) {
    int xval = *(const int *)x;
    int yval = *(const int *)y;

    if (xval == yval) return 0;
    return xval < yval ? -1 : 1;
}

void sort_them_with_qsort(size_t arrlen, int arr[arrlen]) {
    qsort(arr, arrlen, sizeof(arr[0]), compare);
}

(qsort_r() confers no advantage over qsort() in this case.)

So to reiterate one of my comments on the question, why in the world would you use the sort command if you're already poised to use qsort()? Look how much simpler the qsort() alternative is!
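A side note on the comparator above: it compares with `==` and `<` rather than returning `xval - yval`, because that subtraction can overflow when the two values are far apart (for example `INT_MAX` and a negative number). The sketch below shows a compact equivalent of the same idea, under the assumption that plain ascending order is wanted; the names `compare_ints` and `sort_ints` are hypothetical and not part of either answer.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Three-way comparison without subtraction:
   (x > y) - (x < y) evaluates to -1, 0, or 1 and cannot overflow. */
static int compare_ints(const void *x, const void *y)
{
    int xval = *(const int *)x;
    int yval = *(const int *)y;
    return (xval > yval) - (xval < yval);
}

/* Hypothetical wrapper: sorts arr in ascending order, in place. */
void sort_ints(size_t arrlen, int arr[])
{
    assert(arr != NULL || arrlen == 0);
    qsort(arr, arrlen, sizeof arr[0], compare_ints);
}
```

Calling `sort_ints(5, a)` on an array holding `INT_MAX`, `-1`, `INT_MIN`, `0`, `1` leaves it ordered from `INT_MIN` up to `INT_MAX`, a case where a subtraction-based comparator would have undefined behavior.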

John Bollinger