
I want to read part of a very large compressed file (119.2 GiB when decompressed) with this piece of code.

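FILE *trace_file;
char gunzip_command[1000];
// argv[i]: file path; snprintf cannot write past the end of the buffer
snprintf(gunzip_command, sizeof gunzip_command, "gunzip -c %s", argv[i]);
trace_file = popen(gunzip_command, "r");
if (trace_file == NULL) { perror("popen"); exit(EXIT_FAILURE); }
// read one record from the decompressed stream
fread(&current_cloudsuite_instr, instr_size, 1, trace_file);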

Does popen in C load the whole output of the command into memory? If it does not, does popen save the whole output of the command in a temporary file on disk? As you can see, the decompressed output is far too large: neither memory nor disk can hold it. I only know that popen creates a pipe.

$ xz -l 649.fotonik3d_s-1B.champsimtrace.xz  
Strms  Blocks   Compressed Uncompressed  Ratio  Check   Filename
    1       1     24.1 MiB    119.2 GiB  0.000  CRC64   649.fotonik3d_s-1B.champsimtrace.xz
Tokubara
  • `popen()` usually won't use a physical file; it presents the output of the executable as a buffered stream in the caller's address space. Buffered means that the whole output is not read into memory at once. – πάντα ῥεῖ Mar 22 '22 at 09:26
  • When you use `popen(..., "r")` the `popen` function creates a *pipe* and makes the program's standard output write to the pipe. You then read the output of the command from the pipe. Pipes are *buffered*: if the buffer is full, all write operations to the pipe will *block*, which means the running program will pause. In short: only a very small portion of the uncompressed file will be in memory. – Some programmer dude Mar 22 '22 at 09:29

1 Answer


I only know that popen creates a pipe.

I know of two ways pipes have been implemented:

On MS-DOS, the whole output was written to disk; reading was then done by reading the file.

It might be that there are still (less-known) modern operating systems that work this way.

However, in most cases, a certain amount of memory is reserved for the pipe.

The xz command can write data until that amount of memory is full. Once it is full, xz blocks (its write call does not return) until space becomes available, which happens when the program that called popen() reads data.

If the program that called popen() reads data from the pipe, data is removed from the memory so xz can write more data.
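The bounded buffer described above can be observed directly. The following is only a minimal sketch, assuming a POSIX system; `measure_pipe_capacity` is a hypothetical helper name, not part of any library. It creates a pipe, makes the write end non-blocking, and writes until the kernel refuses more data, which is the point where a normal (blocking) writer such as xz would be paused.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: returns how many bytes fit into a fresh pipe
 * before a writer would have to wait, or -1 on error. */
long measure_pipe_capacity(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* Non-blocking write end: a full pipe yields EAGAIN instead of
     * suspending this process. */
    int flags = fcntl(fds[1], F_GETFL);
    if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    char chunk[512] = {0};
    long total = 0;
    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;          /* data was accepted into the pipe */
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;            /* interrupted by a signal, retry */
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            break;               /* pipe is full: a blocking writer would wait here */
        total = -1;              /* unexpected error */
        break;
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

Calling `measure_pipe_capacity()` and printing the result shows the size of the buffer on the current system. The value is system-dependent; on Linux the documented default is 65,536 bytes, which is tiny compared with a 119.2 GiB stream.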

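When the program that called popen() closes the file handle (with pclose()), writing to the pipe has no more effect; an error is reported to xz (on POSIX systems, a SIGPIPE signal or an EPIPE error from the write call), which normally makes it terminate ...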
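To illustrate reading only the beginning of a huge stream and then closing it, here is a short sketch, again assuming a POSIX system. `read_command_prefix` is a hypothetical helper name, and the `yes` command used in the usage note is just a stand-in for a producer with unbounded output; it is not part of the original question.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper: runs `command` through popen() and copies at most
 * `max_bytes` of its standard output into `buf`.
 * Returns the number of bytes actually read (0 if popen() failed). */
size_t read_command_prefix(const char *command, void *buf, size_t max_bytes)
{
    FILE *stream = popen(command, "r");
    if (stream == NULL)
        return 0;

    /* fread() pulls data out of the pipe; the child can only run ahead
     * of us by the size of the pipe buffer. */
    size_t got = fread(buf, 1, max_bytes, stream);

    /* pclose() closes our end of the pipe and waits for the child.
     * A child that keeps writing then gets SIGPIPE/EPIPE and exits. */
    pclose(stream);
    return got;
}
```

For example, `read_command_prefix("yes", buf, sizeof buf)` with a 4096-byte `buf` returns after 4096 bytes even though `yes` would otherwise print forever. The same pattern applies to a `gunzip -c` or `xz -dc` command line: only the bytes that are read, plus at most one pipe buffer, ever exist in memory.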

Martin Rosenau