
I'm trying to understand how to think about memory allocation from a practical perspective. I think I understand how to implement malloc and realloc in a basic sense, but I don't understand how to do it efficiently.

The examples in my books and online references all show realloc being called inside a loop, for instance to grow an array one element at a time. I don't know enough to be sure, but that approach does not look very efficient.

For example, in the test code below, a JSON string is being built to pass the file name, size, and date last modified to another application. It's not known in advance how many files are in the directory or the size of the data elements. I realize this is simple, but I have a similar issue in building a JSON string of SQLite result rows when the number of rows and their sizes are not known in advance. This file information code example is simpler to follow but the more important task is the SQLite data.

In the code below, an opening piece of JSON is written to the character array json, then each piece of file information is written in the while loop, after which the JSON string is closed. All is written using snprintf.

Right now, json is arbitrarily set to length 1000, and the code does produce a properly formed JSON string of this data, so long as it fits.

My question is, should realloc() really be called in each iteration of the while loop to extend json once the size of the next piece of data is known? Or is realloc() relatively expensive, so that it would be better to allocate a larger block of memory at the start and extend it by another large block only when the remaining space in json drops below some minimum, or when snprintf returns a value indicating that the write was truncated, so that there are fewer calls to realloc()?

How is this done in "real life" rather than examples in books for illustration?

Thank you.

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <dirent.h>
#include <fcntl.h>
#include <time.h>
#include <errno.h>
#include <string.h> // strrchr, strcmp, strlen, strerror

void read_dir_1 ( char *, char * );

int main( void )
  {
    read_dir_1( "c", "./SQLite3/" );
    return 0;
  }

void read_dir_1 ( char *ext, char *path )
{
  DIR *dp;
  struct dirent *ep;
  struct stat info;
  int rc, q, i, l;
  char *t;
  char fl_name[ 400 ]; 
  char json[ 1000 ]; // Temporary value.

  dp = opendir( path );

  if ( dp != NULL )
    {
      i = 0;
      q = 0;   
      l = sizeof( json ) - 1;
      // Open the JSON string.
      q = snprintf( json, l, "%s", "{\"c\":\"A\",\"d\":[" );
      l -= q;
      i += q;

      while ( ( ep = readdir( dp ) ) != NULL )
        {
          rc = snprintf( fl_name,
                         sizeof( fl_name ) - 1,
                         "%s%s",
                         path,
                         ep->d_name
                        );

          if ( ( rc = stat( fl_name, &info ) ) != 0 )
             {
               printf( "rc : %d\n", rc );
               printf( "errno : %d, strerror : %s\n", errno, strerror( errno ) );
               continue;
             }

          if ( ( info.st_mode & S_IFMT ) != S_IFREG ) 
            continue;

          t = strrchr( ep->d_name, '.' ); 
          if ( !t || strcmp( ++t, ext ) != 0 ) continue;

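          // "%.*s" expects an int precision; st_size and st_mtime are not ints,
          // so they are cast to long long and printed with "%lld".
          q = snprintf( json + i,
                        l,
                        "%s%.*s%s%lld%s%lld%s",   
                        "{\"n\":\"",
                        (int)( strlen( ep->d_name ) - strlen( ext ) - 1 ), ep->d_name,
                        "\",\"s\":",
                        (long long)( info.st_size / 1024 ),
                        ",\"d\":",
                        (long long) info.st_mtime,
                        "}," );                   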

          if ( q < 0  || q >= l )
            {
              // Ran out of room in json to hold the file info.
              // snprintf returns the length it needed, so q >= l means truncation.
              printf( "Ran out of memory to store file info.\n" );
              closedir( dp ); // Release the directory stream before returning.
              return;
            }
          i += q;
          l -= q;
        } // next while loop

      // Close the JSON string. Overwrite the comma after the last object in the array "d".
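      // Step back over the trailing comma only if an object was actually written;
      // otherwise the opening '[' of an empty array would be overwritten.
      if ( json[ i - 1 ] == ',' )
        {
          i--;
          l++;
        }
      q = snprintf( json + i,
                    l,
                    "%s",
                    "]}" );                   
    
      if ( q < 0  || q >= l )
        {
          // Ran out of room in json to hold the closing brackets.
          printf( "Ran out of memory to store file info.\n" );
          closedir( dp ); // Release the directory stream before returning.
          return;
        }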

      printf( "JSON : %s", json );
      closedir (dp);
    }
  else
    {
      perror ("Couldn't open the directory");
    }

  //?????????? Better free all allocated memory before returning.
  return;
}
Gary
  • Memory allocation is relatively costly. An efficient approach is to allocate a reasonably sized block, keep track of how much is used, and only `realloc()` when you run out of space, growing to, say, 2X the current size each time, and repeat until done. You can then call `realloc()` a final time to shrink the allocation to fit if needed. You want to avoid calling `realloc()` per-iteration. (In nested loops you may see `realloc()` called per-iteration in the outer loop while being used as described above in the inner loop.) Allocating 1000 bytes (or 1K) is fine for general use. – David C. Rankin Nov 30 '20 at 06:10
  • Adding onto @DavidC.Rankin's comment, you can have a look at "jemalloc" (http://jemalloc.net/). – Rahul Bharadwaj Nov 30 '20 at 06:16
  • You can also use `int needed = snprintf (NULL, 0, "format string", vars ...);` to determine the number of characters *needed* for any resulting `"format string"` you need to create. That allows you to size your allocation (+1 byte) exactly to the length of string needed. If I were reading and composing json records that could be, say, a max of 4K in length, then I would use (and reuse) an 8K buffer -- which you can allocate if there is the outside chance you need to `realloc()`, otherwise just using plain old stack-storage for an 8K buffer is fine. – David C. Rankin Nov 30 '20 at 06:28
  • @DavidC.Rankin Thank you. That is very helpful. I appreciate it. Moving from the book explanations to proper practice/implementation when not really working in the field is a bit difficult. – Gary Nov 30 '20 at 06:29
  • Amen, I feel your pain. There are a number of good examples you can probably locate searching `"[c] dynamic memory allocation"`. (there are some bad examples too) Just make sure what you are using as a guide roughly follows what is outlined above. – David C. Rankin Nov 30 '20 at 06:32
  • @DavidC.Rankin Thanks for the additional information. I wondered about reusing blocks of memory, in terms of consuming an amount of RAM when the application is first opened and having the functions use/share it, rather than each function allocating and freeing memory repeatedly. Especially for processes that I know are going to be invoked many, many times per session, such as getting data from SQLite, converting it to JSON, and writing it to stdout. – Gary Nov 30 '20 at 06:41
  • When you have a block of memory, it doesn't matter whether it is plain-old stack memory or memory you have allocated. You can use and reuse the storage as many times as you like, provided you are not overwriting data you actually need to keep for later use. But bytes-are-just-bytes, you can use them as needed. If I can use the same buffer throughout rather than creating many new ones, I prefer that. Though know that you are free to create as much or as little as you need. Using whatever storage you use -- correctly is far more important than what storage you use... – David C. Rankin Nov 30 '20 at 07:07
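
A minimal sketch of the approach described in the comments above: measure each piece with a size-zero `vsnprintf` call, and only `realloc()` (doubling the capacity) when the piece does not fit. The type `struct strbuf` and the functions `strbuf_init`, `strbuf_appendf`, and `strbuf_free` are hypothetical names made up for this illustration, not part of any library, and the sketch assumes the buffer never grows large enough for the doubling to overflow `size_t`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable string buffer; all names here are illustrative. */
struct strbuf {
    char  *data;  /* heap block, kept NUL-terminated */
    size_t len;   /* bytes used, not counting the terminator */
    size_t cap;   /* bytes allocated */
};

/* Allocate the initial block. Returns 0 on success, -1 on failure. */
int strbuf_init(struct strbuf *b, size_t initial_cap)
{
    if (initial_cap == 0)
        initial_cap = 1;
    b->data = malloc(initial_cap);
    if (b->data == NULL)
        return -1;
    b->data[0] = '\0';
    b->len = 0;
    b->cap = initial_cap;
    return 0;
}

/* Append formatted text, growing the block only when it does not fit. */
int strbuf_appendf(struct strbuf *b, const char *fmt, ...)
{
    va_list ap;
    int needed;

    assert(b != NULL && b->data != NULL);

    /* First pass: write nothing, just learn how many characters are needed. */
    va_start(ap, fmt);
    needed = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (needed < 0)
        return -1;

    if (b->len + (size_t)needed + 1 > b->cap) {
        /* Double until the new text plus terminator fits (assumes no overflow). */
        size_t new_cap = b->cap;
        while (new_cap < b->len + (size_t)needed + 1)
            new_cap *= 2;
        char *p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1;  /* the old block is untouched and still owned by b */
        b->data = p;
        b->cap = new_cap;
    }

    /* Second pass: the text is now guaranteed to fit. */
    va_start(ap, fmt);
    vsnprintf(b->data + b->len, b->cap - b->len, fmt, ap);
    va_end(ap);
    b->len += (size_t)needed;
    return 0;
}

/* Release the block and reset the bookkeeping. */
void strbuf_free(struct strbuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```

In the directory loop from the question, each `snprintf( json + i, l, ... )` call would become one `strbuf_appendf( &buf, ... )` call, with `strbuf_free` on every exit path. Because the capacity doubles, the number of `realloc()` calls grows roughly with the logarithm of the final size rather than with the number of files or rows.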
