
In debugging a memory leak on a large project, I found that the source of the leak seemed to be some flex/bison-generated code. I was able to recreate the leak with the following minimal example consisting of two files, sand.l and sand.y:

in sand.l:

%{
#include <stdlib.h>
#include "sand.tab.h"
%}

%%
[0-9]+ { return INT; }
. ;
%%

in sand.y:

%{
#include <stdio.h>
#include <stdlib.h>

int yylex();
int yyparse();
extern FILE* yyin;  /* defined in lex.yy.c */

void yyerror(const char* s);
%}

%token INT

%%
program:
       program INT { puts("Found integer"); }
       | 
       ;
%%

int main(int argc, char* argv[]) {
    yyin = stdin;
    do {
        yyparse();
    } while (!feof(yyin));
    return 0;
}

void yyerror(const char* s) {
    puts(s);
}

The code was compiled with

$ bison -d sand.y
$ flex sand.l
$ gcc -g lex.yy.c sand.tab.c -o main -lfl

Running the program with valgrind gave the following error:

8 bytes in 1 blocks are still reachable in loss record 1 of 3
at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
by 0x40260F: yyalloc (lex.yy.c:1723)
by 0x402126: yyensure_buffer_stack (lex.yy.c:1423)
by 0x400B89: yylex (lex.yy.c:669)
by 0x402975: yyparse (sand.tab.c:1114)
by 0x402EC4: main (sand.y:24)

64 bytes in 1 blocks are still reachable in loss record 2 of 3
at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
by 0x40260F: yyalloc (lex.yy.c:1723)
by 0x401CBF: yy_create_buffer (lex.yy.c:1258)
by 0x400BB3: yylex (lex.yy.c:671)
by 0x402975: yyparse (sand.tab.c:1114)
by 0x402EC4: main (sand.y:24)

16,386 bytes in 1 blocks are still reachable in loss record 3 of 3
at 0x4C2AC3D: malloc (vg_replace_malloc.c:299)
by 0x40260F: yyalloc (lex.yy.c:1723)
by 0x401CF6: yy_create_buffer (lex.yy.c:1267)
by 0x400BB3: yylex (lex.yy.c:671)
by 0x402975: yyparse (sand.tab.c:1114)
by 0x402EC4: main (sand.y:24)

It seems that bison and/or flex is holding on to a substantial amount of memory. Is there any way to force them to free it?

iafisher
  • A possibly related question http://stackoverflow.com/questions/22534993/where-to-free-memory-in-bison-flex – DYZ Apr 28 '17 at 03:36

1 Answer


The default flex skeleton allocates an input buffer and a small buffer stack, and it never frees either of them on its own. You can free the input buffer manually with yy_delete_buffer(YY_CURRENT_BUFFER); but that call leaves the buffer stack itself allocated. If you have a sufficiently non-ancient version of flex [see Note 1], you can call yylex_destroy() instead, which releases the input buffer and the last vestiges of the buffer stack. (If you don't, it's only 8 bytes in your application, so it's not a disaster.)
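
As a minimal sketch of that cleanup, main() in the epilogue of the question's sand.y could call yylex_destroy() once parsing is finished. This assumes flex 2.5.9 or later and the default (non-reentrant) scanner. The explicit prototype is an addition of mine, there only so that the parser file compiles without an implicit-declaration diagnostic; yyin and yyparse() are declared in the prologue of sand.y as in the question.

```c
/* Replacement for main() in the epilogue of sand.y (a sketch). */

/* Defined in lex.yy.c by flex 2.5.9 and later. */
int yylex_destroy(void);

int main(void) {
    yyin = stdin;
    do {
        yyparse();
    } while (!feof(yyin));

    /* Frees the current input buffer and the buffer stack. */
    yylex_destroy();
    return 0;
}
```

With this change, the three "still reachable" blocks that valgrind attributes to yyalloc should no longer be reported.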

If you want to write a clean application, you should generate a reentrant scanner, which puts all persistent data into a scanner context object. Your code must allocate and free this object, and freeing it will free all memory allocations. (You might also want to generate a pure parser, which works roughly the same way.)

However, the reentrant scanner has a very different API, so you will need to get your parser to pass through the scanner context object. If you use a reentrant (pure) parser as well, you'll need to modify your scanner actions because with the reentrant parser, yylval is a YYSTYPE* instead of YYSTYPE.
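
The following is a rough sketch of what the question's two files might look like with a reentrant scanner and a pure parser. It is not a drop-in recipe: it assumes Bison 3.0 or later (for `%define api.pure full`) and a flex version that supports `%option reentrant bison-bridge`. The parameter name `scanner` is my own choice, and it is declared as `void *` because flex defines yyscan_t as `void *`; this avoids having to make the yyscan_t typedef visible in the parser.

sand.l:

```lex
%option reentrant bison-bridge noyywrap

%{
#include "sand.tab.h"   /* YYSTYPE and the INT token */
%}

%%
[0-9]+  { return INT; }
.|\n    ;
%%
```

sand.y:

```yacc
%define api.pure full
%lex-param   { void *scanner }
%parse-param { void *scanner }

%code {
#include <stdio.h>

/* Reentrant scanner interface generated by flex (yyscan_t is void *). */
int yylex(YYSTYPE *yylval, void *scanner);
int yylex_init(void **scanner);
int yylex_destroy(void *scanner);

void yyerror(void *scanner, const char *s);
}

%token INT

%%
program:
       program INT { puts("Found integer"); }
       |
       ;
%%

int main(void) {
    void *scanner;

    /* Allocate the scanner context; all scanner state lives in it. */
    if (yylex_init(&scanner) != 0) {
        perror("yylex_init");
        return 1;
    }

    /* The scanner reads from stdin unless another input is set. */
    int status = yyparse(scanner);

    /* Frees the context together with its buffers and buffer stack. */
    yylex_destroy(scanner);
    return status;
}

void yyerror(void *scanner, const char *s) {
    (void)scanner;  /* unused in this sketch */
    puts(s);
}
```

Because of `%option noyywrap`, this version should not need -lfl at link time. If a scanner action needs to set a semantic value, it would write through the pointer (for example `*yylval = atoi(yytext);`, given the default int YYSTYPE), which is the change to scanner actions mentioned above.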


Notes:

  1. In fact, you can delete the buffer stack using yylex_destroy(), as recently pointed out in a comment, as long as your flex version is at least 2.5.9. Since that version was released almost two decades ago, you'd think this note would be unnecessary, but unfortunately v2.5.4 continues to be the default MinGW installation and it had a surprisingly long life on various Linux distros as well (although I think these days you're not so likely to find it installed).
rici
  • The 8 bytes can be freed. One must invoke `yylex_destroy();` instead of `yy_delete_buffer(YY_CURRENT_BUFFER);` – Aleksander Bobiński Oct 29 '20 at 10:56
  • @AleksanderBobiński: yes, you're correct and I updated the answer. When I wrote it, there were still lots of v2.5.4 installs which lack this interface. – rici Oct 29 '20 at 21:30