-1

I'm using Bison to generate a parser and the lexer is written in Ragel. When I run my code with Valgrind, it gives me this output:

200,014 bytes in 2 blocks are definitely lost in loss record 139,468 of 139,600
  malloc (vg_replace_malloc.c:393)
 *yyparse (parser.c:1768)
 *GraphQL_CParser_Parser_c_parse (graphql_c_parser_ext.c:8)
  vm_call_cfunc_with_frame (vm_insnhelper.c:3268)
  vm_sendish (vm_insnhelper.c:5080)
  vm_exec_core (insns.def:820)
  rb_vm_exec (vm.c:2374)
  ...

(There's more to it -- I'm calling into the parser from Ruby. The full build output is available here, for now: https://github.com/rmosolgo/graphql-ruby/actions/runs/4556195698/jobs/8036241668?pr=4410)

It points to my (generated) yyparse function, at code like this:

# else /* defined YYSTACK_RELOCATE */
      /* Extend the stack our own way.  */
      if (YYMAXDEPTH <= yystacksize)
        YYNOMEM;
      yystacksize *= 2;
      if (YYMAXDEPTH < yystacksize)
        yystacksize = YYMAXDEPTH;

      {
        yy_state_t *yyss1 = yyss;
        union yyalloc *yyptr =
          YY_CAST (union yyalloc *,
                   YYSTACK_ALLOC (YY_CAST (YYSIZE_T, YYSTACK_BYTES (yystacksize))));
        if (! yyptr)
          YYNOMEM;
        YYSTACK_RELOCATE (yyss_alloc, yyss);
        YYSTACK_RELOCATE (yyvs_alloc, yyvs);
#  undef YYSTACK_RELOCATE
        if (yyss1 != yyssa)
          YYSTACK_FREE (yyss1);
      }

How can I address this error message from Valgrind? I'm not sure if this is a legitimate memory leak and I should find a way to free that allocated memory, or whether it's a false positive and I should suppress this message.

Is there anything I can do to debug further?

The entire generated C parser here, in case it helps: https://github.com/rmosolgo/graphql-ruby/blob/a048682e6d8468d16947ee1c80946dd23c5d91f9/graphql-c_parser/ext/graphql_c_parser_ext/parser.c

And the Bison definition is available here: https://github.com/rmosolgo/graphql-ruby/blob/a048682e6d8468d16947ee1c80946dd23c5d91f9/graphql-c_parser/ext/graphql_c_parser_ext/parser.y

Thanks for taking a look!

tadman
  • 208,517
  • 23
  • 234
  • 262
rmosolgo
  • 1,854
  • 1
  • 18
  • 23
  • It's unlikely to be a false positive. Gotta love all those cruddy obfuscating macros. YYSTACK_ALLOC is just malloc. From what I see YYSTACK_RELOCATE does yyss = &yyptr->yyss_alloc amongst other things. yyptr itself doesn't seem to be stored anywhere. Does this post help https://stackoverflow.com/questions/43671389/minimal-bison-flex-generated-code-has-memory-leak ? – Paul Floyd Apr 12 '23 at 16:37
  • Hey, thanks for taking a look! I did see that question. The catch is, it's using flex (instead of ragel) and flex provides the yydestroy function mentioned there. (To confirm, I tried adding a yydestroy call, it crashed with "dyld[31254]: missing symbol called".) It seems like the same _kind_ of bug, but I need a different approach. Maybe I can take a look at the yydestroy implementation for inspiration. – rmosolgo Apr 12 '23 at 18:17
  • Could you add a minimal code required to reproduce the problem? – Piotr Siupa Apr 21 '23 at 23:24

1 Answers1

0

I incline to believe that the behaviour you observe is a Bison bug. Below is a short summary of why I think so and what happens with memory allocations in this generated yyparse() function:

  1. The parser uses two stacks, state stack yyss and value stack yyvs (defined here). The initial space for them is reserved as local array with depth YYINITDEPTH. If this depth isn't enough, additional allocations are performed from the heap (malloc).
  2. When allocating larger memory, it is allocated as a single block that is used to hold both stacks together. Data from the previous memory block is copied to the new one.
  3. The allocation us done by YYSTACK_ALLOC on line 1775. If previous allocation took place, that memory is definitely freed by YYSTACK_FREE on line 1782.
  4. The last memory block that is used is supposed to be freed on line 3224. My guess that there is some scenario when the function returns without passing through this line.

It obviously requires some debugging to understand what's going on here. I can suggest some general points on how I would approach it:

  1. First, check if the memory leak is deterministically reproducible. Does it happens every time when you run this pipeline?

  2. Try to simplify the reproduction setup. Work towards making it shorter, simpler and run on your local development machine.

    • Understand what is the command line command being run by the build system to invoke Valgrind. What other files does it need? Try running the same command on your dev machine. Check if the problem still reproduces.

    • Then, work towards shortening the run cycle. As far as I understand, it runs a lot of tests under Valgrind, taking more than 20 minutes. Try bisecting the test set to understand if the problem is related to a specific test or just depends on the number of tests.

  3. If you succeed to minify the reproduction setup, you can run it under debugger or add some tracing.

The tracing I would add is some naive printing to log file like this along with counting allocation/free mismatch conditions:

// my_alloc.h
#include <stdio.h>
#include <stdlib.h>

extern FILE* g_my_trace_file;
extern size_t g_alloc_count;


#define MY_TRACE(...) \
    do { \
        if (!g_my_trace_file) \
            g_my_trace_file = fopen("my_trace.txt", "a"); \
         \
        fprintf(g_my_trace_file, __VA_ARGS__); \
        fflush(g_my_trace_file); \
    } while(0)
        
inline void *my_alloc( size_t size ) { \
    p = malloc(size); \
    MY_TRACE("yyparse: Allocated block of %z bytes at addr %p\n", size, p); \
    ++g_alloc_count; \
    return p; \
}

inline void free( void *p ) { \
    MY_TRACE("yyparse: Freed block at addr %p\n", p); \
    --g_alloc_count;
    free(p); \
}   

And add this in some .c file:

FILE* g_my_trace_file = NULL;
size_t g_alloc_count = 0;

In parser.c replace YYSTACK_ALLOC/YYSTACK_FREE on lines 628, 629 with my_alloc/my_free and include my_alloc.h.

In graphql_c_parser_ext.c add the following lines in GraphQL_CParser_Parser_c_parse:

VALUE GraphQL_CParser_Parser_c_parse(VALUE self) {
  g_alloc_count = 0;
  VALUE filename = rb_ivar_get(self, rb_intern("@filename"));
  MY_TRACE("Calling yyparse for file %s\n", filename);  // <-- Here I'm not sure how to obtain string from VALUE to print it... Whats the type of VALUE?
  yyparse(self, filename);
  MY_TRACE("Finished yyparse for file %s, leak count: %z\n", filename, g_alloc_count);
  return Qnil;
}

From this log you should be able to pinpoint parsing of which file causes the allocation leak.

Keep us updated on the findings for further assistance :-)

vvv444
  • 2,764
  • 1
  • 14
  • 25
  • Thanks for taking a look -- I came to a different conclusion. What do you think of this: Since `yyss` points to the beginning of the double-purpose stack, wouldn't `free(yyss)` free the _whole thing_ (including the later part of that stack, pointed to by `yyvs`)? My guess is that `free(yyss)` also frees `yyvs`. – rmosolgo Apr 26 '23 at 01:21
  • @rmosolgo Glad you found my answer helpful. Yes, it allocates both stacks together in one block and frees them both together. That's by design. So I don't see the problem with that. – vvv444 Apr 26 '23 at 06:23