
I have a very strange problem. I have a simple Python parser for C++, built on the clang.cindex bindings, that looks for certain macro instantiations in my C++ code.

I have two files, test.cpp and macro.hpp, and I'm parsing test.cpp.

//macro.hpp
#define REGISTER_MODULE_INITIALIZER( Type ) \
namespace module_registrator { \
void Type##__Register() \
{ \
} \
}

//test.cpp
#include <iostream>
#include <macro.hpp>
 
namespace something
{
class MyClass
{
};
 
REGISTER_MODULE_INITIALIZER( MyClass );
}

If I remove #include <iostream>, put it after #include <macro.hpp>, or define the macro directly in test.cpp, everything is fine: in the AST I see the function MyClass__Register(). But with iostream included first, the parser sees just some variable named MyClass, as if it couldn't find the macro's header (the include directive itself is present in the AST).

The parser looks like this. Adding the stdc++ include path to parser_args changed nothing.

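import clang.cindex

# [line, column, file name] of every REGISTER_MODULE_INITIALIZER instantiation seen so far
macro_locations = []

# get_namespace() and clang_lib_path are defined elsewhere in my script
def traverse(node):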
    if node.kind == clang.cindex.CursorKind.MACRO_INSTANTIATION:
        if node.displayname == 'REGISTER_MODULE_INITIALIZER':
            macro_locations.append([node.extent.start.line, node.extent.start.column, node.extent.start.file.name])
    elif node.kind == clang.cindex.CursorKind.FUNCTION_DECL:
        if [node.extent.start.line, node.extent.start.column, node.extent.start.file.name] in macro_locations \
            and node.displayname.endswith('__Register()'):
            nsp = get_namespace(node.lexical_parent)
            nsp.objects.append(node.displayname)
    print(str(node.kind) + ' ' + node.displayname)
    for child in node.get_children():
        traverse(child)
 
parser_args = ['-std=c++17']
 
clang.cindex.Config.set_library_path(clang_lib_path)
index = clang.cindex.Index.create()
 
tu = index.parse('test.cpp',
                 options=clang.cindex.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD |
                 clang.cindex.TranslationUnit.PARSE_INCOMPLETE,
                 args=parser_args)
traverse(tu.cursor)
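Since the question mentions that adding the stdc++ include path changed nothing, one way to compare against what the compiler driver itself searches is to turn its verbose output into `-isystem` arguments. The following is only a sketch: `system_include_args` and `query_driver` are hypothetical helper names, the sample text uses made-up placeholder directories, and it assumes a `clang++` driver on PATH that prints its search list in the usual "search starts here" form.

```python
import subprocess

# Illustrative text in the shape printed by `clang++ -E -x c++ - -v` on stderr.
# The directories are made-up placeholders, not paths from the question.
SAMPLE_VERBOSE_OUTPUT = """\
#include "..." search starts here:
#include <...> search starts here:
 /opt/example/include/c++/8
 /opt/example/lib/clang/7.0.1/include
 /usr/include
End of search list.
"""


def system_include_args(verbose_output):
    """Convert the <...> search list of the driver's verbose output to -isystem args."""
    args = []
    in_list = False
    for line in verbose_output.splitlines():
        if line.startswith("#include <...> search starts here:"):
            in_list = True
        elif line.startswith("End of search list."):
            break
        elif in_list:
            path = line.strip()
            # Some platforms append a "(framework directory)" marker; skip those entries.
            if path and not path.endswith("(framework directory)"):
                args.extend(["-isystem", path])
    return args


def query_driver(compiler="clang++"):
    """Run the driver on empty input and return its verbose (stderr) output."""
    result = subprocess.run(
        [compiler, "-E", "-x", "c++", "-", "-v"],
        input="",
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return result.stderr


sample_args = system_include_args(SAMPLE_VERBOSE_OUTPUT)
```

With a real driver this would be used as `parser_args = ['-std=c++17'] + system_include_args(query_driver())`. Whether missing search directories are actually the cause here is an open question; the list simply makes it easy to see if the compiler's own header directory is among the paths passed to the parser.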

Environment: Clang 7.0.1, CentOS, Python 3.7.

UPD: If I include several std headers (I've tried memory and string), the first of them is parsed, and every header after it, std or my own, is ignored. If I include several local headers only, they are all parsed normally.
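A pattern where one header is processed and everything after it is dropped may point to a fatal error raised while parsing that header, so printing the translation unit's diagnostics is a reasonable first check. The sketch below assumes the `tu.diagnostics` list of clang.cindex, whose entries expose `severity`, `spelling` and `location`. `format_diagnostics` is a hypothetical helper, and the stand-in object at the end only mimics that shape so the snippet runs without libclang; its message text is invented for illustration.

```python
from types import SimpleNamespace

# Numeric values of libclang's CXDiagnosticSeverity enum.
SEVERITY_NAMES = {0: "ignored", 1: "note", 2: "warning", 3: "error", 4: "fatal"}


def format_diagnostics(tu):
    """Return one 'severity: file:line:column: message' string per diagnostic."""
    lines = []
    for diag in tu.diagnostics:
        loc = diag.location
        fname = loc.file.name if loc.file else "<unknown>"
        severity = SEVERITY_NAMES.get(diag.severity, str(diag.severity))
        lines.append(f"{severity}: {fname}:{loc.line}:{loc.column}: {diag.spelling}")
    return lines


def has_fatal(tu):
    """True if any diagnostic has fatal severity."""
    return any(diag.severity >= 4 for diag in tu.diagnostics)


# Stand-in for a parsed translation unit (invented data, not real parser output).
fake_tu = SimpleNamespace(diagnostics=[
    SimpleNamespace(
        severity=4,
        spelling="'stddef.h' file not found",
        location=SimpleNamespace(file=SimpleNamespace(name="test.cpp"), line=1, column=10),
    )
])
demo_lines = format_diagnostics(fake_tu)
```

In the parser from the question this would be called as `print('\n'.join(format_diagnostics(tu)))` right after `index.parse(...)`, before walking the AST.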

Crazy Sage
  • No idea if that's at all related but your code is UB since you are using an illegal identifier: double underscore, as well as underscore followed by a capital letter, are reserved for the implementation. – Konrad Rudolph Nov 02 '22 at 08:33
  • @KonradRudolph nope, changed it to _register() with no effect. – Crazy Sage Nov 02 '22 at 08:35
  • @KonradRudolph Not using identifiers with starting __ is a convention (not a language feature) so how is using __ resulting in UB? (In this case the macro constructs identifiers with __ in the middle). OP: I have no idea why it is doing this yet. – Pepijn Kramer Nov 02 '22 at 08:53
  • @PepijnKramer Nope, it isn't merely a convention. It's a hard requirement by the standard ([lex.name]/3.1). – Konrad Rudolph Nov 02 '22 at 08:58
  • Not to be mean, Konrad, but look at the title of the section: `5 Lexical conventions`. They will not be enforced by the compiler, to allow library builders to use `__` internally. That of course still means that `__` should not be used in normal C++ code. But it will not lead to UB if you do. – Pepijn Kramer Nov 02 '22 at 09:10
  • @PepijnKramer This simply isn't subject to interpretation. The title is just that: a title (though admittedly misleading). The normative text is clear: “shall not be used”, and “no diagnostic required”. That’s what UB means. – Konrad Rudolph Nov 02 '22 at 09:13
  • OK, I see what you mean now. After reading https://en.cppreference.com/w/cpp/language/ndr, I see I misinterpreted "no diagnostic required" (I read it as "no compiler warnings need to be given"), so indeed a program with an identifier containing `__` or starting with `_[A-Z]` is ill-formed. Apologies to you and OP for messing up the comments. (At least I improved my standardese a bit today.) – Pepijn Kramer Nov 02 '22 at 09:19
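The rule debated in the comments can be stated as a small check. This is a sketch of the [lex.name]/3.1 wording only (an identifier that contains a double underscore anywhere, or begins with an underscore followed by an uppercase letter); `is_reserved_identifier` is a hypothetical helper name, and the additional rule about leading underscores in the global namespace is deliberately left out.

```python
import re


def is_reserved_identifier(name):
    """True if `name` contains '__' or starts with '_' followed by an uppercase letter."""
    return "__" in name or re.match(r"_[A-Z]", name) is not None


# The identifier generated by the macro in the question, and the renamed variant.
original_is_reserved = is_reserved_identifier("MyClass__Register")
renamed_is_reserved = is_reserved_identifier("MyClass_register")
```

Under this check the name produced by `Type##__Register` is reserved because of the double underscore in the middle, while the renamed `MyClass_register` is not.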

0 Answers