
I added a matcher for call expressions with Finder->addMatcher(callExpr().bind("call"), this);. In my check function, I want to get the name of the parent function of the matched CallExpr node. The body of my check function is below. The Parent variable is always nullptr. How can I fix this? Thanks.

const auto *MatchedCallExpr = Result.Nodes.getNodeAs<CallExpr>("call");
ASTContext::DynTypedNodeList NodeList = Result.Context->getParents(*MatchedCallExpr);

ast_type_traits::DynTypedNode ParentNode = NodeList[0];
const FunctionDecl *Parent = ParentNode.get<FunctionDecl>();

string FunctionName {};
if (Parent == nullptr) {
  return;
}
FunctionName = Parent->getNameInfo().getAsString();

Here is an AST example:

-FunctionDecl 0x7802e60 <line:227:1, line:246:1> line:227:9 invalid case15 'int32_t (int *, int)'
  |-ParmVarDecl 0x7802cf0 <col:16, col:26> col:26 invalid lanxState 'int *'
  |-ParmVarDecl 0x7802d80 <col:37, col:47> col:47 invalid lanxLength 'int'
  `-CompoundStmt 0x7804458 <line:228:1, line:246:1>
    |-DeclStmt 0x7802ff0 <line:229:5, col:25>
    | `-VarDecl 0x7802f20 <col:5, col:24> col:12 used format_str 'std::string':'std::__cxx11::basic_string<char>' listinit destroyed
    |   `-CXXConstructExpr 0x7802fc8 <col:12, col:24> 'std::string':'std::__cxx11::basic_string<char>' 'void () noexcept(is_nothrow_default_constructible<allocator<char>>::value)' list
    |-DeclStmt 0x7803fb0 <line:235:5, line:238:58>
    | `-VarDecl 0x7803ce0 <line:235:5, col:14> col:14 used ret 'uint32_t':'unsigned int'
    |-CallExpr 0x78043c8 <line:244:5, col:40> 'void'
    | |-ImplicitCastExpr 0x78043b0 <col:5> 'void (*)(const char *, char *)' <FunctionToPointerDecay>
    | | `-DeclRefExpr 0x7804360 <col:5> 'void (const char *, char *)' lvalue Function 0x78029f0 'func' 'void (const char *, char *)'
Langewin
  • Show an example of code you are scanning (the smaller the better) that contains a function call, show the output of `clang -Xclang -ast-dump -fsyntax-only ` on that example, and in that dump, indicate which node you have (`MatchedCallExpr`) and which node you want to get (`Parent`). – Scott McPeak Jun 10 '22 at 10:01
  • This is one solution, but it may only cover one scenario. Do we need to know the AST structure around the CallExpr to get the name of its parent function? Is there a good way to get or implement a generic API for finding the parent function? – Langewin Jun 13 '22 at 02:15
  • I don't know what you mean by "parent". Do you mean the immediate AST node ancestor, or the name of the callee, or the name of the function whose body contains the call, or something else? That's why I suggest annotating an AST dump and explicitly showing which node you are trying to get. – Scott McPeak Jun 13 '22 at 08:13
  • "the name of the function whose body contains the call" is what I want. In my truncated example, my matcher reaches the CallExpr at line:244:5. How do I get from there to the enclosing FunctionDecl and its name? The AST dump is very long, so I cut it down to the key context. – Langewin Jun 15 '22 at 07:12

1 Answer


Obtaining the enclosing FunctionDecl

Given a matched CallExpr, how do we find the FunctionDecl whose body contains the expression?

If you have any Clang AST node, you can use clang::ASTContext::getParents() to get its immediate ancestor AST node. This returns a clang::DynTypedNodeList, which in general can contain any number of nodes, including zero. (I think getParents() can only return multiple parents when working with C++ templates, but I'm not sure.)

For example:

const auto *MatchedCallExpr = Result.Nodes.getNodeAs<CallExpr>("call");
clang::DynTypedNodeList NodeList = Result.Context->getParents(*MatchedCallExpr);
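
Because the returned list can be empty or hold several entries, code that reads only NodeList[0] assumes exactly one parent. The sketch below shows the defensive alternative of examining every entry. It uses a stand-in node type so that it compiles without Clang; Node, Kind, and firstParentOfKind are hypothetical names for illustration, not Clang API.

```cpp
#include <vector>

// Stand-in node kinds (hypothetical, not Clang's).
enum class Kind { Function, Compound, Call };

// Stand-in for a dynamically typed AST node (hypothetical).
struct Node {
  Kind K;
};

// Return the first entry of Parents whose kind is Wanted, or nullptr if
// the list is empty or holds no such entry.  Every entry is examined,
// not just Parents[0].
const Node *firstParentOfKind(const std::vector<const Node *> &Parents,
                              Kind Wanted) {
  for (const Node *P : Parents) {
    if (P != nullptr && P->K == Wanted)
      return P;
  }
  return nullptr;
}
```

With a parent list that holds only a Compound node, asking for Kind::Function yields nullptr, which mirrors the null result reported in the question.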

However, the FunctionDecl that contains the expression will not be the parent of the call expression. In the AST dump from the question, we see:

-FunctionDecl 0x7802e60 <line:227:1, line:246:1> line:227:9 invalid case15 'int32_t (int *, int)'
  |-...
  `-CompoundStmt 0x7804458 <line:228:1, line:246:1>
    |-...
    |-CallExpr 0x78043c8 <line:244:5, col:40> 'void'

The parent of this CallExpr is a CompoundStmt. The parent of the CompoundStmt is the FunctionDecl we are after. In general there can be arbitrarily many nodes in between. So, we need to call getParents() in a loop until we reach a FunctionDecl, for example:

clang::DynTypedNodeList NodeList = Result.Context->getParents(*MatchedCallExpr);
while (!NodeList.empty()) {
  // Get the first parent.
  clang::DynTypedNode ParentNode = NodeList[0];

  // You can dump the parent like this to inspect it.
  //ParentNode.dump(llvm::outs(), *(Result.Context));

  // Is the parent a FunctionDecl?
  if (const FunctionDecl *Parent = ParentNode.get<FunctionDecl>()) {
    llvm::outs() << "Found ancestor FunctionDecl: "
                 << (void const*)Parent << '\n';
    llvm::outs() << "FunctionDecl name: "
                 << Parent->getNameAsString() << '\n';
    return;
  }

  // It was not a FunctionDecl.  Keep going up.
  NodeList = Result.Context->getParents(ParentNode);
}

llvm::outs() << "Ran out of ancestors.\n";

The message "Ran out of ancestors." is printed when a call expression is not contained inside any FunctionDecl, for example when it appears at file scope:

int f();           // Some function to call.
int x = f();       // Global variable initializer.
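
The ancestor walk can be wrapped in a small helper, which is roughly the generic API asked about in the comments. The sketch below models the idea with a stand-in tree instead of the Clang classes so that it compiles on its own; Node, Kind, and findEnclosingFunction are hypothetical names, not Clang API. In a real check, following the Parent pointer would presumably be replaced by a call to getParents(), and the kind test by ParentNode.get<FunctionDecl>().

```cpp
#include <string>

// Stand-in for the few AST node kinds that matter here (hypothetical).
enum class Kind { TranslationUnit, Function, Var, Compound, Call };

// Stand-in for an AST node with a single parent link (hypothetical).
struct Node {
  Kind K;
  std::string Name;    // Only meaningful for Function and Var nodes.
  const Node *Parent;  // nullptr at the root.
};

// Walk up from N until a Function node is found.  Returns nullptr when
// the walk runs out of ancestors, e.g. for a call at file scope.
const Node *findEnclosingFunction(const Node &N) {
  for (const Node *P = N.Parent; P != nullptr; P = P->Parent) {
    if (P->K == Kind::Function)
      return P;
  }
  return nullptr;
}
```

Returning a pointer rather than printing lets the caller decide what to do in the file-scope case, where the result is nullptr.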

Complete clang-tidy example program

Below are the files required to build and run a complete clang-tidy program illustrating the method described above. It assumes that Clang+LLVM 14.0.0 is unpacked into $HOME/opt/clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04; if it is somewhere else, change CLANG_LLVM_INSTALL_DIR in the Makefile.

GetEnclFunc.h:

// GetEnclFunc.h
// clang-tidy check demonstrating getting the enclosing FunctionDecl.

#ifndef GET_ENCL_FUNC_H
#define GET_ENCL_FUNC_H

#include "clang-tidy/ClangTidyCheck.h"           // ClangTidyCheck
#include "clang/ASTMatchers/ASTMatchFinder.h"    // ast_matchers::MatchFinder

namespace clang {
namespace tidy {

class GetEnclFunc : public ClangTidyCheck {
public:
  GetEnclFunc(StringRef Name, ClangTidyContext *Context)
      : ClangTidyCheck(Name, Context) {}
  void registerMatchers(ast_matchers::MatchFinder *Finder) override;
  void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
};

} // namespace tidy
} // namespace clang

#endif // GET_ENCL_FUNC_H

GetEnclFunc.cc:

// GetEnclFunc.cc
// Code for GetEnclFunc.h.

#include "GetEnclFunc.h"                         // this module

#include "clang/AST/ASTContext.h"                // ASTContext
#include "clang/AST/ASTTypeTraits.h"             // clang::DynTypedNode
#include "clang/AST/ParentMapContext.h"          // clang::DynTypedNodeList

#include "clang-tidy/ClangTidyModule.h"          // ClangTidyModule
#include "clang-tidy/ClangTidyModuleRegistry.h"  // ClangTidyModuleRegistry

using namespace clang::ast_matchers;

namespace clang {
namespace tidy {

void GetEnclFunc::registerMatchers(ast_matchers::MatchFinder *Finder)
{
  // Match any function call expression.
  Finder->addMatcher(callExpr().bind("call"), this);
}

void GetEnclFunc::check(const MatchFinder::MatchResult &Result)
{
  // Get the node bound to "call" in the match expression.
  const auto *MatchedCallExpr = Result.Nodes.getNodeAs<CallExpr>("call");

  // Get its parent nodes.  The docs do not really explain why there can
  // be multiple parents, but I think it has to do with C++ templates.
  clang::DynTypedNodeList NodeList = Result.Context->getParents(*MatchedCallExpr);
  while (!NodeList.empty()) {
    // Get the first parent.
    clang::DynTypedNode ParentNode = NodeList[0];

    // You can dump the parent like this to inspect it.
    //ParentNode.dump(llvm::outs(), *(Result.Context));

    // Is the parent a FunctionDecl?
    if (const FunctionDecl *Parent = ParentNode.get<FunctionDecl>()) {
      llvm::outs() << "Found ancestor FunctionDecl: "
                   << (void const*)Parent << '\n';
      llvm::outs() << "FunctionDecl name: "
                   << Parent->getNameAsString() << '\n';
      return;
    }

    // It was not a FunctionDecl.  Keep going up.
    NodeList = Result.Context->getParents(ParentNode);
  }

  llvm::outs() << "Ran out of ancestors.\n";
}

class HelloModule : public ClangTidyModule {
public:
  void addCheckFactories(ClangTidyCheckFactories &CheckFactories) override {
    CheckFactories.registerCheck<GetEnclFunc>("GetEnclFunc");
  }
};

static ClangTidyModuleRegistry::Add<HelloModule> X("GetEnclFunc-module",
                                                   "Adds GetEnclFunc check.");

// This is defined in libclangTidyMain.a.  It does not appear to be
// declared in any header file, so I doubt this is really how it is
// meant to be used.
int clangTidyMain(int argc, const char **argv);

} // namespace tidy
} // namespace clang

int main(int argc, const char **argv)
{
  return clang::tidy::clangTidyMain(argc, argv);
}

// EOF

Makefile:

# clang-tidy-get-enclosing-func/Makefile

# Default target.
all:
.PHONY: all


# Eliminate all implicit rules.
.SUFFIXES:

# Delete a target when its recipe fails.
.DELETE_ON_ERROR:

# Do not remove "intermediate" targets.
.SECONDARY:


# ---- Paths ----
# Installation directory from a binary distribution.
# Has five subdirectories: bin include lib libexec share.
# Downloaded from: https://github.com/llvm/llvm-project/releases/download/llvmorg-14.0.0/clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04.tar.xz
CLANG_LLVM_INSTALL_DIR = $(HOME)/opt/clang+llvm-14.0.0-x86_64-linux-gnu-ubuntu-18.04

# Program to query the various LLVM configuration options.
LLVM_CONFIG = $(CLANG_LLVM_INSTALL_DIR)/bin/llvm-config


# ---- Compiler options ----
# C++ compiler.
CXX = g++

# Compiler options, including preprocessor options.
CXXFLAGS =

CXXFLAGS += -Wall

# Get llvm compilation flags.
CXXFLAGS += $(shell $(LLVM_CONFIG) --cxxflags)

# Linker options.
LDFLAGS =

# Needed libraries.  The order is important.  I do not know a principled
# way to obtain this list.  I did it by chasing down each missing symbol
# in a link error.
LDFLAGS += -lclangTidy
LDFLAGS += -lclangTidyMain
LDFLAGS += -lclangTidyPlugin
LDFLAGS += -lclangToolingCore
LDFLAGS += -lclangFormat
LDFLAGS += -lclangToolingInclusions
LDFLAGS += -lclangTidyAbseilModule
LDFLAGS += -lclangTidyAlteraModule
LDFLAGS += -lclangTidyAndroidModule
LDFLAGS += -lclangTidyBoostModule
LDFLAGS += -lclangTidyBugproneModule
LDFLAGS += -lclangTidyCERTModule
LDFLAGS += -lclangTidyConcurrencyModule
LDFLAGS += -lclangTidyCppCoreGuidelinesModule
LDFLAGS += -lclangTidyDarwinModule
LDFLAGS += -lclangTidyFuchsiaModule
LDFLAGS += -lclangTidyGoogleModule
LDFLAGS += -lclangTidyHICPPModule
LDFLAGS += -lclangTidyLLVMLibcModule
LDFLAGS += -lclangTidyLLVMModule
LDFLAGS += -lclangTidyLinuxKernelModule
LDFLAGS += -lclangTidyMPIModule
LDFLAGS += -lclangTidyMiscModule
LDFLAGS += -lclangTidyModernizeModule
LDFLAGS += -lclangTidyObjCModule
LDFLAGS += -lclangTidyOpenMPModule
LDFLAGS += -lclangTidyPerformanceModule
LDFLAGS += -lclangTidyPortabilityModule
LDFLAGS += -lclangTidyReadabilityModule
LDFLAGS += -lclangTidyZirconModule
LDFLAGS += -lclangTidyUtils
LDFLAGS += -lclangTransformer
LDFLAGS += -lclangTooling
LDFLAGS += -lclangFrontendTool
LDFLAGS += -lclangFrontend
LDFLAGS += -lclangDriver
LDFLAGS += -lclangSerialization
LDFLAGS += -lclangCodeGen
LDFLAGS += -lclangParse
LDFLAGS += -lclangSema
LDFLAGS += -lclangStaticAnalyzerFrontend
LDFLAGS += -lclangStaticAnalyzerCheckers
LDFLAGS += -lclangStaticAnalyzerCore
LDFLAGS += -lclangAnalysis
LDFLAGS += -lclangARCMigrate
LDFLAGS += -lclangRewrite
LDFLAGS += -lclangRewriteFrontend
LDFLAGS += -lclangEdit
LDFLAGS += -lclangCrossTU
LDFLAGS += -lclangIndex
LDFLAGS += -lclangAST
LDFLAGS += -lclangASTMatchers
LDFLAGS += -lclangLex
LDFLAGS += -lclangBasic
LDFLAGS += -lclang

# *After* clang libs, the llvm libs.
LDFLAGS += $(shell $(LLVM_CONFIG) --ldflags --libs --system-libs)


# ---- Recipes ----
# Pull in automatic dependencies.
-include $(wildcard obj/*.d)

# Compile a C++ source file.
obj/%.o: %.cc
    @mkdir -p $(dir $@)
    $(CXX) -MMD -c -o $@ $(USE_PCH) $< $(CXXFLAGS)

# Sources for 'GetEnclFunc.exe'.
SRCS :=
SRCS += GetEnclFunc.cc

# Objects for 'GetEnclFunc.exe'.
OBJS := $(patsubst %.cc,obj/%.o,$(SRCS))

# Executable.
all: GetEnclFunc.exe
GetEnclFunc.exe: $(OBJS)
    $(CXX) -g -Wall -o $@ $(OBJS) $(LDFLAGS)

# Run program on one input.
out/%: in/% GetEnclFunc.exe
    @mkdir -p $(dir $@)
    ./GetEnclFunc.exe -checks=-*,GetEnclFunc in/$* --
    touch $@

# Run tests.
.PHONY: check
check: GetEnclFunc.exe
check: out/hascall.cc

# Remove test outputs.
.PHONY: check-clean
check-clean:
    rm -rf out

# Remove compile and test outputs.
.PHONY: clean
clean: check-clean
    $(RM) *.exe
    rm -rf obj


# EOF

in/hascall.cc:

// hascall.cc
// Example input for GetEnclFunc check.

int f();

void caller()
{
  // Example of a call within a function.
  f();
}

// Call that is not inside a function.
int x = f();

// EOF

Running the check:

$ make check
./GetEnclFunc.exe -checks=-*,GetEnclFunc in/hascall.cc --
Found ancestor FunctionDecl: 0x5569e54f0f78
FunctionDecl name: caller
Ran out of ancestors.
touch out/hascall.cc
Scott McPeak