
I wrote some C++ code that performs lexical analysis of JavaScript source. The function is shown below. For reference, Kind is an enum class listing the kinds of lexemes, and the definition of toKind follows analyzeLexeme.

auto analyzeLexeme(std::wstring & source_code) -> std::vector<Token> {
    using std::wregex;
    using std::regex_search;
    using std::vector;
    using std::wsmatch;
    using std::wstring;
    using std::string;
    using std::tuple;
    using std::map;

    wsmatch matched;
    Kind    result_kind;

    map<string, bool> context  = { { "in_tmplt", false }, };
    auto              result   = vector<Token>();
    auto              reSearch = [&source_code, &matched] (const wregex & re) {
        return regex_search(source_code, matched, re);
    };

    source_code += L'\0';

    for (; source_code[0]; source_code = matched.suffix()) {
        if (reSearch(kReWhiteSpace)) continue;

        result.push_back({(
                reSearch(kReNumLiteral)                             ? [] { return Kind::NumberLiteral; }                       :
                reSearch(kReStrLiteral)                             ? [] { return Kind::StringLiteral; }                       :
                reSearch(kReTmpltHdLiteral) && !context["in_tmplt"] ?
                    [&] {
                        context["in_tmplt"] = true;
                        return Kind::TemplateHeadLiteral;
                    }                                                                                                          :
                reSearch(kReTmpltTlLiteral) && context["in_tmplt"]  ?
                    [&] {
                        context["in_tmplt"] = false;
                        return Kind::TemplateTailLiteral;
                    }                                                                                                          :
                reSearch(kReTmpltMdLiteral) && context["in_tmplt"]   ? [] { return Kind::TemplateMiddleLiteral; }              :
                reSearch(kReTmpltLiteral)                            ? [] { return Kind::TemplateLiteral; }                    :
                reSearch(kReIdentifierKyword)                        ? [&] { return toKind(matched.str(), Kind::Identifier); } :
                reSearch(kReOperatorBracket)                         ? [&] { return toKind(matched.str()); }                   :
                [] { return Kind::Unknown; }
            )(),
            matched.str(),
        });

        if (result.back().kind == Kind::Unknown) {
            std::wcerr << L"[ERR] : Unknown Token : " << source_code.substr(0, 40) << std::endl;
            // fwprintf(stderr, L"[ERR] : Unknown Token : %s", source_code.substr(0, 40).c_str());
            exit(1);
        }
    }

    return result;
}
// str_to_kind is a map(string to kind).
auto toKind(std::wstring str, Kind default_kind) -> Kind {
    return str_to_kind.count(str) ? str_to_kind.at(str) : default_kind;
}

The error occurs at this expression: reSearch(kReOperatorBracket) ? [&] { return toKind(matched.str()); }. The message is: error: incompatible operand types ('(lambda at ~/projects/lexer.cc:69:72)' and '(lambda at ~/projects/lexer.cc:70:17)').

I do not understand this, because toKind returns Kind, so the lambda [&] { return toKind(matched.str()); } must return a Kind value, and the lambda [] { return Kind::Unknown; } can also only return a value of type Kind.

At first I thought g++ could not parse my code (in Python, the interpreter sometimes fails when I write a line I should not have written). However, after many attempts, I found that this was not the cause either.

If the code compiled, I would debug it with gdb, but I cannot do that because the failure happens at compile time.

csh
  • [This answer](https://stackoverflow.com/a/7477329/11365539) says that each lambda has a different type. Instead use `std::function`, but you should know that `std::function` has a considerable performance penalty. – Warpstar22 Aug 08 '23 at 17:24
  • The problem is not the return type of the lambdas, but the result type of the `?:` operator. The two options must have a common type, two lambdas do not. – BoP Aug 08 '23 at 17:43

1 Answer

Since you use lambdas as operands in a ternary operation, you'll need to wrap those lambdas inside a std::function.

As the error shows, every lambda has its own distinct type, and that type is not std::function. If you want an expression that returns one closure type or another, you need some kind of type erasure.
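
To make the "distinct type" point concrete, here is a minimal sketch. The reduced `Kind` enum and the `pick` function are hypothetical stand-ins, not the question's real code. It also shows one lightweight alternative that I believe applies only to lambdas without captures: unary `+` converts such a lambda to a plain function pointer, so both branches share a type. Lambdas that capture (like the `[&]` ones in the question) have no such conversion, which is why they need `std::function` or a direct call.

```cpp
#include <type_traits>

// Hypothetical, reduced stand-in for the question's Kind enum.
enum class Kind { NumberLiteral, Unknown };

// Chooses between two capture-less lambdas with ?: and calls the winner.
inline Kind pick(bool is_number) {
    auto number  = [] { return Kind::NumberLiteral; };
    auto unknown = [] { return Kind::Unknown; };

    // Same signature and same return type, yet two unrelated closure types.
    static_assert(!std::is_same_v<decltype(number), decltype(unknown)>);

    // Unary + turns each capture-less lambda into a Kind (*)() function
    // pointer, so the two operands of ?: now have a common type.
    Kind (*chosen)() = is_number ? +number : +unknown;
    return chosen();
}
```

Calling `pick(true)` yields `Kind::NumberLiteral` and `pick(false)` yields `Kind::Unknown`. As soon as one of the lambdas captures something, the `+` trick no longer compiles.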

Consider this code:

std::any test = rand() > 10 ? 42 : "a";

Even though you're constructing a std::any, the two operands of the ternary have different types, and neither converts to the other.

This code works though:

std::any test = rand() > 10 ? std::any{42} : std::any{"a"};
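
For anyone who wants to try this, the same idea can be sketched as a small self-contained function. The name `makeValue` is made up for illustration. Each branch is wrapped in `std::any` before the `?:` operator sees it, so both operands already have the same type.

```cpp
#include <any>
#include <string>

// Hypothetical helper: returns an int or a C string wrapped in std::any.
inline std::any makeValue(bool want_number) {
    // return want_number ? 42 : "a";  // rejected: int and const char* share no common type
    return want_number ? std::any{42} : std::any{"a"};
}
```

The string branch stores a `const char*`, so it has to be read back with `std::any_cast<const char*>`, not as a `std::string`.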

However, you seem to call the function on the spot, inline in the expression. Given that, you can also remove all the single-statement lambdas and call the multi-statement ones directly inside the ternary.

Here's how it would look if you call them on the spot:

result.push_back({
  (
    reSearch(kReNumLiteral)
      ? Kind::NumberLiteral
      : reSearch(kReStrLiteral)
        ? Kind::StringLiteral
        : reSearch(kReTmpltHdLiteral) && !context["in_tmplt"]
          ? [&] {
            context["in_tmplt"] = true;
            return Kind::TemplateHeadLiteral;
          }() // here!
          : reSearch(kReTmpltTlLiteral) && context["in_tmplt"]
            ? [&] {
              context["in_tmplt"] = false;
              return Kind::TemplateTailLiteral;
            }() // here again, lambda called.
            : reSearch(kReTmpltMdLiteral) && context["in_tmplt"]
              ? Kind::TemplateMiddleLiteral
              : reSearch(kReTmpltLiteral)
                ? Kind::TemplateLiteral
                : reSearch(kReIdentifierKyword)
                  ? toKind(matched.str(), Kind::Identifier)
                  : reSearch(kReOperatorBracket)
                    ? toKind(matched.str())
                    : Kind::Unknown
  ),
  matched.str(),
});

Here's how it looks if you wrap them all in a type-erasure wrapper like std::function:

result.push_back({
  (
    reSearch(kReNumLiteral)
      ? std::function{[]{ return Kind::NumberLiteral; }}
      : reSearch(kReStrLiteral)
        ? std::function{[]{ return Kind::StringLiteral; }}
        : reSearch(kReTmpltHdLiteral) && !context["in_tmplt"]
          ? std::function{[&] {
            context["in_tmplt"] = true;
            return Kind::TemplateHeadLiteral;
          }}
          : reSearch(kReTmpltTlLiteral) && context["in_tmplt"]
            ? std::function{[&] {
              context["in_tmplt"] = false;
              return Kind::TemplateTailLiteral;
            }}
            : reSearch(kReTmpltMdLiteral) && context["in_tmplt"]
              ? std::function{[]{ return Kind::TemplateMiddleLiteral; }}
              : reSearch(kReTmpltLiteral)
                ? std::function{[]{ return Kind::TemplateLiteral; }}
                : reSearch(kReIdentifierKyword)
                  ? std::function{[&]{ return toKind(matched.str(), Kind::Identifier); }}
                  : reSearch(kReOperatorBracket)
                    ? std::function{[&]{ return toKind(matched.str()); }}
                    : std::function{[]{ return Kind::Unknown; }}
  )(), // call the std::function that wraps a lambda
  matched.str(),
});
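
The std::function variant can also be checked in isolation with a reduced sketch. Everything here (the three-value `Kind`, the `classify` function, the lowercase-only word test) is a hypothetical simplification, not the lexer from the question. The point it illustrates is that a capturing and a capture-less lambda can sit in the same `?:` once both are wrapped in the same `std::function<Kind()>` type.

```cpp
#include <functional>
#include <string>

// Hypothetical, reduced stand-in for the question's Kind enum.
enum class Kind { Identifier, Keyword, Unknown };

// Hypothetical classifier: picks a callable with ?: and invokes it afterwards.
inline Kind classify(const std::string& text) {
    using Action = std::function<Kind()>;  // common type for both branches

    // Toy rule: a "word" is a non-empty run of lowercase ASCII letters.
    const bool is_word =
        !text.empty() &&
        text.find_first_not_of("abcdefghijklmnopqrstuvwxyz") == std::string::npos;

    return (
        is_word
            // This lambda uses the local `text`, so it must capture it.
            ? Action{[&] { return text == "return" ? Kind::Keyword : Kind::Identifier; }}
            : Action{[] { return Kind::Unknown; }}
    )();  // call whichever std::function was selected
}
```

Spelling out `std::function<Kind()>` instead of relying on class template argument deduction is a design choice for readability here; both forms should produce the same wrapper type for these lambdas.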
Guillaume Racicot
  • I think I lacked knowledge of Lambda in C++. Thank you for your kind information. Also, thank you for showing me a really elegant and neat code. I get a lot of inspiration. – csh Aug 09 '23 at 02:22