8

After some effort, I convinced both the clang compiler and clang-tidy (static analyzer) to warn of a use-after-move situation. (see https://stackoverflow.com/a/74250567/225186)

int main(int, char**) {
    a_class a;
    auto b = std::move(a);
    a.f();  // warns here, for example "invalid invocation of method 'f' on object 'a' while it is in the 'consumed' state [-Werror,-Wconsumed]"
}

However, if I make the variable global (or static or lazily static), there is no more warning.

a_class a;

int main(int, char**) {
    auto b = std::move(a);
    a.f();  // no warns here!
}

See here: https://godbolt.org/z/3zW61qYfY

Is it possible to generalize some sort of use-after-move detection at compile-time for global variables? Or is it impossible, even in principle?


Note: please don't make this discussion about global object (I know it is a bad idea) or about the legality of using moved objects (I know some class are designed for that to be ok). The question is technical, about the compiler and tools to detect a certain bug-prone pattern in the program.


Full working code, compile with clang ... -Wconsumed -Werror -std=c++11 or use clang-tidy. The clang annotation (extensions) help the compiler detect the patterns.

#include<cassert>
#include<memory>

class [[clang::consumable(unconsumed)]] a_class {
    std::unique_ptr<int> p_;

public:
    [[clang::callable_when(unconsumed)]]
    void f() {}

    // private: [[clang::set_typestate(consumed)]] void invalidate() {}  // not needed but good to know
};

a_class a;

int main(int, char**) {
    // a_class a;
    auto b = std::move(a);
    a.f();  // global doesn't warn here
}

Most of the information I could find about this clang extension is from here: Andrea Kling's blog https://awesomekling.github.io/Catching-use-after-move-bugs-with-Clang-consumed-annotations/

alfC
  • 14,261
  • 4
  • 67
  • 118
  • 4
    There is absolutely nothing wrong, whatsoever, with using a moved-from object. The C++ library guarantees that moved-from objects are left in a "valid but unspecified state", with emphasis on "valid". What's valid and well-formed for objects that are not moved, is also valid for a moved-from object. I would consider such a diagnostic, in absence of real undefined behavior, to be more of a hindrance than help. Please show a [mre], that everyone else in the world can cut/paste, compile and run, that you claim produces this diagnostic message? Everything is speculation until then. – Sam Varshavchik Oct 31 '22 at 16:18
  • 3
    In simple cases (like above) it's is possible to detect. However with multiple compilation units, passing variables into external functions etc it can't be done. Given that programs with more than a single cpp files are the norm, it's probably not worth the time of the compiler developers. – Richard Critten Oct 31 '22 at 16:18
  • I'd recommend either make the global const or remove/hide the move constructor from the client code – The Dreams Wind Oct 31 '22 at 16:22
  • 3
    @SamVarshavchik: That sounds wrong. Consider a `std::unique_ptr` and `ptr->Foo()`, which is obviously wrong (in the UB sense) after a move. I am not sure what you intended, but I doubt it was this. – MSalters Oct 31 '22 at 16:22
  • 3
    It is wrong because of a lack of a nullptr check, @MSalters, not because of use after move. It would be equally wrong to do this for a not moved-from `unique_ptr` which is also a nullptr. – Sam Varshavchik Oct 31 '22 at 16:24
  • @MSalters It's not that a `std::` moved from object still contains what's been moved. It's guaranteed to be in its empty state, ready to be used as is. It means you can use a `std::string` as an empty string directly after a move - no `clear()` needed. – Ted Lyngmo Oct 31 '22 at 16:25
  • Global variable which changes state is invitation for bugs and entangled code. – Marek R Oct 31 '22 at 16:26
  • 4
    Unfortunately, @TedLyngmo, a `clear()` is needed because you have no guarantees, whatsoever, that a moved-from `std::string` will be empty. If a short-string optimization is implemented, the moved-from string may be completely unchanged. – Sam Varshavchik Oct 31 '22 at 16:26
  • @SamVarshavchik Darn ... I was told that the standard library classes all gave this extra guarantee. – Ted Lyngmo Oct 31 '22 at 16:27
  • 6
    The guarantee from the C++ library is "valid, but unspecified state". Every one of these four words have a very specific, narrow, strict meaning. Here, "unspecified" means exactly that. A practical joker can design a C++ library whose moved-from `std::string`s hold the complete contents of "Lord Of The Rings" (in static storage, with some kind of a "copy on write" flag set). This would be completely compliant with the C++ standard. – Sam Varshavchik Oct 31 '22 at 16:30
  • 1
    @SamVarshavchik: True, unless a specific type gives a stronger guarantee (e.g. `unique_ptr` does). I think you missed an Universal quantification somewhere: what's valid for **all** objects that aren't moved is also valid for **every** object that's moved-from. This explains `ptr->Foo()`, which is **invalid** when `ptr` is null. – MSalters Oct 31 '22 at 16:35
  • @SamVarshavchik , if the godbolt link doesn't count as MWE I added the code to the question. – alfC Oct 31 '22 at 17:06
  • Yes, just move to rust :)) – OrenIshShalom Oct 31 '22 at 17:07
  • @MarekR , I am absolutely aware of " Global variable which changes state is invitation for bugs and entangled code.". This is actually to wrap a library with global state and make it less error-prone. – alfC Oct 31 '22 at 17:09
  • 1
    @alfC From playing around with `-Wconsumed`, it seems that the implementation is quite incomplete (as also noted in the [documentation](https://clang.llvm.org/docs/AttributeReference.html#consumed-annotation-checking)). Clang is unable to diagnose the issue if even [a single indirection](https://godbolt.org/z/dWaWWETE1) is involved. So, a workaround might be to always use a non-reference local variable by means of a proxy and use a global getter rather than a variable, e.g. [like this](https://godbolt.org/z/cK1oKGT64). Would this be viable in your use case? – Sedenion Nov 29 '22 at 09:37
  • @Sedenion, nice approach, but I guess the problem persist because I can still make another proxy object by calling ` proxy A = get_a_class();` later in main. I tried modifying the code but there is still not detection that the actual global object is moved (i.e. consumed) at the start. Here it is my (failed attempt) https://godbolt.org/z/bcnoeKe7r – alfC Nov 29 '22 at 09:49
  • @alfC Yes, but attempting to detect this becomes impossible at the latest when you access the global from more than one function. I.e. in [this code](https://godbolt.org/z/GsGe7b5Wb), `a.f()` is only invalid if `test1()` was called before `test2()`. In general, there is no way for the compiler to know this. So I'd argue that the best you can do in C++ is to warn about use-after-move only within individual functions, but never across functions. So, in the proxy approach, you should call `get_a_class()` only once per function. – Sedenion Nov 29 '22 at 09:56
  • @Sedenion, I did some experiments earlier and the clang feature was able to see consumed objects across functions. (e.g. moved inside a function, used later after the function). I know it could get eventually confused, but it works in principle across functions. So, using inside a single function is not the problem. The only limitation that I found is that the object cannot be global for it to work (either static inside a function or just a true global variable.) I think you have good arguments not to abuse or rely on this feature (yet). – alfC Nov 29 '22 at 10:21
  • 1
    It also can't be a [reference in certain cases](https://godbolt.org/z/9f47ePbo5), making any workarounds I can think of invalid. – Sedenion Nov 29 '22 at 10:25
  • @Sedenion, also notice that the feature works through indirection https://godbolt.org/z/nqnd4eTa1 , at least partially, _usage_ through references seems to trip it though. So yep, globals and references seem to be a limitation. Usually I don't fiddle with experimental features but I feel this one is a game changer for C++ and perhaps its only lifeline to survive Rust. – alfC Nov 29 '22 at 10:25
  • 1
    Note also that the guarantee "valid, but unspecified state" is for standard library types. – Jason Nov 29 '22 at 11:00
  • yes, that also, but it would be useful to mark variables with arbitrary types as consumed, including standard library types, not just types marked use `clang::consumable`. I think this eventually will come to standard C++. – alfC Nov 29 '22 at 11:09
  • @sSdenion To me it seems that if a function _leaves_ a global variable in an unspecified state that is likely a bug in the program. So, in addition to detecting use after move inside a function I would be satisfied with diagnostics for move-ing the global variable and not setting the state afterwards in _that function_. – Hans Olsson Nov 30 '22 at 08:30
  • @HansOlsson, exactly, that is why I thought it could be a good idea to detect it by the compiler. – alfC Dec 01 '22 at 06:51
  • @alfC So, considering all the comments above, I guess the answer to your question is unfortunately that it is simply not implemented. The [most recent commit to LLVM that actually improved the warning is apparently from 2015](https://github.com/llvm/llvm-project/commits/b0561b3346e7bf0ae974995ca95b917eebde18e1/clang/lib/Analysis/Consumed.cpp), i.e. 7 years old. So I guess there is no broader interest in improving things any time soon. – Sedenion Dec 03 '22 at 16:23
  • 2
    The code comments state that the analysis for checking consumed properties is intra-procedural. You may be able to use the clang static analyzer for CTU (cross translation unit) analysis (why I'm writing this as a comment and not an answer). Unfortunately, I can't find a flag such as -lwhole-program, which might inform clang to allow info about globals to be found. It seems like the ball gets dropped in ConsumedStmtVisitor::checkCallability(...) where VarState of CS_None dodges the warning call as well as a matching state. My guess is globals have CS_None as their state at that point. – Mark H Dec 04 '22 at 00:16
  • 1
    Also, it's important to note that in your godbolt example, if you declare 'a' locally and then remove the 'auto b' line (such that you declare a and then attempt to call a.f()), you'll still have an 'invalid invocation' error, because you need a constructor for a_class that uses return_typestate(unconsumed) – Mark H Dec 04 '22 at 00:20
  • @MarkH thank you for your observations. i am ready to accept your technical llvm explanation as an answer. – alfC Dec 04 '22 at 00:59
  • @MarkH, I think the annotated default constructor is not necessary because how the class was declared as `[[clang::consumable(` **unconsumed** `)]]`. But I could be wrong. – alfC Dec 04 '22 at 10:02
  • Two interesting things: 1) This comment that it's unmaintained, so probably unfit for use: https://github.com/llvm/llvm-project/issues/57766#issuecomment-1251222959 2) I tried keeping a static class unordered_set of pointers to instances, but it appears the consumed state logic couldn't follow calling methods that way like it can a direct a.invalidate() call. – Mark H Dec 05 '22 at 07:06
  • 1
    clang also seems to behave similarly with respect to uninitialized global variables, ex: https://godbolt.org/z/f4aKo3r3j . I believe the uninitialized analysis is fed in the same way as the consumed analysis. Can't find the fundamental reason why globals are excluded, though. Looking at the AST, I can see they are in the same TranslationUnitDecl, but global variables are declared in VarDecl's outside of any FunctionDecl. – Mark H Dec 05 '22 at 07:39
  • @MarkH, thanks for your comments. Your investigation is the closest to an answer here. i.e. the analysis is in principle possible but not currently implemented for globals in clang for some reason (maybe it is hard to implement to the point that is mildly useful). – alfC Dec 05 '22 at 08:06
  • @MarkH, I think a global variable is always initialized (e.g. to zero) because its value is contained in the binary. That is why there no warning for the global `x` in your example. I think that is how compilers work if I am not mistaken, – alfC Dec 05 '22 at 08:08

1 Answers1

0

You could use cppcheck with the performance and style checks. With cppcheck --enable=performance,style file.cpp you could check your file.

For example in the following code (simple use-after-move):

#include <utility>

// Declare a global variable
int globalVariable = 0;

int main() {
  // Move the global variable to a local variable
  int localVariable = std::move(globalVariable);

  // Access the global variable after it has been moved
  return globalVariable;
}

it would yield [my_file.cpp:8]: (performance) Access to moved-from object 'globalVariable'. and therefore detecting the use-after-move case! :)

J. M. Arnold
  • 6,261
  • 3
  • 20
  • 38