Our C++ Test Coverage tool reports test coverage for template bodies, at least for templates defined in the files you tell it to cover.
It doesn't distinguish between instantiations of a template; coverage is recorded against the template body, whichever instantiation executes it.
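To illustrate what "not distinguishing instantiations" means in practice, here is a minimal hand-written sketch. The names `g_probes`, `clamp_to_zero`, `covered_branches` and `demo` are hypothetical, and the probe array is only an assumption about how per-body coverage could be stored; it is not the tool's actual instrumentation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical probe storage: one flag per branch of the template *body*,
// shared by every instantiation. Not the tool's real data layout.
static std::array<bool, 2> g_probes{};

template <typename T>
T clamp_to_zero(T value) {
    if (value < T{}) {
        g_probes[0] = true;   // "negative" branch was executed
        return T{};
    }
    g_probes[1] = true;       // "non-negative" branch was executed
    return value;
}

// Number of branches in the template body marked as executed.
std::size_t covered_branches() {
    std::size_t n = 0;
    for (bool hit : g_probes) n += hit ? 1 : 0;
    return n;
}

// Each instantiation alone exercises only one branch, but the merged
// record shows both branches of the template body as covered.
void demo() {
    clamp_to_zero<int>(-5);      // marks probe 0
    clamp_to_zero<double>(2.5);  // marks probe 1
    assert(covered_branches() == 2);
}
```

In this sketch, a report based on `g_probes` says the body is fully covered, even though `clamp_to_zero<int>` never took the non-negative branch and `clamp_to_zero<double>` never took the negative one.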
If you have a multi-threaded application, the tool will record the branches executed by all threads, provided you configure it to use flags that are atomically writable (typically the natural word size of the CPU, 32 or 64 bits). If you don't do this, threads can race when updating the coverage flags and you can lose a bit of coverage. This isn't a defect of the tool; it's a consequence of unsynchronized access to the storage holding probe data.
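The following sketch shows why word-sized, atomically writable flags avoid lost coverage. It is an illustration under stated assumptions, not the tool's generated code: `g_flags`, `hit`, `run_workers` and `kProbeCount` are hypothetical names, and `std::atomic<unsigned>` stands in for "a flag the CPU can write atomically".

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

constexpr std::size_t kProbeCount = 64;

// Assumed layout: one word-sized flag per probe. Static storage is
// zero-initialized, so every probe starts as "not executed".
static std::array<std::atomic<unsigned>, kProbeCount> g_flags{};

// Marking a probe is a single atomic store. Contrast this with packing
// many probes into one plain word and doing `word |= (1u << bit)`: that
// is an unsynchronized read-modify-write, so two threads setting
// different bits can overwrite each other and one bit is lost.
inline void hit(std::size_t probe) {
    g_flags[probe].store(1u, std::memory_order_relaxed);
}

// Start `thread_count` workers that each mark a disjoint stripe of
// probes, then return how many probes were recorded as executed.
std::size_t run_workers(unsigned thread_count) {
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < thread_count; ++t) {
        workers.emplace_back([t, thread_count] {
            for (std::size_t p = t; p < kProbeCount; p += thread_count) {
                hit(p);
            }
        });
    }
    for (auto& w : workers) w.join();

    std::size_t covered = 0;
    for (auto& f : g_flags) covered += f.load(std::memory_order_relaxed);
    return covered;
}
```

Because each probe only ever goes from 0 to 1 and each store touches exactly one flag, no thread can undo another thread's update, so every executed probe is recorded regardless of how the threads interleave.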
For race detection, OP needs to find a race detection tool; test coverage tools won't do this.