It depends on how you evaluate the program. For example, with the bloxbatch utility, try the "-logLevel debugDetail@factbus" flag to get a full trace/profile of all rule evaluation. This tells you exactly how the joins are performed (i.e., the result of query optimization). If you want to summarize this profile to see which rules end up taking the most time, you can use Thiago Bartolomei's Python script LogAnalyzer.py, publicly available in the Doop framework here: https://bitbucket.org/yanniss/doop/src/9daaea0b582674603abb2f3e43f73f630ee6d3e1/bin/LogAnalyzer.py
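To give a rough sense of what such a summary does, here is a minimal Python sketch in the spirit of LogAnalyzer.py: it aggregates per-rule evaluation time from a log and ranks rules by total cost. The log-line format and rule names below are invented purely for illustration; the real debugDetail@factbus output is different, and the actual script knows how to parse it.

```python
import re
from collections import defaultdict

# Hypothetical log-line format ("<rule> <time> ms") for illustration only;
# the real factbus trace format differs.
LINE_RE = re.compile(r"^(?P<rule>\S+)\s+(?P<ms>\d+(?:\.\d+)?)\s*ms$")

def summarize(lines):
    """Sum evaluation time per rule and return rules sorted by total cost."""
    totals = defaultdict(float)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            totals[m.group("rule")] += float(m.group("ms"))
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Example log with made-up rule names and timings.
log = [
    "VarPointsTo 12.5 ms",
    "CallGraphEdge 3.0 ms",
    "VarPointsTo 7.5 ms",
]
for rule, ms in summarize(log):
    print(f"{rule}: {ms:.1f} ms")
```

Running this prints the rules ordered by accumulated time, with VarPointsTo (20.0 ms total) on top, which is exactly the kind of "where does the time go" view you want when tuning rule evaluation.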
I don't think there is a general way to compute a fact's provenance, i.e., the set of rules whose evaluation produced a specific derived fact, although experimental facilities for this have existed in the past.