What you are asking for belongs to the domain of Confidence Estimation, now better known in the Machine Translation (MT) community as Quality Estimation, i.e. "assigning a score to MT output without access to a reference translation".
For MT evaluation (using BLEU, NIST or METEOR) you need:
- A hypothesis translation (MT output)
- A reference translation (from a test set)
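To make the dependency on the reference concrete, here is a minimal sketch of clipped unigram precision, one of the ingredients of BLEU. It is a simplification for illustration only (real BLEU combines several n-gram orders and a brevity penalty), and the function name is hypothetical:

```python
from collections import Counter


def clipped_unigram_precision(hypothesis: str, reference: str) -> float:
    """Fraction of hypothesis tokens that also occur in the reference.

    Each token is credited at most as often as it appears in the
    reference ("clipping"), as in BLEU's modified n-gram precision.
    """
    hyp_tokens = hypothesis.lower().split()
    if not hyp_tokens:
        return 0.0
    hyp_counts = Counter(hyp_tokens)
    ref_counts = Counter(reference.lower().split())
    matched = sum(min(count, ref_counts[token])
                  for token, count in hyp_counts.items())
    return matched / len(hyp_tokens)


score = clipped_unigram_precision("the cat sat on the mat",
                                  "the cat is on the mat")
print(round(score, 3))
```

Note that the function cannot be called at all without the second argument, which is exactly the piece that is missing at translation time.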
In your case (real-time translation), you do not have a reference translation. So you will have to estimate the performance of your system from features of the source sentence and the hypothesis translation, and from what you know about the MT process.
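As a rough sketch of what such features can look like, the snippet below computes a handful of simple surface features from a source sentence and its MT output. The feature names and the selection are my own illustration, not the feature set of any particular paper or toolkit, and whitespace tokenization is assumed:

```python
import string


def qe_features(source: str, hypothesis: str) -> dict:
    """Surface features computed without any reference translation."""
    src = source.split()
    hyp = hypothesis.split()

    def avg_token_length(tokens):
        return sum(len(t) for t in tokens) / len(tokens) if tokens else 0.0

    def punctuation_count(text):
        return sum(ch in string.punctuation for ch in text)

    return {
        "src_tokens": len(src),
        "hyp_tokens": len(hyp),
        # Very short or very long output relative to the source is suspicious.
        "length_ratio": len(hyp) / len(src) if src else 0.0,
        "avg_src_token_len": avg_token_length(src),
        "src_punct": punctuation_count(source),
        "hyp_punct": punctuation_count(hypothesis),
        # Low values indicate many repeated words in the output.
        "hyp_type_token_ratio": len(set(hyp)) / len(hyp) if hyp else 0.0,
    }


print(qe_features("Das ist ein Test .", "This is a test ."))
```

Feature vectors like these, paired with human quality judgements for a training set, can then be fed to a regression model of your choice. At run time only the source and the hypothesis are needed to predict a score.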
A baseline system with 17 features is described in:
- Specia, L., Turchi, M., Cancedda, N., Dymetman, M., & Cristianini, N. (2009). Estimating the sentence-level quality of machine translation systems. 13th Conference of the European Association for Machine Translation (pp. 28-37).
- Which you can find here
Quality Estimation is an active research topic. The most recent advances can be followed on the websites of the WMT Conferences. Look for the Quality Estimation shared tasks, for example http://www.statmt.org/wmt17/quality-estimation-task.html