The ground truth, as we understand it, is what gets used to re-train the NLC or R&R.
The ground truth is question-level training data, e.g.:

"How hot is it today?,temperature"

The question "How hot is it today?" is therefore classified into the "temperature" class.
Once the application is live, real user questions start coming in. Some are identical to questions already in the ground truth, some use similar terms, and some are entirely new. Assume the application has a feedback loop that tells us whether the class (for NLC) or the answer (for R&R) was relevant.
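The feedback records I have in mind look roughly like this (the field names are my own, just to anchor the discussion):

```python
from dataclasses import dataclass

@dataclass
class Feedback:
    question: str   # the question the real user asked
    predicted: str  # NLC class or R&R answer id that was returned
    relevant: bool  # whether the feedback loop confirmed it as relevant

fb = Feedback("How hot is it today?", "temperature", True)
```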
For the new questions, is the approach simply to add them to the ground truth, which is then used to re-train the NLC/R&R?
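To make that concrete, this is the kind of bookkeeping I have in mind for new questions (a sketch only; `retrain_classifier` is a hypothetical wrapper around whatever call re-creates the classifier from the updated CSV):

```python
import csv

def add_new_question(gt_path, question, label):
    """Append a newly observed question with its confirmed class to the ground truth."""
    with open(gt_path, "a", newline="") as f:
        csv.writer(f).writerow([question, label])

# After the feedback loop confirms the right class for a new question:
add_new_question("nlc_ground_truth.csv", "Will I need a coat tonight?", "temperature")
# retrain_classifier("nlc_ground_truth.csv")  # hypothetical: re-create the NLC from the updated file
```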
For the questions with similar terms, do we add them just like the new questions, or do we ignore them, given that a question with similar terms can already score well even when those exact terms were not used to train the classifier?
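One policy I can imagine for the similar-term case (purely my assumption, not something from the docs) is to add such a question only when the classifier misclassified it or returned the right class with low confidence:

```python
CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff, would need tuning

def should_add_to_ground_truth(confidence, predicted_class, confirmed_class):
    """Skip questions the classifier already handles well; keep the ones it struggles with."""
    if predicted_class != confirmed_class:
        return True  # misclassified: clearly worth adding
    return confidence < CONFIDENCE_THRESHOLD  # correct but low confidence: add it too

# Example: classifier got it right with 0.93 confidence -> no need to add
print(should_add_to_ground_truth(0.93, "temperature", "temperature"))  # False
```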
In the case of identical questions, there seems to be nothing to do to the ground truth for NLC; for R&R, however, do we simply increase or decrease the relevance label in the ground truth by 1?
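For that R&R case I imagine something like the following, assuming the relevance file layout sketched above (question followed by answer_id/label pairs); whether this +1/-1 adjustment is the right approach is exactly what I am asking:

```python
import csv

def adjust_relevance(rr_path, question, answer_id, delta, max_label=4):
    """Bump the relevance label for (question, answer_id) up or down by delta."""
    with open(rr_path, newline="") as f:
        rows = list(csv.reader(f))
    for row in rows:
        if row[0] != question:
            continue
        # Remaining cells alternate: answer_id, label, answer_id, label, ...
        for i in range(1, len(row) - 1, 2):
            if row[i] == answer_id:
                row[i + 1] = str(min(max_label, max(0, int(row[i + 1]) + delta)))
    with open(rr_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)

# Feedback says doc42 was relevant to the repeated question -> +1
adjust_relevance("rr_ground_truth.csv", "How hot is it today?", "doc42", +1)
```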
In short, the main question here is what the re-training approach should be for NLC & R&R...