I'm using GPT-3 for some experiments where I prompt the language model with tests from cognitive science. The tests are short text snippets. Now I'd like to check whether GPT-3 has already encountered these snippets during training. Hence my question: Is there any way to search GPT-3's training corpora? Can one find out whether a given string is part of them?
Thanks for your help!