How does GPT-3 (or any other model) go from next-word prediction to doing sentiment analysis, dialog, summarization, translation ... ?
What is the idea and what are the algorithms? How does it work?
E.g. generating a paragraph is just generating the next word, then the next, then the next...
On the other hand, sentiment analysis takes a paragraph of text and decides Good/Bad, which is classification. Extracting a meaningful sentence from a paragraph is an even more different task.
How do we go from next-token prediction to ......?!
Andre, thanks for the replies.
It seems my question was not clear enough, so let me elaborate. Next-token prediction can be trained on a normal text corpus:
word1 w2 w3 w4 .....
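Concretely, something like this (just my rough sketch, not GPT-3's actual pipeline; the toy tokens and block size are made up):

```python
# Minimal sketch: turning a plain token stream into next-token training pairs.
# The token list and block_size are hypothetical placeholders.
tokens = "word1 w2 w3 w4 w5 w6 w7 w8".split()  # stand-in for a tokenized corpus
block_size = 3                                  # context length

pairs = []
for i in range(len(tokens) - block_size):
    context = tokens[i : i + block_size]
    target = tokens[i + block_size]             # the "next token" to predict
    pairs.append((context, target))

for context, target in pairs:
    print(context, "->", target)
```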
Next, sentiment can be trained on sentence => marker => label (sketched in code below):
sent1: word1 w2 w3 w4 ..... marker label1
sent2: word1 w2 w3 w4 ..... marker label2
sent3: word1 w2 w3 w4 ..... marker label3
....
That is no longer plain-corpus next-token generation. It is still next-token generation, but the problem is you need LABELED data!!
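In other words, something like this (again just my rough sketch; the marker and the label words are made up):

```python
# Minimal sketch: casting labeled sentiment data as next-token prediction.
# The " => " marker and the labels "good"/"bad" are hypothetical placeholders.
labeled = [
    ("the movie was wonderful", "good"),
    ("the food was awful",      "bad"),
]

# Each example becomes one training sequence; the model is trained so that the
# token right after the marker is the label, so "classification" is just
# predicting one more token.
sequences = [f"{sentence} => {label}" for sentence, label in labeled]

for s in sequences:
    print(s)
```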
How about text summarization? Let's use keyword extraction (and eventually sentence selection based on those keywords). Again, you need even more complex labeling:
paragraph1 => kw1
paragraph1 => kw2
paragraph2 => kw3
paragraph3 => kw4
It can still be thought of as next-token prediction, but again you need specialized LABELED data.
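Roughly, the training sequences would look like this (my rough sketch again; the paragraphs, keywords, and marker are made up):

```python
# Minimal sketch: keyword-extraction data flattened into next-token sequences.
# The paragraph texts and keywords are hypothetical placeholders.
labeled = [
    ("paragraph1 text ...", ["kw1", "kw2"]),
    ("paragraph2 text ...", ["kw3"]),
    ("paragraph3 text ...", ["kw4"]),
]

sequences = []
for paragraph, keywords in labeled:
    for kw in keywords:
        # One training sequence per (paragraph, keyword) pair; the model
        # learns to emit the keyword as the "next token" after the marker.
        sequences.append(f"{paragraph} => {kw}")

for s in sequences:
    print(s)
```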
So my question is: given ONLY corpus text, how do you do sentiment, text summarization, etc.?
Otherwise GPT-3 is simply a scaled-up DNN with thousands of man-hours of labeling data!!
WHERE is the LEAP?