FLAN-T5's ability to generate summaries for the dialogues in the DialogSum dataset is a result of its multitask fine-tuning. During multitask fine-tuning, FLAN-T5 was trained on a diverse range of tasks, including summarization, review rating, code translation, and entity recognition, among others. This training pairs each task with examples and natural-language instructions that show the model how to respond appropriately.
In the case of the DialogSum dataset, fine-tuning has taught FLAN-T5 to recognize and respond to prompts that explicitly ask for a summary of a given conversation. The fine-tuning data likely contains many examples in which the model learned to generate summaries from prompts such as "Summarize the conversation," "Briefly summarize the dialogue," or other similar phrasings.
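To make this concrete, here is a minimal sketch of prompting FLAN-T5 for dialogue summarization with the Hugging Face transformers library. The google/flan-t5-base checkpoint, the short invented dialogue, and the exact prompt wording are illustrative assumptions, not the templates actually used during fine-tuning.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Assumption: the public google/flan-t5-base checkpoint stands in for "FLAN-T5".
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# An invented dialogue in the #Person1#/#Person2# style used by DialogSum.
dialogue = (
    "#Person1#: I'd like to book a table for two at seven tonight.\n"
    "#Person2#: Certainly. May I have your name, please?\n"
    "#Person1#: It's Chen, C-H-E-N."
)

# Instruction-style prompt of the kind described above; the wording is illustrative.
prompt = f"Summarize the conversation.\n\n{dialogue}\n\nSummary:"

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=60)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Run this way, the output should read as a short summary of the reservation dialogue rather than a continuation of it, because the instruction frames the input as a summarization task.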
Because these instructions recur throughout the training data, the model learns to associate such prompts with the task of generating summaries. As a result, when FLAN-T5 is presented with a conversation from the DialogSum dataset and a prompt that explicitly asks for a summary, its fine-tuned behavior directs it to summarize rather than to perform any of the other tasks it learned during multitask fine-tuning.
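The same point can be seen by holding the conversation fixed and changing only the instruction. The sketch below again assumes google/flan-t5-base, here through the transformers pipeline API; the second, entity-style instruction is an illustrative phrasing, not a known fine-tuning template.

```python
from transformers import pipeline

# Assumption: google/flan-t5-base via the text2text-generation pipeline.
flan = pipeline("text2text-generation", model="google/flan-t5-base")

# The same invented dialogue as above.
dialogue = (
    "#Person1#: I'd like to book a table for two at seven tonight.\n"
    "#Person2#: Certainly. May I have your name, please?"
)

# Only the instruction changes; it alone determines which task the model performs.
for instruction in [
    "Briefly summarize the dialogue.",
    "List the people mentioned in the conversation.",
]:
    result = flan(f"{instruction}\n\n{dialogue}", max_new_tokens=60)
    print(instruction, "->", result[0]["generated_text"])
```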
In essence, FLAN-T5's ability to generate summaries on the DialogSum dataset is a product of its training history and the repeated reinforcement of summarization prompts during fine-tuning. This targeted training means FLAN-T5 can respond appropriately to summarization instructions even for conversations it has never encountered before.