In section 6.4 of "Computer Organization and Design: The Hardware/Software Interface, Sixth Edition" (RISC-V Edition) by David A. Patterson and John L. Hennessy, the book says this about coarse-grained multithreading:
This change relieves the need to have thread switching be extremely fast and is much less likely to slow down the execution of an individual thread, since instructions from other threads will only be issued when a thread encounters a costly stall.
Because a processor with coarse-grained multithreading issues instructions from a single thread, when a stall occurs, the pipeline must be emptied or frozen. The new thread that begins executing after the stall must fill the pipeline before instructions are able to complete.
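As a rough illustration of the start-up cost this quote describes, here is a minimal sketch that assumes a simple in-order pipeline in which one instruction enters per cycle and completes when it reaches the last stage. The function name `completions_after_switch`, the `depth` parameter, and the instruction labels `B0`, `B1`, ... are hypothetical and do not come from the book.

```python
def completions_after_switch(depth, cycles):
    """Simulate an emptied pipeline being refilled by a new thread B.

    Returns one entry per cycle: the instruction that completes in that
    cycle, or None if the last stage is still empty.
    Assumption: one instruction is fetched per cycle and nothing stalls.
    """
    pipeline = [None] * depth  # the pipeline was emptied on the thread switch
    completed = []
    for cycle in range(cycles):
        # Every instruction advances one stage; a new one enters stage 0.
        pipeline = [f"B{cycle}"] + pipeline[:-1]
        # Whatever sits in the last stage completes this cycle.
        completed.append(pipeline[-1])
    return completed


if __name__ == "__main__":
    print(completions_after_switch(depth=5, cycles=7))
```

Under these assumptions, a pipeline with `depth` stages produces no completed instruction for the first `depth - 1` cycles after the switch, which is the fill penalty the quote refers to.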
But for fine-grained multithreading, it does not mention any change to the pipeline when switching threads:
This interleaving is often done in a round-robin fashion, skipping any threads that are stalled at that clock cycle.
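To make the quoted policy concrete, here is a minimal sketch of round-robin thread selection that skips stalled threads, as I understand it. The function name `fine_grained_schedule` and the `stalled_at` mapping (thread id to the set of cycles in which that thread is stalled) are hypothetical names for illustration, not anything defined in the book.

```python
def fine_grained_schedule(num_threads, stalled_at, cycles):
    """Return the thread id chosen to issue in each cycle.

    stalled_at maps a thread id to the set of cycles in which it is stalled.
    An entry of None means every thread was stalled in that cycle.
    """
    issued = []
    last = -1  # no thread has issued yet
    for cycle in range(cycles):
        choice = None
        # Try each thread once, starting with the one after the last issuer.
        for offset in range(1, num_threads + 1):
            candidate = (last + offset) % num_threads
            if cycle not in stalled_at.get(candidate, set()):
                choice = candidate
                break
        issued.append(choice)
        if choice is not None:
            last = choice
    return issued


if __name__ == "__main__":
    # Thread 1 is stalled in cycles 1 and 4; threads 0 and 2 never stall.
    print(fine_grained_schedule(3, {1: {1, 4}}, 6))
```

In this sketch a different thread can be selected every cycle, so instructions from several threads are in flight in the pipeline at the same time.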
Q: Since the book says:
A thread includes the program counter, the register state, and the stack.
and both kinds of multithreading switch threads when a stall occurs, why must coarse-grained multithreading empty (or freeze) the pipeline and then refill it, given that the instructions in its pipeline come from only a single thread, while fine-grained multithreading does not have to?