An RTOS is used to handle program complexity by placing unrelated tasks in different processes, so that they can execute seemingly simultaneously. For example, it might make sense to split application logic, GUI and serial communication into 3 independent processes.
Whether this gives true multi-processing or simulated multi-processing depends on the number of CPU cores available. Traditionally, most RTOSes simulate multi-processing on a single core.
A state machine, on the other hand, is a program design technique, which may or may not have the purpose of splitting up complexity. So it is not necessarily related to an RTOS.
You can however design a "poor man's RTOS" as a kind of finite state machine, where you give each state a certain time slice and expect the state to finish before the slice elapses (or the watchdog will bite). This can give the same real-time behavior as an RTOS, but there will only be one single stack and no "true" context switches.
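As a rough illustration of the "poor man's RTOS" idea, here is a minimal sketch in C. Everything in it is made up for the example: the tick counter `fake_ticks` stands in for a hardware timer, the three states mirror the application/GUI/serial split mentioned earlier, and the budgets are arbitrary. An overrun is only counted here; on a real target you would instead skip kicking the watchdog and let it reset the MCU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tick source: on real hardware this would read a timer. */
static uint32_t fake_ticks;
static uint32_t ticks_now(void) { return fake_ticks; }

enum { STATE_APP, STATE_GUI, STATE_SERIAL, STATE_COUNT };

static uint32_t run_count[STATE_COUNT];
static uint32_t overruns;

/* Each state runs to completion. Here it only burns simulated time. */
static void state_app(void)    { run_count[STATE_APP]++;    fake_ticks += 3; }
static void state_gui(void)    { run_count[STATE_GUI]++;    fake_ticks += 5; }
static void state_serial(void) { run_count[STATE_SERIAL]++; fake_ticks += 2; }

typedef struct {
    void (*run)(void);   /* state handler */
    uint32_t budget;     /* time slice in ticks */
} state_slot;

static const state_slot schedule[STATE_COUNT] = {
    [STATE_APP]    = { state_app,    4 },
    [STATE_GUI]    = { state_gui,    4 },  /* deliberately too small */
    [STATE_SERIAL] = { state_serial, 4 },
};

/* Run one state, check it against its slice, return the next state. */
static unsigned scheduler_step(unsigned current)
{
    assert(current < STATE_COUNT);
    const state_slot *slot = &schedule[current];

    uint32_t start = ticks_now();
    slot->run();
    uint32_t elapsed = ticks_now() - start;

    if (elapsed > slot->budget) {
        overruns++;  /* real system: do not kick the watchdog here */
    } else {
        /* Simulate idling until the end of the slice. */
        fake_ticks = start + slot->budget;
    }
    return (current + 1u) % STATE_COUNT;
}

/* Run a fixed number of slices, starting from the given state. */
static unsigned scheduler_run(unsigned current, unsigned steps)
{
    while (steps--) {
        current = scheduler_step(current);
    }
    return current;
}
```

Calling `scheduler_run(STATE_APP, 6)` walks through all three states twice. Note that nothing is ever preempted: a state that overruns simply delays everything after it, which is exactly the "one stack, no true context switch" limitation.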
Picking bare metal or an RTOS depends a lot on the program complexity. Unless the original program design is state of the art (it rarely is), bare metal programs tend to become a pain once they grow to somewhere between 50k and 100k LOC. In these situations, picking an RTOS from the start would perhaps have been wiser.
On the other hand, if you don't think the program will ever grow that large, bare metal is much easier to work with. An RTOS introduces extra complexity, and extra complexity introduces bugs. The golden rule is to always keep software as simple as possible.