I'm trying to learn about the methods used to achieve instruction-level parallelism (ILP) and the differences between them. My question is: given an instruction set that was originally designed for a processor without instruction-level parallelism, which of these methods can be used to achieve instruction-level parallelism on a new processor, and why/how? The new processor must execute the same instruction set and run the same, unmodified program binaries as the original one, only with better performance. The options are:
1) Out-of-order execution (Tomasulo's algorithm)
2) Pipelining
3) Superscalar
4) VLIW