I was reading the book "ARM System Developer's Guide" (published by Elsevier) and came across this passage:
The ARM instruction set differs from the pure RISC definition in several ways that make the ARM instruction set suitable for embedded applications:
Variable cycle execution for certain instructions — Not every ARM instruction executes in a single cycle. For example, load-store-multiple instructions vary in the number of execution cycles depending upon the number of registers being transferred. The transfer can occur on sequential memory addresses, which increases performance since sequential memory accesses are often faster than random accesses. Code density is also improved since multiple register transfers are common operations at the start and end of functions.
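To check my understanding of the load-store-multiple point, here is a rough C sketch of how I read it. The cycle formulas are my own assumption, based on ARM7TDMI-style timing (LDM of n registers = nS + 1N + 1I, STM of n registers = (n-1)S + 2N) with zero-wait-state memory so every cycle type costs one clock. The function names are made up for illustration, and other cores will have different timings.

```c
#include <stdint.h>

/* Count the registers named in a 16-bit LDM/STM register list
   (bit i set means Ri is transferred). */
static unsigned reg_count(uint16_t reg_list)
{
    unsigned n = 0;
    while (reg_list) {
        reg_list &= (uint16_t)(reg_list - 1); /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* ASSUMPTION: ARM7TDMI-style timing, every S/N/I cycle costs one clock.
   LDM of n registers = nS + 1N + 1I = n + 2 cycles.
   The extra cycles for loading PC are not modeled here. */
static unsigned ldm_cycles(uint16_t reg_list)
{
    return reg_count(reg_list) + 2;
}

/* ASSUMPTION: same model, STM of n registers = (n-1)S + 2N = n + 1 cycles.
   An empty register list is not a meaningful encoding and is not handled. */
static unsigned stm_cycles(uint16_t reg_list)
{
    return reg_count(reg_list) + 1;
}
```

Under this model, a function prologue that stores {r4, r5, lr} (register list 0x4030) gives stm_cycles(0x4030) == 4, and loading {r0-r3} (register list 0x000F) gives ldm_cycles(0x000F) == 6, so the cost grows by one cycle per extra register.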
Can anyone point out other ARM instructions that take a variable number of cycles to execute?