3

I'm trying to figure out if modern GPUs have a reduced instruction set, or a complex instruction set.

Wikipedia says that it's not the size of the instruction set, but rather how many cycles an instruction takes to complete.

In RISC processors, each instruction can be completed in one cycle.

In CISC processors, it takes several cycles to complete some instructions.

I'm trying to figure out what the case is for modern GPUs.

nabeelr
  • 169
  • 1
  • 6
  • Bad definition, many instructions like division can take a variable number of cycles. Also the meaning of RISC vs CISC is rather trivial, also RISC processors are somewhat CISC. By calling it RISC what are you really trying to say? – Mikhail Nov 24 '13 at 00:33
  • Most commands execute in a pipeline, so they take multiple clock cycles in hardware but look like one to the outside world. – Ethan Nov 24 '13 at 00:35
  • Well, RISC processors usually have very short pipelines don't they? The difference between RISC and CISC is that each instruction is more simplified, and runs in a shorter pipeline allowing the instruction to get completed in fewer cycles. [Source](http://www.seas.upenn.edu/~palsetia/cit595s07/RISCvsCISC.pdf) Am I misunderstanding this? – nabeelr Dec 03 '13 at 19:53

3 Answers

2

If you mean Nvidia then it's clearly RISC-like: most of its GPUs don't even have integer division and modulo operations in hardware; only shifts, bitwise operations, and three arithmetic operations (addition, subtraction, multiplication) are used to implement those two. I can't find the exact code, but this question (modular arithmetic on the gpu) shows that mod expands into a

procedure which implements some sophisticated algorithm (about 50 instructions or even more)

Even PTX, the language of NVVM (the Nvidia virtual machine), uses more operations, some of which are "baked" into a bunch of simpler operations anyway after conversion to one of the native instruction sets (there are different versions of these because of the nature of GPUs and their generations/families, but collectively they are just called SASS).

You can see all the available operations here, each with a description, though the descriptions are very short and not very clear (especially if you don't have a background in machine-level programming, e.g. knowing that "scaled" means the operand is left-shifted, just as in x86's "FSCALE" or a "scale factor"): https://docs.nvidia.com/cuda/cuda-binary-utilities/index.html#instruction-set-ref

If you mean AMD GPUs then there are a lot of instructions and it's not so clear-cut, because some sources say they switched from VLIW to something else around the time the Southern Islands GPUs were released.

Danil
  • 41
  • 4
0

RISC instruction set: the load/store unit is independent from the other units, so loading and storing are done by dedicated instructions.

CISC instruction set: the load/store work is embedded in the instruction's execution routine, therefore the instruction is more complex than a RISC instruction; besides the operation itself, a CISC instruction also performs the load and store stages, and this requires more transistor logic per instruction.

dhokar.w
  • 386
  • 3
  • 6
0

The goal of CISC was to take common coding patterns and accelerate them in hardware. You see this in the constant extensions to the base architecture. See Intel's MMX and SSE, and AMD's 3DNow!, etc. https://en.wikipedia.org/wiki/Streaming_SIMD_Extensions This also makes for good marketing, as you need to upgrade to the new processor to accelerate the newest common tasks, and keeps coders busy constantly translating their code patterns to the new extensions.

The goal of RISC was the opposite. It tried to perform few base functions as fast as possible. The coder then needs to continue to break down their common coding tasks to those simple instructions (although high-level programming languages and code packages/libraries accomplish this for you). RISC continues to survive as the architecture for ARM processors. See: https://en.wikipedia.org/wiki/Reduced_instruction_set_computer

I note that GPUs are similar to the RISC philosophy, in that the goal is to perform as many relatively simple computations as fast as possible. The move toward deep learning created a need for training millions of relatively simple parameters, hence the move back toward a highly parallel, relatively simple architecture. Having both philosophies implemented inside your computer is the best of both worlds.

Maddenker
  • 179
  • 2
  • 7
  • The key point of RISC is making it easy to pipeline. SIMD like MMX and SSE isn't an obstacle to that; those instructions are designed to be run as a single operation by a wide execution unit. Just like ARM NEON SIMD or PowerPC Altivec, or RISC-V vector extensions. CISC has things like `add reg, [mem]` that require two different execution units (load and ALU) for the same instruction. *That's* something RISCs (and GPUs) avoid. GPUs handle SIMD differently from CPUs, having many pipelines instead of wide execution units in a single pipeline, but that's separate from being RISCy. – Peter Cordes Jan 03 '23 at 10:43