I'm trying to understand the steps that it takes to go through an instruction and their relationship with each oscillator cycle. The datasheet of the PIC18F4321 seems to divide this process into 2 basic steps: fetch and execution. But it does not seem to be consistent when saying which step belongs to which oscillator cycle. For example, it says:

Internally, the program counter is incremented on every Q1; the instruction is fetched from the program memory and latched into the Instruction Register (IR) during Q4.

This sounds odd, because it didn't mention Q2 and Q3. From this alone I would almost be led into thinking that fetching takes 1 oscillator cycle, since it happens in Q4. But reading just a little further, it says that:

The instruction fetch and execute are pipelined in such a manner that a fetch takes one instruction cycle, while the decode and execute take another instruction cycle. However, due to the pipelining, each instruction effectively executes in one cycle.

So now it is telling me that fetching takes Q1 through Q4. Based on that, I would assume that if it were not for pipelining, instructions would take 2 instruction cycles to go through, since a full instruction cycle is for fetching alone. But I understand how in practice pipelining would make it seem like it only takes 1 instruction cycle to go through an instruction.

Still a little bit further, and I believe this is the most confusing part, it says that:

In the execution cycle, the fetched instruction is latched into the Instruction Register (IR) in cycle Q1. This instruction is then decoded and executed during the Q2, Q3 and Q4 cycles. Data memory is read during Q2 (operand read) and written during Q4 (destination write).

Based on this and other sources I have read, it seems like it divides the execution part into decoding, reading, processing and writing (it confuses me because it keeps using the word execution when I don't think it's actually referring to the execution portion of "fetch and execution").

1) Now, when does it do each? It is very clear when it says that read/write will happen in Q2/Q4. So Q3 should be processing?

2) What is the oscillator cycle for decoding?

3) Why do you have to latch the instruction to IR again in Q1 if you just did that in Q4 when you fetched for this same instruction?

somename
  • What are `Q1/Q2/Q3/Q4`? Are those names for clock cycles? There's not even a link to the datasheet you're quoting. It sounds like it might be a 4 or 5 stage pipeline, unless it's clear that instructions have single-cycle latency. (effective latency as part of a dependency chain is a lot shorter than the full pipeline length on more complex CPUs. The branch-mispredict penalty shows the length of the pipeline ahead of the stage where branch instructions execute.) – Peter Cordes Oct 19 '15 at 01:59
  • Sorry, I thought that stating the name of the microcontroller was enough: http://ww1.microchip.com/downloads/en/DeviceDoc/39689b.pdf. Yes, they are names for clock cycles. From the datasheet, the clock input is divided into 4 non-overlapping clocks (Q1 through Q4, which is equivalent to 1 instruction cycle) – somename Oct 19 '15 at 02:37
  • I don't write PIC asm normally, and wasn't sure google would find the right datasheet. But ok, so Q1 through Q4 are quarters of a clock cycle. It sounds to me like some of the language is referring to whole input clocks, while other language is referring to cycles of the quad-multiplied clock. – Peter Cordes Oct 19 '15 at 02:53

1 Answer

Disclaimer: I've never written PIC asm code, let alone done any performance analysis of a PIC. I mostly know about more powerful CPUs, like x86, from reading http://agner.org/optimize/ and stuff on http://realworldtech.com/. This answer is just based on the snippets of the manual you put in your question, because they do make sense to me. I might be completely misinterpreting something.


So in terms of instruction cycles, it's a 2-stage pipeline (fetch | execute). Going by the datasheet text you quoted, each instruction cycle is made up of four oscillator clocks (Q1 through Q4), so the execute stage is subdivided into 4 steps that run at the oscillator rate. A bit like how Pentium 4 had double-pumped execution units (i.e. one pipeline stage that uses a faster clock).
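To make the fetch | execute overlap concrete, here's a minimal sketch in Python. It assumes a straight-line program with no branches, and the three instructions are just an illustrative example I made up, not something from the datasheet. Each row is one instruction cycle: the instruction being fetched in that cycle is the one that executes in the next.

```python
def pipeline_schedule(instructions):
    """Return (cycle, fetching, executing) tuples for a 2-stage fetch|execute pipeline."""
    schedule = []
    n = len(instructions)
    for cycle in range(n + 1):
        # Fetch stage works on instruction `cycle`; execute stage on the one fetched before it.
        fetching = instructions[cycle] if cycle < n else None
        executing = instructions[cycle - 1] if cycle >= 1 else None
        schedule.append((cycle, fetching, executing))
    return schedule

# Hypothetical straight-line program (no branches), chosen only for illustration.
program = ["MOVLW 0x55", "MOVWF PORTB", "ADDLW 0x01"]

for cycle, fetching, executing in pipeline_schedule(program):
    print(f"TCY{cycle}: fetch={fetching!s:<12} execute={executing}")

pipelined = len(pipeline_schedule(program))  # n + 1 instruction cycles with overlap
unpipelined = 2 * len(program)               # fetch, then execute, with no overlap
print(pipelined, unpipelined)
```

For n instructions this gives n + 1 instruction cycles instead of 2n, which is why each instruction "effectively" takes one cycle once the pipeline is full.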


1) So Q3 should be processing?

It sounds like yes: with the operand read in Q2 and the destination write in Q4, the actual processing happens in Q3.

2) What is the oscillator cycle for decoding?

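I don't fully understand the question. It decodes one instruction per instruction cycle. The text you quoted only says the instruction is decoded and executed during Q2, Q3 and Q4; it doesn't give decoding its own Q cycle.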
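To line the quoted Q1 to Q4 activities up against time, here's a rough Python sketch. It assumes each Q cycle lasts one oscillator period, which is how I read the "divided into 4 non-overlapping clocks" description. The Q3 entry is my inference from the question, and the 40 MHz figure is only an example value I picked.

```python
# Activities per Q cycle of the execute instruction cycle, as quoted in the question.
# The Q3 entry is an inference; the quote does not give decode its own Q cycle.
EXECUTE_CYCLE = [
    ("Q1", "latch fetched instruction into IR"),
    ("Q2", "operand read from data memory"),
    ("Q3", "process (inferred)"),
    ("Q4", "destination write to data memory"),
]

def q_windows_ns(fosc_hz):
    """Map each Q cycle to its (start, end) offset in ns within one instruction cycle."""
    tosc_ns = 1e9 / fosc_hz  # one oscillator period; four of them make one instruction cycle
    return {q: (i * tosc_ns, (i + 1) * tosc_ns) for i, (q, _) in enumerate(EXECUTE_CYCLE)}

windows = q_windows_ns(40_000_000)  # 40 MHz is an assumed example oscillator frequency
for q, activity in EXECUTE_CYCLE:
    start, end = windows[q]
    print(f"{q}: {start:5.1f}-{end:5.1f} ns  {activity}")
```

With those assumptions one instruction cycle is 4 oscillator periods long, so the whole read, process, write sequence fits inside a single instruction cycle.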

3) Why do you have to latch the instruction to IR again in Q1 if you just did that in Q4 when you fetched for this same instruction?

It sounds like the PC is incremented in Q1, so during instruction execution it points to the next instruction. In Q4, that next instruction is done being fetched into IR in preparation for executing it next cycle. This is the instruction data itself (i.e. what PC is pointing to). I'm not sure about this part, but this makes sense.
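Here's how I picture that, as a small Python sketch. It's a simplification under my own assumptions: the PC counts instruction indexes rather than byte addresses, there are no branches, and the names `executing` and `fetched` are mine, not the datasheet's. It doesn't settle whether the latch really happens at the end of Q4 or the start of Q1; it only shows that the two descriptions can refer to the same hand-off between consecutive cycles.

```python
def trace(program):
    """Return (cycle, pc, executing, fetched) per instruction cycle for a straight-line program."""
    events = []
    pc = -1         # so that the first Q1 increment lands on index 0
    fetched = None  # word whose fetch completed in Q4 of the previous cycle
    for tcy in range(len(program) + 1):
        pc += 1              # Q1: PC is incremented
        executing = fetched  # Q1: the word fetched last cycle is what executes now
        # Q4: the fetch of the word PC points to completes, ready for the next cycle
        fetched = program[pc] if pc < len(program) else None
        events.append((tcy, pc, executing, fetched))
    return events

for tcy, pc, executing, fetched in trace(["A", "B", "C"]):  # placeholder instruction names
    print(f"TCY{tcy}: PC={pc} executing={executing} fetched_by_Q4={fetched}")
```

In every cycle the PC is one ahead of the instruction being executed, which matches the idea that during execution it already points to the next instruction.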

Peter Cordes