1

I am working on a problem where I am attempting to create different scenarios in different C programs such as

  • Data Hazard
  • Branch Evaluation
  • Procedure Call

This is in an attempt at learning pipelining and the different hazards that come up.

So I am writing simple C programs and disassembling to assembly language to see if a hazard gets created. But I cannot figure out how to create these hazards. Do yall have any idea how I could do this? Here is some of the simple code I have written.

I compile using.

gcc -g -c programName.c -o programName.o
gcc programName.o -o programName
objdump -d programName.o > programName.asm

Code:

#include <stdio.h>
int main()
{
    int i = 0;
    int size = 5;
    int num[5] = {1,2,3,4,5};
    int sum=0;
    int average = 0;

    for(i = 0; i < size; i++)
    {
        sum += num[i];
    }

    average=sum/size;

    return 0;
}

...and here is the assembly for that.

average.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <main>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   c7 45 f0 00 00 00 00    movl   $0x0,0xfffffffffffffff0(%rbp)
   b:   c7 45 f4 05 00 00 00    movl   $0x5,0xfffffffffffffff4(%rbp)
  12:   c7 45 d0 01 00 00 00    movl   $0x1,0xffffffffffffffd0(%rbp)
  19:   c7 45 d4 02 00 00 00    movl   $0x2,0xffffffffffffffd4(%rbp)
  20:   c7 45 d8 03 00 00 00    movl   $0x3,0xffffffffffffffd8(%rbp)
  27:   c7 45 dc 04 00 00 00    movl   $0x4,0xffffffffffffffdc(%rbp)
  2e:   c7 45 e0 05 00 00 00    movl   $0x5,0xffffffffffffffe0(%rbp)
  35:   c7 45 f8 00 00 00 00    movl   $0x0,0xfffffffffffffff8(%rbp)
  3c:   c7 45 fc 00 00 00 00    movl   $0x0,0xfffffffffffffffc(%rbp)
  43:   c7 45 f0 00 00 00 00    movl   $0x0,0xfffffffffffffff0(%rbp)
  4a:   eb 10                   jmp    5c <main+0x5c>
  4c:   8b 45 f0                mov    0xfffffffffffffff0(%rbp),%eax
  4f:   48 98                   cltq   
  51:   8b 44 85 d0             mov    0xffffffffffffffd0(%rbp,%rax,4),%eax
  55:   01 45 f8                add    %eax,0xfffffffffffffff8(%rbp)
  58:   83 45 f0 01             addl   $0x1,0xfffffffffffffff0(%rbp)
  5c:   8b 45 f0                mov    0xfffffffffffffff0(%rbp),%eax
  5f:   3b 45 f4                cmp    0xfffffffffffffff4(%rbp),%eax
  62:   7c e8                   jl     4c <main+0x4c>
  64:   8b 55 f8                mov    0xfffffffffffffff8(%rbp),%edx
  67:   89 d0                   mov    %edx,%eax
  69:   c1 fa 1f                sar    $0x1f,%edx
  6c:   f7 7d f4                idivl  0xfffffffffffffff4(%rbp)
  6f:   89 45 fc                mov    %eax,0xfffffffffffffffc(%rbp)
  72:   b8 00 00 00 00          mov    $0x0,%eax
  77:   c9                      leaveq 
  78:   c3                      retq   

Would appreciate any insight or help. Thanks!

diaz994
  • 354
  • 3
  • 13
  • 1
    What do you mean by "data hazard"? For DSP chips, this is running two data sources onto an internal bus when only one is allowed; you get it because DSP chips have as little as possible in them, and that forces you, the programmer, to avoid such problems, For x86 chips... they have tons of transistors; I don't think they will allow a data hazard to occur. Under the assumption that the GCC compiler is pretty good, why do think it will generate code that has a data hazard defined this way? – Ira Baxter Oct 13 '14 at 03:16
  • Your other possible definition are two parallel computations that pass data in an unsafe way. A single C program isn't parallel, so can't have this problem. What exactly do you think you are hoping to find? – Ira Baxter Oct 13 '14 at 03:19
  • That is exactly what I am trying to figure out. I am forcefully trying to create a data hazard such as ADD R1, R4, R5 SUB R6, R2, R1 there i have created a data hazard but I cannot recreate one with the gcc compiler. its for a computer architecture course. – diaz994 Oct 13 '14 at 03:28
  • 1
    "Exactly"? Which definition? I suggested *two*. Your response confuses; you show x6 code, and then you talk about an instruction ADD R1, R4, R5 which is not an x86 instruction, so you act like you are talking about two separate things. Are you clear these are separate things? You aren't going to get a hazard in the x86 instruction set. You need to be clearer about your problem formulation. – Ira Baxter Oct 13 '14 at 04:28

1 Answers1

1

Since this is homework, I'm not going to give you a straight answer, but some food for thought to push you in the right direction.

x86 is a terrible ISA to be using to try and comprehend pipelining. A single x86 instruction can hide two or three side-effects, making it difficult to tease out how a given instruction would perform in even the simplest of pipelines. Are you sure you're not provided a RISC ISA to use for this problem?

Put your loop/hazard code into a function and preferably randomize the creation of the array. Make the array much longer. A good compiler will basically figure out the answer otherwise and remove most of the code you wrote! For reasons I don't understand it's putting your variables in memory.

A good compiler will also do things such as loop unrolling in attempt to hide data hazards and get better code scheduling. Learn how to defeat that (or if you can, give the flag the compiler telling it to NOT do those things if messing around with the compiler is allowed).

The keyword "volatile" can be very helpful in telling the compiler to not optimize around/away certain variables (it tells the compiler this value can change at any moment, so don't be clever and optimize code with it and also don't keep the variable inside the register file).

A data hazard means the pipeline will stall waiting on data. Normally instructions get bypassed just in time, so no stalling occurs. Think about which types of instructions may not be able to be bypassed and could cause a stall on a data hazard. This is dependent on the pipeline, so code that stalls for a specific processor may not stall for another. Modern out-of-order Intel processors are excellent at avoiding these stalls and compilers are great at re-scheduling code so they won't occur even on an in-order core.

Chris
  • 3,827
  • 22
  • 30