Advanced Computer Architecture

Delayed Branching

When a pipeline processes branches in a straightforward way, at least one cycle remains unused after each taken branch. This is because of the assembly-line-like inertia of pipelining. The instruction slots following a branch are known as branch delay slots.

Delay slots can also appear after load instructions; these are called load delay slots. Branch delay slots are wasted during traditional execution. However, when delayed branching is employed, these slots can be at least partly used.

Principle of Delayed Branching

The add instruction of our program segment, which originally preceded the branch, can be moved into the branch delay slot. With delayed branching, the processor executes the add instruction first, and the branch only takes effect afterwards. Thus, in this example, delayed branching preserves the original execution sequence −

Example:
            add r1, r2, r3;
            b loop; /*unconditional branch*/
      loop: sub

Conditional branches cause the same or higher delays during straightforward pipelined execution, because of the additional operation needed to check the specified condition.

Accordingly, the instruction in the delay slot is always executed, even if the branch is not taken. The branch to the target instruction (sub) takes effect with one pipeline cycle of delay. This cycle is used to execute the instruction in the delay slot (add). Thus delayed branching results in the following execution sequence −

Modified Example:
            b loop; /*unconditional branch*/
            add r1, r2, r3; /*delay slot*/
      loop: sub
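
The effect of the delay slot on execution order can be sketched with a toy interpreter. This is only an illustrative model under simple assumptions, not a description of any real processor: the function names execution_order and useful_work, the tuple layout (label, text, branch target), and the single delay slot per branch are all hypothetical choices made for this sketch.

```python
def execution_order(program, delayed):
    """Return the instructions in the order a toy machine executes them.

    program: list of (label, text, target) tuples; target is None
             for non-branch instructions (hypothetical layout).
    delayed: if True, the instruction right after a branch (its
             delay slot) is executed before the branch takes effect.
    """
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    trace, pc = [], 0
    while pc < len(program) and len(trace) < 20:  # guard against endless loops
        _, text, target = program[pc]
        trace.append(text)
        if target is None:
            pc += 1
            continue
        if delayed and pc + 1 < len(program):
            trace.append(program[pc + 1][1])  # delay slot instruction runs
        pc = labels[target]                   # now the branch takes effect
    return trace


def useful_work(trace):
    """Drop the branch itself, keeping only the non-branch instructions."""
    return [t for t in trace if not t.startswith("b ")]


# Original segment: add precedes the branch.
original = [
    (None, "add r1, r2, r3", None),
    (None, "b loop", "loop"),
    ("loop", "sub", None),
]

# Modified segment: add has been moved into the delay slot.
modified = [
    (None, "b loop", "loop"),
    (None, "add r1, r2, r3", None),
    ("loop", "sub", None),
]

plain = execution_order(original, delayed=False)
with_slot = execution_order(modified, delayed=True)
skipped = execution_order(modified, delayed=False)

print(useful_work(plain))
print(useful_work(with_slot))
print(useful_work(skipped))
```

In this model, the original segment without delayed branching and the modified segment with delayed branching both perform add followed by sub. Running the modified segment on a machine without delay slots would skip the add, which is why the reordering is only valid when the architecture defines delayed branches.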

Disadvantages of Delayed Branching

There are various disadvantages of delayed branching which are as follows −

  •  Delayed branching requires a redefinition of the architecture. 
  •  Delayed branching gives rise to a slight code expansion due to the NOPs that have to be inserted into delay slots that cannot be filled. For instance, where fb is the frequency of branches and ff is the fraction of delay slots filled with useful instructions, 100∗fb∗(1−ff) = 100∗0.2∗(1−0.6) = 8 NOPs would have to be inserted per 100 instructions, giving code that is 8% longer than without delayed branching.
  •  Interrupt processing becomes more difficult. This is because interrupt requests caused by instructions in the delay slot have to be processed differently from those arising from ‘normal’ instructions. When a delay slot instruction initiates an interrupt, the preceding instruction, namely the branch, has already been fetched but not yet processed. This situation is quite different from that which occurs in traditional instruction processing, where all instructions preceding an instruction that causes an interrupt have already been completed.
  •  Additional hardware is required to implement delayed branching.
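
The NOP estimate in the list above can be reproduced with a short calculation. This is a minimal sketch: the function name nops_per_100 is hypothetical, and reading fb as the branch frequency and ff as the fraction of delay slots filled with useful instructions is an assumption based on how the formula is used.

```python
def nops_per_100(branch_freq, fill_ratio):
    """Estimate NOPs inserted per 100 instructions.

    branch_freq: assumed meaning of fb, the fraction of instructions
                 that are branches (one delay slot per branch).
    fill_ratio:  assumed meaning of ff, the fraction of delay slots
                 filled with a useful instruction.
    """
    unfilled = 1 - fill_ratio          # share of delay slots needing a NOP
    return 100 * branch_freq * unfilled


# Values from the text: fb = 0.2, ff = 0.6.
extra = nops_per_100(0.2, 0.6)
print(f"{extra:.1f} NOPs per 100 instructions")
```

With the values from the text, the estimate comes out at 8 NOPs per 100 instructions. The overhead shrinks as more delay slots are filled and reaches zero if every slot holds a useful instruction.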