Instruction-level parallelism

ability of computer instructions to be executed simultaneously with correct results

Instruction-level parallelism (ILP) is a measure of how many operations in a computer program can be performed simultaneously, that is, "in parallel". Consider the following program:

1. e = a + b
2. f = c + d
3. g = e * f

Operation 3 depends on the results "e" and "f", which are calculated by operations 1 and 2, so "g" cannot be calculated until both "e" and "f" have been computed. However, operations 1 and 2 do not depend on any other operation, so they can be computed simultaneously. If we assume that each operation can be completed in one unit of time, then these three instructions can be completed in a total of two units of time. This gives an ILP factor of 3/2 = 1.5, which means the program runs 1.5 times as fast as it would without ILP, where it would need three units of time.
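The calculation above can be sketched in a few lines of Python. This is only an illustration, not how a real processor works: the names `DEPENDENCIES`, `schedule` and `ilp` are made up for this example, and the sketch assumes that every operation takes one unit of time and that there is no limit on how many operations can run at once.

```python
from fractions import Fraction

# Each operation maps to the set of operations whose results it needs.
# The numbers 1, 2, 3 match the three-line program above.
DEPENDENCIES = {
    1: set(),    # e = a + b
    2: set(),    # f = c + d
    3: {1, 2},   # g = e * f
}


def schedule(dependencies):
    """Return {operation: time step}, starting each operation as early as possible."""
    step = {}

    def earliest(op):
        if op not in step:
            # An operation runs one step after the latest of its inputs.
            step[op] = 1 + max((earliest(d) for d in dependencies[op]), default=0)
        return step[op]

    for op in dependencies:
        earliest(op)
    return step


def ilp(dependencies):
    """Number of operations divided by the number of time steps needed."""
    steps = schedule(dependencies)
    return Fraction(len(steps), max(steps.values()))


print(schedule(DEPENDENCIES))  # {1: 1, 2: 1, 3: 2}
print(ilp(DEPENDENCIES))       # 3/2
```

Operations 1 and 2 both land in time step 1 and operation 3 in time step 2, so three operations finish in two steps.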

One of the goals of compiler and processor designers is to use as much ILP as possible. Ordinary programs are written to execute instructions in sequence, one after the other, in the order written by the programmer. ILP allows the compiler and the processor to overlap the execution of multiple instructions, or even to change the order in which instructions are executed.

How much ILP exists in a program depends on the type of application. For example, in graphics and scientific applications the amount can be very large, while in cryptography the amount is much smaller.
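A rough way to see why the amount differs is to compare two made-up workloads. In the sketch below, `independent` stands for a graphics-like task where every pixel is processed on its own, and `chained` stands for a cryptography-like task where every round needs the result of the round before it. Both workloads and the helper names are assumptions for illustration only; real programs are a mix of both patterns, and each operation is again assumed to take one unit of time.

```python
def critical_path(deps):
    """Length of the longest dependency chain, counted in one-unit operations."""
    depth = {}
    # Operations are numbered in program order and only depend on earlier ones.
    for op in sorted(deps):
        depth[op] = 1 + max((depth[d] for d in deps[op]), default=0)
    return max(depth.values())


def ilp(deps):
    """Number of operations divided by the length of the longest chain."""
    return len(deps) / critical_path(deps)


# Hypothetical graphics-like workload: 8 pixels, none depends on another.
independent = {i: set() for i in range(8)}

# Hypothetical cryptography-like workload: 8 rounds, each needs the previous one.
chained = {i: ({i - 1} if i > 0 else set()) for i in range(8)}

print(ilp(independent))  # 8.0
print(ilp(chained))      # 1.0
```

With the same number of operations, the independent workload could in principle finish in one time step, while the chained workload cannot be sped up by ILP at all.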

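Micro-architectural techniques that use ILP include instruction pipelining, superscalar execution, out-of-order execution, register renaming, speculative execution and branch prediction.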

In recent years, ILP techniques have been used to improve performance in conditions where there is a large gap between processor operating frequencies and memory access times. As of 2008, a cache "miss" that has to go to main memory costs several hundred CPU cycles, a much longer latency than when the processor finds the memory location in the cache. ILP techniques have therefore proved insufficient to keep the CPU from waiting for off-chip data. Instead, the industry is moving towards higher levels of parallelism, using techniques such as multiprocessing and multithreading. [1]

References

Other websites