12:15 January 29, 1997
TLBs are critical to overall system performance. A TLB miss requires a trap to the OS and possibly several accesses to uncached memory. Furthermore, TLB misses cannot be overlapped with other work the way a non-blocking cache hides cache misses. As memory latency grows, TLB misses become even more expensive.
Traditional TLB structures use a CAM to hold the virtual page number (VPN) and an SRAM to hold the physical or real page number (RPN) translation. The fully associative design maximizes hit rate for a given number of entries at the expense of area and power. Current designs use as many entries as the cycle time allows. As cycle times shrink, however, the number of entries may have to be reduced to preserve single-cycle access, and the fully associative design is also difficult to pipeline. Hence, this arrangement will actually lead to reduced performance in the future.
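To make the CAM + SRAM organization concrete, the following sketch models a fully associative lookup in software: the CAM's parallel compare of the VPN against every entry becomes a loop over all entries. Entry count, field names, and the insert helper are illustrative assumptions, not details from the talk.

```c
#include <stdint.h>
#include <stdbool.h>

#define TLB_ENTRIES 64   /* illustrative size */

typedef struct {
    bool     valid;
    uint32_t vpn;   /* virtual page number (held in the CAM)       */
    uint32_t rpn;   /* real/physical page number (held in the SRAM) */
} tlb_entry;

static tlb_entry tlb[TLB_ENTRIES];

/* The CAM compares the VPN against all entries in parallel; in this
   software model that parallel match is a loop over every entry. */
bool tlb_lookup(uint32_t vpn, uint32_t *rpn_out)
{
    for (int i = 0; i < TLB_ENTRIES; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            *rpn_out = tlb[i].rpn;
            return true;    /* hit */
        }
    }
    return false;           /* miss: trap to the OS to walk page tables */
}

void tlb_insert(int slot, uint32_t vpn, uint32_t rpn)
{
    tlb[slot] = (tlb_entry){ .valid = true, .vpn = vpn, .rpn = rpn };
}
```

Because any entry can hold any translation, hit rate is maximized, but every lookup pays for a compare across all entries, which is what makes the structure costly in area and power and hard to pipeline.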
This talk will present alternative TLB designs that can function at reduced cycle times by moving to an SRAM + SRAM organization. Techniques such as hashing, pseudo-associativity, and victim TLBs are introduced to maintain system performance. These designs are also more area- and power-efficient and can be pipelined.
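A minimal sketch of one such SRAM + SRAM arrangement follows: a direct-mapped main TLB indexed by a hash of the VPN (tag SRAM plus data SRAM), backed by a small fully associative victim TLB that catches translations recently evicted by conflicts. The hash function, sizes, and swap policy here are illustrative assumptions, not the specific designs presented in the talk.

```c
#include <stdint.h>
#include <stdbool.h>

#define SETS        64   /* illustrative main-TLB size   */
#define VICTIM_SIZE 4    /* illustrative victim-TLB size */

typedef struct { bool valid; uint32_t vpn, rpn; } entry;

static entry main_tlb[SETS];       /* direct-mapped: tag SRAM + data SRAM */
static entry victim[VICTIM_SIZE];  /* small fully associative victim TLB  */

/* Simple XOR-folding hash so nearby pages don't all map to one set. */
static uint32_t hash_index(uint32_t vpn) { return (vpn ^ (vpn >> 6)) % SETS; }

bool tlb_lookup(uint32_t vpn, uint32_t *rpn_out)
{
    entry *e = &main_tlb[hash_index(vpn)];
    if (e->valid && e->vpn == vpn) { *rpn_out = e->rpn; return true; }

    /* Main-TLB miss: probe the victim TLB, which holds recently evicted
       translations; on a victim hit, swap the two entries. */
    for (int i = 0; i < VICTIM_SIZE; i++) {
        if (victim[i].valid && victim[i].vpn == vpn) {
            entry tmp = *e; *e = victim[i]; victim[i] = tmp;
            *rpn_out = e->rpn;
            return true;
        }
    }
    return false;   /* full miss: OS fills the translation */
}

void tlb_fill(uint32_t vpn, uint32_t rpn)
{
    static int vnext = 0;
    entry *e = &main_tlb[hash_index(vpn)];
    if (e->valid) {                    /* displaced entry goes to the victim */
        victim[vnext] = *e;
        vnext = (vnext + 1) % VICTIM_SIZE;
    }
    *e = (entry){ true, vpn, rpn };
}
```

The indexed SRAM lookup is a single RAM read plus one tag compare, so it fits a short cycle and splits naturally into pipeline stages, while the victim TLB recovers most of the hit rate lost by giving up full associativity.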