Andrew Zimmerman
7 October 1998

Cache Organizations for On-Chip Multiprocessors

As feature sizes become smaller and chip area increases, integrating multiple processors onto a single chip is becoming increasingly feasible. The fixed area budget of a single-chip multiprocessor imposes a number of design choices and tradeoffs on the implementation. This talk presents results of studies performed to investigate some of these tradeoffs.

The foremost tradeoff is balancing the computational performance of multiple processors against the performance of the memory hierarchy. Results are presented showing that, for low-latency memory hierarchies, dedicating chip area to additional computational elements rather than to larger caches increases system performance. Based on these results, cache organizations for a dual-processor, cluster-based multiprocessor are presented. The results show that both shared caches and shared split caches can further increase system performance over the base case of separate, nonshared caches. Shared split caches realize their gain by reducing contention for the shared cache and by eliminating contention between the private data reference streams of the individual processors. Shared caches also increase system performance, but must incorporate additional features to do so: because a small number of sets in the shared cache suffer heavy contention, address tinting or set associativity must be used to achieve the higher performance.
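As an illustrative aside (not taken from the talk, and assuming only the standard definition of a set-associative cache): the sketch below models interleaved reference streams from two processors that map to the same set of a shared cache. In a direct-mapped cache the two tags evict each other on every access, while a 2-way set-associative cache lets them coexist, removing the conflict misses. The class and function names (`SetAssocCache`, `misses`) and all parameters are hypothetical.

```python
class SetAssocCache:
    """Tiny LRU set-associative cache model; parameters are hypothetical."""

    def __init__(self, num_sets, assoc):
        self.num_sets = num_sets
        self.assoc = assoc
        # Each set is an LRU-ordered list of tags (oldest first).
        self.sets = [[] for _ in range(num_sets)]

    def access(self, addr):
        """Return True on hit, False on miss (with LRU replacement)."""
        index = addr % self.num_sets      # set index from low-order bits
        tag = addr // self.num_sets       # remaining bits form the tag
        lru = self.sets[index]
        if tag in lru:
            lru.remove(tag)
            lru.append(tag)               # move to most-recently-used slot
            return True
        if len(lru) >= self.assoc:
            lru.pop(0)                    # evict the least-recently-used tag
        lru.append(tag)
        return False


def misses(cache, stream):
    """Count misses for a sequence of block addresses."""
    return sum(0 if cache.access(a) else 1 for a in stream)


# Two processors alternating references that land in the same set:
# addresses 0 and 64 share index 0 (with 64 sets) but differ in tag.
stream = [0, 64, 0, 64, 0, 64]
direct = SetAssocCache(num_sets=64, assoc=1)
twoway = SetAssocCache(num_sets=64, assoc=2)
print(misses(direct, list(stream)), misses(twoway, list(stream)))  # prints: 6 2
```

The same pressure can instead be relieved in a direct-mapped shared cache by perturbing the index function per processor (the idea behind schemes such as the address tinting mentioned above), so the two streams no longer compete for the same sets.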