12:15 February 12, 1997
Communication latency is an important component of performance and scalability for parallel applications. This talk investigates the performance of several mechanisms designed to reduce communication latency in scalable shared memory systems.
The basis of communication in shared memory machines is the reading and writing of memory locations, for both data transfer and synchronization. When an invalidate-based cache coherence protocol is used, shared data resides in the producer's cache or memory, and the consumers of the data must fetch it. Prefetching can hide some of the latency associated with fetching data, but there are other mechanisms that aim to transfer the data directly into the consumers' caches. Such mechanisms include line-based locks (e.g., QOLB), deliver, update-based protocols, and message passing mechanisms (e.g., StreamLine).
We use simulation to examine the performance of these mechanisms for various applications and with varying system parameters.