EE385b Seminar, April 16, 1997, 12:15 (noon), Gates B08.
http://arith.stanford.edu/ee385b_sched.spr97.html

Title: Ubiquitous Parallelism for Super-linear Speedup, or How to Use Subword Parallel Instructions Effectively
Speaker: Ruby Lee, HP
Date: April 16, 1997, 12:15 (noon)
Place: CS Building, Gates B08. Note new room!

Abstract:

Subword parallelism is a form of small-scale SIMD parallelism that has been embedded into the datapaths and instruction sets of most general-purpose processor architectures. In subword parallelism, an instruction operates on multiple lower-precision subwords packed into a word-oriented datapath, typically of 32- or 64-bit words. This provides very low-cost SIMD parallelism within common microprocessor instruction sets. First introduced in products for general multimedia acceleration by MAX-1 (Multimedia Acceleration eXtensions) for PA-RISC processors in Jan 1994, it is now found in Sun UltraSPARC VIS, HP PA-RISC MAX-2, Intel MMX, and SGI MIPS MDMX, and to a limited extent in the DEC Alpha.

Although subword parallelism is present in everyday microprocessors, its potential to truly bring ubiquitous parallelism into everyday programming depends on the programmer and the compiler. First, we need to familiarize programmers with how to structure algorithms to maximize the speedup possible with packed SIMD data and operations. Second, it is desirable for compilers to be able to recognize opportunities for subword parallelism, with or without formal high-level language extensions.

In this talk, we describe how key multimedia kernels may be organized to use subword parallel instructions. We try to abstract out the generic parallel programming techniques that achieve not only linear, but often superlinear, speedup for the degree of embedded SIMD (subword) parallelism. We also discuss how this maps into standard compiler optimization techniques, and the areas where special multimedia features are exploited.
Time permitting, we may also discuss research issues in subword arithmetic and/or language extensions.

Speaker Bio:

Ruby B. Lee is chief architect in the Computer Systems Organization at Hewlett-Packard, responsible for multimedia architecture as well as processor, systems, and security architecture. She is also a consulting professor of Electrical Engineering at Stanford. Lee joined HP as a founding member of the PA-RISC architecture team in Sept 1981. She has played a key role in the definition and architectural evolution of several generations of PA-RISC processors and systems in HP's technical workstation and business server product lines. Lee led a cross-divisional multimedia team that pioneered the introduction of products for real-time multimedia processing using software on a general-purpose processor, enhanced with a small set of Multimedia Acceleration eXtensions (MAX-1). Lee is the architect for MAX-1 and MAX-2, the multimedia extensions for the 32-bit and 64-bit PA-RISC architectures.

Lee received an A.B. degree (with distinction) from Cornell University, and an M.S. in Computer Science and a Ph.D. in Electrical Engineering from Stanford University. She holds 12 U.S. patents, several foreign patents, and 10 patents pending, on processor architecture, pipelined designs, cache hints, branch optimizations, memory concurrency, and multimedia architecture and algorithms. Her current interests are in media processing architectures, subword parallelism programming, compiler and circuit techniques, ubiquitous parallelism, security architectures, and encryption/decryption.
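
Illustration (not from the talk): the abstract describes one instruction operating on several lower-precision subwords packed into a 32- or 64-bit word. The sketch below emulates that idea in portable C for four 16-bit subwords in a 64-bit word, using wraparound (modular) addition. It is only an assumption-laden illustration of the concept, not the MAX-2, MMX, or VIS instruction semantics, and the names packed_add16, scalar_add16, and LANE_MSB are made up for this example.

```c
#include <stdint.h>

/* Mask selecting the most significant bit of each 16-bit lane. */
#define LANE_MSB 0x8000800080008000ULL

/* Add four 16-bit subwords packed in a 64-bit word, lane by lane,
 * with wraparound and no carry leaking into the neighboring lane.
 * Hypothetical helper: a software stand-in for a packed-add instruction. */
uint64_t packed_add16(uint64_t a, uint64_t b)
{
    /* Add the low 15 bits of every lane; the cleared top bit of each
     * lane absorbs the carry, so nothing crosses a lane boundary. */
    uint64_t low = (a & ~LANE_MSB) + (b & ~LANE_MSB);

    /* Fold the original top bits back in with XOR (a carry-less add). */
    return low ^ ((a ^ b) & LANE_MSB);
}

/* Scalar reference: the same result computed one subword at a time,
 * which is the loop a subword-parallel operation replaces. */
uint64_t scalar_add16(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        uint16_t x = (uint16_t)(a >> (16 * lane));
        uint16_t y = (uint16_t)(b >> (16 * lane));
        uint16_t s = (uint16_t)(x + y);   /* wraps modulo 2^16 */
        result |= (uint64_t)s << (16 * lane);
    }
    return result;
}
```

For example, packed_add16(0x0001FFFF00030004, 0x0001000100010001) yields 0x0002000000040005: the second lane wraps from 0xFFFF to 0x0000 without disturbing the lane above it. The packed version does the work of the four-iteration loop in a handful of full-width operations, which is the kind of saving a hardware subword-parallel instruction provides in a single operation.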