EUROMICRO Conference

Abstract

Multithreaded processors use a fast context switch to bridge latencies caused by memory accesses or synchronization operations. In the block-multithreaded processor (called Rhamma), load/store, synchronization, and execution operations of different threads of control are executed simultaneously by the appropriate functional units. A fast context switch is performed whenever a functional unit comes across an operation destined for another unit. Switching contexts on each load/store instruction sequence allows a much faster context switch in the execution unit than in previously published designs. The results show the potential of multithreading to spare expensive off-chip cache in a workstation environment. The load/store unit proves to be the principal bottleneck. In particular, the memory cycle time is performance-critical. We show that multithreaded processors benefit more from a shorter memory cycle time than conventional RISC processors do.
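
The hand-off rule described in the abstract can be illustrated with a toy cycle-level model. This is only a sketch under stated assumptions, not the Rhamma pipeline: it models two units instead of three (the synchronization unit is left out), encodes each thread as a string of `E` (execute) and `M` (load/store) operations, charges one cycle per `E` operation and `mem_cycles` cycles per `M` operation, uses FIFO queues, and treats the context switch itself as free. The function name `simulate` and all parameters are hypothetical.

```python
from collections import deque

def simulate(threads, mem_cycles=4):
    """Toy model: returns (total_cycles, context_switches).

    threads    -- list of strings over 'E' (execute op) and 'M' (load/store op)
    mem_cycles -- assumed cost of one load/store op; an 'E' op costs 1 cycle
    """
    cost = {"E": 1, "M": mem_cycles}
    queues = {"E": deque(), "M": deque()}   # ready threads per unit (FIFO, assumed)
    current = {"E": None, "M": None}        # thread currently held by each unit
    busy = {"E": 0, "M": 0}                 # cycles left for the op in flight
    pcs = [0] * len(threads)                # per-thread program counter

    for tid, ops in enumerate(threads):
        if ops:
            queues[ops[0]].append(tid)

    remaining = sum(1 for ops in threads if ops)
    cycle = 0
    switches = 0
    while remaining:
        cycle += 1
        handoffs = []                       # applied at end of cycle
        for unit in ("E", "M"):
            if current[unit] is None and queues[unit]:
                current[unit] = queues[unit].popleft()
                busy[unit] = cost[unit]
            tid = current[unit]
            if tid is None:
                continue                    # unit idles this cycle
            busy[unit] -= 1
            if busy[unit] > 0:
                continue                    # op still in flight
            pcs[tid] += 1
            if pcs[tid] == len(threads[tid]):
                remaining -= 1              # thread finished
                current[unit] = None
            elif threads[tid][pcs[tid]] != unit:
                # Next op belongs to the other unit: switch context
                # and pass the thread on.
                handoffs.append((threads[tid][pcs[tid]], tid))
                current[unit] = None
                switches += 1
            else:
                busy[unit] = cost[unit]     # keep running the same thread
        for target, tid in handoffs:
            queues[target].append(tid)
    return cycle, switches
```

In this model, `simulate(["EME"], 4)` takes 6 cycles, while `simulate(["EME", "EME"], 4)` takes 10 rather than 12, because the second thread's first execute operation overlaps the first thread's memory access. The load/store unit is busy for 8 of those 10 cycles, and lowering `mem_cycles` to 1 brings the two-thread total down to 4, which matches the qualitative point about memory cycle time but says nothing about the paper's measured results.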