An Area/Performance Comparison of Subtractive and Multiplicative Divide/Square Root Implementations

Peter Soderquist; Miriam Leeser

doi:10.1109/ARITH.1995.465366

Abstract

The implementations of division and square root in the FPUs of current microprocessors are based on one of two categories of algorithms. Multiplicative techniques, exemplified by the Newton-Raphson method and Goldschmidt's algorithm, share functionality with the floating-point multiplier. Subtractive methods, such as the many variations of radix-4 SRT, generally use dedicated, parallel hardware. These different approaches give rise to distinct area and performance characteristics, which are explored in this paper. Area comparisons are derived from measurements of commercial and academic hardware implementations. To obtain performance estimates, representative divide/square root implementations are paired with typical add-multiply structures and simulated using data from current microprocessor and arithmetic coprocessor designs. The results suggest that subtractive implementations offer a superior balance of area and performance and, because they operate in parallel with the rest of the FPU, stand to benefit most decisively from improvements in technology and growing transistor budgets. Multiplicative methods lend themselves best to situations where hardware re-use is mandated by area or architectural constraints.
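The two algorithm families named above can be contrasted with a minimal software sketch. This is an illustration only, not the hardware studied in the paper: the function names, the iteration count, and the linear initial estimate are assumptions, and the subtractive example uses plain radix-2 restoring division as a simpler stand-in for radix-4 SRT.

```python
def newton_raphson_divide(a, b, iterations=4):
    """Multiplicative division: refine x ~ 1/b, then multiply by a.

    Assumes b has been scaled into [0.5, 1), as a normalized
    significand would be. Every step uses only multiplies and a
    subtraction, which is why this family can share the multiplier.
    """
    # Linear initial estimate of 1/b on [0.5, 1] (an assumed choice;
    # hardware typically uses a lookup table instead).
    x = 48.0 / 17.0 - (32.0 / 17.0) * b
    for _ in range(iterations):
        # Newton-Raphson step for f(x) = 1/x - b; the error squares each time.
        x = x * (2.0 - b * x)
    return a * x


def restoring_divide(n, d, bits=16):
    """Subtractive division: one quotient bit per shift/compare/subtract step.

    Radix-2 restoring division on non-negative integers with n < 2**bits
    and d > 0. Returns (quotient, remainder). SRT designs follow the same
    digit-recurrence idea but select redundant digits at a higher radix.
    """
    remainder = 0
    quotient = 0
    for i in range(bits - 1, -1, -1):
        # Shift in the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((n >> i) & 1)
        if remainder >= d:
            remainder -= d
            quotient |= 1 << i
    return quotient, remainder
```

The multiplicative routine converges quadratically but occupies the multiplier for every step, whereas the subtractive routine retires a fixed number of quotient bits per step using only a shifter and subtractor, which is what allows it to run on dedicated hardware alongside other operations.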
