Abstract
This paper describes techniques for combining serialization and reconfiguration to produce efficient convolver designs. Several optimisation techniques, such as restructuring and pipeline morphing, are presented with an analysis of their impact on performance and resource usage. The proposed techniques do not require the basic processing element to be modified. We estimate the performance of a serial design mapped onto a Xilinx Virtex FPGA using a distributed arithmetic core: a convolver of more than 2000 taps operating at 470,000 samples per second can be implemented in one quarter of the logic resources of a Virtex XCV300 device.