Digital Design Solutions

for communications and signal processing

Radix-2 DIF FFT Core

This module implements a radix-2 decimation-in-frequency (DIF) FFT processor with configurable FFT lengths and I/O widths. Figure 1 below shows the signal flow graph for an 8-point FFT. The RTL design employs an innovative address generator to minimise control logic overhead and maximise logic sharing. It thereby achieves resource utilisation comparable to that of constant-geometry structures, such as the Radix-2 constant-geometry FFT core, but with half the storage requirements.
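As a point of reference for the signal flow graph, the radix-2 DIF algorithm can be sketched in software. The model below is only an illustrative floating-point sketch, not the core's RTL: the function name dif_fft is hypothetical, and it makes no attempt to reproduce the core's address generator, fixed-point widths or twiddle-factor quantisation. It shows the in-place butterflies of each stage followed by the bit-reversal that returns the outputs to natural order.

```python
import cmath


def dif_fft(x):
    """Software model of an in-place radix-2 DIF FFT (hypothetical helper).

    Returns the DFT of x in natural order. len(x) must be a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("FFT length must be a power of two")

    a = [complex(v) for v in x]

    # Each stage halves the butterfly span: N/2, N/4, ..., 1.
    span = n // 2
    while span >= 1:
        block = 2 * span
        for start in range(0, n, block):
            for k in range(span):
                # Twiddle factor W_block^k = exp(-j*2*pi*k/block)
                w = cmath.exp(-2j * cmath.pi * k / block)
                top = a[start + k]
                bot = a[start + k + span]
                # DIF butterfly: sum on the upper leg,
                # twiddled difference on the lower leg.
                a[start + k] = top + bot
                a[start + k + span] = (top - bot) * w
        span //= 2

    # DIF leaves the results in bit-reversed order; undo that here.
    bits = n.bit_length() - 1
    out = [0j] * n
    for i in range(n):
        r = int(format(i, "b").zfill(bits)[::-1], 2) if bits else 0
        out[r] = a[i]
    return out
```

The butterflies overwrite their own inputs, so this model needs only a single N-entry working array. The bit-reversal step is kept separate here for clarity.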

The throughput of the FFT is given by:
  Throughput, T = fCLK / (log2(N) + 2) samples/sec
  where:
      fCLK is the clock frequency in Hz and N is the FFT length in points
For example, the throughput for a 1024-point FFT clocked at 100 MHz is 100 MHz / (10 + 2) = 8.33 Msps. Table 1 below shows resource utilisation for FFTs with 8-bit input, full-precision (i.e. unscaled) output and 10-bit twiddle factors, implemented on a Xilinx Spartan-6 FPGA.
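The throughput formula can be evaluated for other FFT lengths and clock rates with a few lines of code. The following is a minimal sketch that assumes N is a power of two. The function name fft_throughput and its parameter names are illustrative and not part of the core.

```python
def fft_throughput(f_clk_hz, n_points):
    """Evaluate T = fCLK / (log2(N) + 2) in samples/sec (illustrative helper)."""
    if n_points < 2 or n_points & (n_points - 1):
        raise ValueError("FFT length must be a power of two >= 2")
    stages = n_points.bit_length() - 1  # log2(N) for a power of two
    return f_clk_hz / (stages + 2)


if __name__ == "__main__":
    # Same FFT lengths and clock frequency as the throughput row of Table 1.
    for n in (256, 1024, 4096):
        print(f"{n:5d}-point: {fft_throughput(100e6, n) / 1e6:.3f} Msps")
```

At 100 MHz this evaluates to 10.000, 8.333 and 7.143 Msps for 256, 1024 and 4096 points. The last value is 100/14 = 7.142857..., which Table 1 lists truncated as 7.142 Msps.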

The FFT processor design is provided as open-source RTL in Verilog-2001 under the BSD license.

References:

  1. James W. Cooley and John W. Tukey, "An Algorithm for the Machine Calculation of Complex Fourier Series", Math. Comp., vol. 19, pp. 297-301, Apr. 1965.

XionLogic Open-source FFT Processor Library

See also:

Figure 1: Signal Flow Graph for 8-point Radix-2 DIF FFT
  Design Parameter             Value                      Unit
  Target FPGA                  Xilinx Spartan-6 LX16-2    -
  Clock frequency constraint   110                        MHz
  Twiddle-factor width         10                         Bits
  Input data width             8                          Bits

  Resource Type           256-point FFT   1024-point FFT   4096-point FFT
  Throughput @ 100 MHz    10 Msps         8.333 Msps       7.142 Msps
  Slices                  113             137              143
  Flip-Flops              379             438              516
  LUTs                    315             388              438
  B-RAMs                  2               4                16
  DSP48A1s                4               4                4
Table 1: Radix-2 DIF FFT Processor Resource Utilisation Guide