Digital Design Solutions

for communications and signal processing

Radix-2 CGS FFT Core

This module implements a radix-2 constant geometry structure (CGS) FFT processor with configurable FFT lengths and I/O widths. Figure 1 below shows the signal flow graph for an 8-point FFT. The RTL design exploits the constant geometry structure to maximise logic sharing, at the cost of reduced throughput.
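To illustrate what "constant geometry" means, the following is a minimal software sketch in Python, assuming a Pease-style decimation-in-frequency formulation. It is not taken from the core's RTL, and the flow graph in Figure 1 may order its inputs or outputs differently. The names cgs_fft and dft are hypothetical helpers introduced only for this example. Every stage reads the same address pair (i, i + N/2) and writes the same address pair (2i, 2i + 1); only the twiddle exponent changes from stage to stage, which is the property that lets hardware reuse one butterfly and one addressing pattern.

```python
import cmath


def cgs_fft(x):
    """Radix-2 constant-geometry DIF FFT (illustrative sketch, not the core's RTL)."""
    n = len(x)
    stages = n.bit_length() - 1
    if n < 2 or (1 << stages) != n:
        raise ValueError("length must be a power of two, at least 2")
    half = n // 2
    src = [complex(v) for v in x]
    dst = [0j] * n  # second buffer: a stage cannot overwrite its own inputs
    for s in range(stages):
        for i in range(half):
            # Identical addressing in every stage: read i and i + N/2,
            # write 2i and 2i + 1.
            a, b = src[i], src[i + half]
            k = (i >> s) << s  # twiddle exponent, the only stage-dependent value
            w = cmath.exp(-2j * cmath.pi * k / n)
            dst[2 * i] = a + b
            dst[2 * i + 1] = (a - b) * w
        src, dst = dst, src  # ping-pong the two buffers
    # In this formulation the results come out in bit-reversed order.
    out = [0j] * n
    for r in range(n):
        rev = int(format(r, f"0{stages}b")[::-1], 2)
        out[rev] = src[r]
    return out


def dft(x):
    """Direct O(N^2) DFT, used only as a reference for checking."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8]
    err = max(abs(a - b) for a, b in zip(cgs_fft(data), dft(data)))
    print(f"max error vs direct DFT: {err:.2e}")
```

Because each stage writes to addresses that differ from the ones it reads, the sketch needs two full-length buffers that swap roles after every stage, whereas an in-place algorithm needs only one.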

The throughput of the FFT is given by:
  Throughput, T = fCLK / (log2(N) + 2) samples/sec
  where:
      fCLK is the clock frequency and N is the FFT length
For example, a 1024-point FFT clocked at 100 MHz has a throughput of 8.33 Msps. One of the main disadvantages of this structure is that it requires twice as much storage as in-place algorithms such as the Radix-2 decimation-in-frequency FFT core. Table 1 below shows the resource utilisation for FFTs with 8-bit inputs, full-precision (i.e. unscaled) outputs and 10-bit twiddle factors, implemented on a Xilinx Spartan-6 FPGA.
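The throughput formula above can be evaluated directly. The short Python sketch below does so for a few FFT lengths; the function name cgs_fft_throughput is a hypothetical helper, and the only assumption beyond the formula is that N is a power of two.

```python
def cgs_fft_throughput(f_clk_hz, n):
    """Throughput in samples/sec: T = fCLK / (log2(N) + 2)."""
    log2_n = n.bit_length() - 1
    if n < 2 or (1 << log2_n) != n:
        raise ValueError("N must be a power of two, at least 2")
    return f_clk_hz / (log2_n + 2)


if __name__ == "__main__":
    for n in (256, 1024, 4096):
        msps = cgs_fft_throughput(100e6, n) / 1e6
        print(f"{n}-point FFT at 100 MHz: {msps:.3f} Msps")
```

At 100 MHz this evaluates to 10 Msps for 256 points, about 8.33 Msps for 1024 points and about 7.14 Msps for 4096 points, in line with the throughput row of Table 1.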

References:

  1. James W. Cooley and John W. Tukey, "An Algorithm for the Machine Calculation of Complex Fourier Series", Math. Comp., vol. 19, pp. 297-301, Apr. 1965.

See also:

  XionLogic Open-source FFT Processor Library

Figure 1: Signal Flow Graph for 8-point Radix-2 CGS FFT

Design Parameter              Value                      Unit
Target FPGA                   Xilinx Spartan-6 LX16-2    -
Clock frequency constraint    110                        MHz
Twiddle-factor width          10                         Bits
Input data width              8                          Bits

Resource Type           256-point FFT    1024-point FFT    4096-point FFT
Throughput @ 100 MHz    10 Msps          8.333 Msps        7.142 Msps
Slices                  107              132               138
Flip-Flops              343              392               460
LUTs                    277              331               368
B-RAMs                  2.5              6                 24
DSP48A1s                4                4                 4

Table 1: Radix-2 CGS FFT Processor Resource Utilisation Guide