Digital Design Solutions
for communications and signal processing
Radix-2 CGS FFT Core
This module implements a radix-2 constant geometry structure (CGS) FFT processor with configurable FFT lengths and I/O widths. Figure 1 below shows the signal flow graph for an 8-point FFT. The RTL design exploits the constant geometry structure to maximise logic sharing, at the cost of reduced throughput.
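As an illustration of the constant geometry idea, the following Python sketch models a radix-2 constant geometry FFT in software. It is not taken from the RTL source: the function name cg_fft, the decimation-in-frequency ordering and the final bit-reversal step are assumptions made for this example. Every stage reads the pair (i, i + N/2) and writes its results to (2i, 2i + 1), so the addressing pattern is the same in all stages and only the twiddle index changes.

```python
import cmath


def cg_fft(x):
    """Software model of a radix-2 constant geometry FFT (assumed DIF form).

    Hypothetical reference model for illustration; not the core's RTL.
    """
    n = len(x)
    stages = n.bit_length() - 1
    if n < 2 or (1 << stages) != n:
        raise ValueError("FFT length must be a power of two, at least 2")

    half = n // 2
    # Twiddle factors W_N^k for k = 0 .. N/2 - 1
    twiddle = [cmath.exp(-2j * cmath.pi * k / n) for k in range(half)]

    # Two buffers: the structure is not in-place, so each stage reads
    # from one buffer and writes to the other.
    src = [complex(v) for v in x]
    dst = [0j] * n

    for s in range(stages):
        for i in range(half):
            a = src[i]
            b = src[i + half]
            # Clear the s low bits of i to get this stage's twiddle index
            w = twiddle[(i >> s) << s]
            dst[2 * i] = a + b
            dst[2 * i + 1] = (a - b) * w
        src, dst = dst, src

    # In this DIF form the results come out in bit-reversed order
    out = [0j] * n
    for pos in range(n):
        k = int(format(pos, f"0{stages}b")[::-1], 2)
        out[k] = src[pos]
    return out


if __name__ == "__main__":
    # 8-point example, matching the size of the signal flow graph in Figure 1
    print(cg_fft([1, 0, 0, 0, 0, 0, 0, 0]))
```

Because the read and write addresses never change between stages, a hardware implementation can reuse one butterfly and one address generator for the whole transform. The two buffers in the model also show why the structure needs more storage than an in-place algorithm.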
The throughput of the FFT is given by:
Throughput, T = fCLK / (log2(N) + 2) samples/sec
where:
fCLK is the clock frequency
and N is the FFT length
For example, the throughput of a 1024-point FFT clocked at 100 MHz is
100 MHz / (10 + 2) = 8.33 Msps. One of the main disadvantages of this
structure is that it requires twice as much storage as in-place algorithms
such as the Radix-2 decimation-in-frequency FFT core.
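The throughput formula above can be checked with a few lines of Python. This is a minimal sketch: the function name cgs_fft_throughput and its argument names are made up for the example and do not come from the core's source.

```python
def cgs_fft_throughput(f_clk_hz, n_points):
    """Return throughput in samples/sec using T = fCLK / (log2(N) + 2).

    Hypothetical helper for illustration; n_points must be a power of two.
    """
    stages = n_points.bit_length() - 1  # log2(N) for a power-of-two N
    if n_points < 2 or (1 << stages) != n_points:
        raise ValueError("FFT length must be a power of two, at least 2")
    return f_clk_hz / (stages + 2)


# Evaluate the formula for the FFT lengths listed in Table 1 at 100 MHz
for n in (256, 1024, 4096):
    msps = cgs_fft_throughput(100e6, n) / 1e6
    print(f"{n}-point FFT: {msps:.3f} Msps")
```

The printed values should agree with the throughput row of Table 1 to within rounding of the last digit.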
Table 1 below shows the resource utilisation for 8-bit input, full-precision
output (i.e. unscaled) FFTs with 10-bit twiddle factors implemented on a
Xilinx Spartan-6 FPGA.
References:
- James W. Cooley and John W. Tukey, "An Algorithm for the Machine Calculation of Complex Fourier Series", Math. Comp., vol. 19, pp. 297-301, Apr. 1965.
XionLogic Open-source FFT Processor Library
- Download Verilog-2001 Source Code
- View table of contents
- View installation notes
- View change history
Table 1: Radix-2 CGS FFT Processor Resource Utilisation Guide

Design Parameter | Value | Unit
Target FPGA | Xilinx Spartan-6 LX16-2 | -
Clock frequency constraint | 110 | MHz
Twiddle-factor width | 10 | Bits
Input data width | 8 | Bits

Resource Type | 256-point FFT | 1024-point FFT | 4096-point FFT
Throughput @ 100 MHz | 10 Msps | 8.333 Msps | 7.142 Msps
Slices | 107 | 132 | 138
Flip-Flops | 343 | 392 | 460
LUTs | 277 | 331 | 368
B-RAMs | 2.5 | 6 | 24
DSP48A1s | 4 | 4 | 4