OpenCL Fast Fourier Transform
Eric Bainville - May 2010, updated March 2011

Writing OpenCL code for single and double precision
Using the double precision floating-point type double in OpenCL kernels requires an extension. At the time of writing (AMD APP SDK 2.3), AMD does not provide a fully compliant cl_khr_fp64 extension, but provides the cl_amd_fp64 extension instead. Our kernels start with the following code to handle single or double precision:
#if CONFIG_USE_DOUBLE

  #if defined(cl_khr_fp64)    // Khronos extension available?
    #pragma OPENCL EXTENSION cl_khr_fp64 : enable
  #elif defined(cl_amd_fp64)  // AMD extension available?
    #pragma OPENCL EXTENSION cl_amd_fp64 : enable
  #endif

  // double
  typedef double real_t;
  typedef double2 real2_t;
  #define FFT_PI 3.14159265358979323846
  #define FFT_SQRT_1_2 0.70710678118654752440

#else

  // float
  typedef float real_t;
  typedef float2 real2_t;
  #define FFT_PI 3.14159265359f
  #define FFT_SQRT_1_2 0.707106781187f

#endif
The OpenCL C compiler defines a macro for each available extension, for example cl_khr_fp64. Testing this macro tells us whether the extension can be enabled with #pragma OPENCL EXTENSION cl_khr_fp64 : enable. The definition of CONFIG_USE_DOUBLE is passed as a compilation option to clBuildProgram (for example -D CONFIG_USE_DOUBLE=1).
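As an illustration of how the precision switch could be passed from the host, here is a minimal sketch in C. The helper name fft_build_options and the buffer size are our own assumptions, not part of the original code. The OpenCL call itself is only shown in a comment so that the sketch compiles without the OpenCL headers.

```c
#include <stdio.h>

/* Hypothetical helper (name is an assumption): writes the build option
   that selects the precision into buf. Returns the number of characters
   written, as snprintf does. */
static int fft_build_options(char *buf, size_t size, int use_double)
{
    /* "#if CONFIG_USE_DOUBLE" in the kernel source is true for 1, false for 0 */
    return snprintf(buf, size, "-D CONFIG_USE_DOUBLE=%d", use_double ? 1 : 0);
}

/* Possible use on the host, assuming program and device already exist:

       char options[64];
       fft_build_options(options, sizeof options, 1);
       cl_int err = clBuildProgram(program, 1, &device, options, NULL, NULL);
*/
```

Passing an explicit 0 or 1 keeps the kernel source unchanged for both precisions; only the options string differs between the two builds.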
In the kernel code, we will use the real_t and real2_t types instead of float or double (and float2 or double2), and the FFT_... constants instead of literal values.
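To show what precision-independent code can look like, here is a small sketch of two complex helpers written only in terms of real_t, real2_t and FFT_SQRT_1_2. It is plain C so it can be compiled on the host: the struct is a stand-in for the OpenCL float2/double2 vector types, and the default value of CONFIG_USE_DOUBLE is an assumption. The helper names fft_mul and fft_mul_w8 are hypothetical and are not taken from the kernels of this series.

```c
/* Precision switch mirroring the kernel preamble (host-side stand-in). */
#ifndef CONFIG_USE_DOUBLE
#define CONFIG_USE_DOUBLE 0   /* assumption: default to single precision */
#endif

#if CONFIG_USE_DOUBLE
typedef double real_t;
#define FFT_SQRT_1_2 0.70710678118654752440
#else
typedef float real_t;
#define FFT_SQRT_1_2 0.707106781187f
#endif

/* Stand-in for the OpenCL float2/double2 types: x = real, y = imaginary. */
typedef struct { real_t x, y; } real2_t;

/* Complex product a*b (hypothetical helper name). */
static real2_t fft_mul(real2_t a, real2_t b)
{
    real2_t r;
    r.x = a.x * b.x - a.y * b.y;
    r.y = a.x * b.y + a.y * b.x;
    return r;
}

/* Product of a by exp(-i*pi/4) = (1 - i)/sqrt(2) (hypothetical helper name). */
static real2_t fft_mul_w8(real2_t a)
{
    real2_t r;
    r.x = FFT_SQRT_1_2 * (a.x + a.y);
    r.y = FFT_SQRT_1_2 * (a.y - a.x);
    return r;
}
```

Since float2 and double2 expose the same .x and .y components, the function bodies should carry over to OpenCL C once the struct is replaced by the typedefs of the preamble above.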
We are now ready to start experimenting with OpenCL FFT kernels, beginning with the Radix-2 kernel.