Class CLCapabilities
- java.lang.Object
-
- org.lwjgl.opencl.CLCapabilities
-
public class CLCapabilities extends java.lang.Object
Defines the capabilities of an OpenCL platform or device.
The instance returned by CL.createPlatformCapabilities(long) exposes the functionality present on either the platform or any of its devices. This is unlike the PLATFORM_EXTENSIONS string, which returns only platform functionality, supported across all platform devices.
The instance returned by CL.createDeviceCapabilities(long, org.lwjgl.opencl.CLCapabilities) exposes only the functionality available on that particular device.
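The difference between the two scopes can be pictured as a set union versus a set intersection. The following is a minimal, self-contained sketch that does not call LWJGL; the class name, method names, and the sample extension sets are hypothetical and only model the behaviour described above for the devices of one platform.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class CapabilityScopeSketch {
    /** Extensions present on at least one device (the scope of platform capabilities). */
    static Set<String> union(List<Set<String>> perDevice) {
        Set<String> result = new LinkedHashSet<>();
        for (Set<String> device : perDevice) {
            result.addAll(device);
        }
        return result;
    }

    /** Extensions present on every device (the scope of the PLATFORM_EXTENSIONS string). */
    static Set<String> intersection(List<Set<String>> perDevice) {
        if (perDevice.isEmpty()) {
            return new LinkedHashSet<>();
        }
        Set<String> result = new LinkedHashSet<>(perDevice.get(0));
        for (Set<String> device : perDevice) {
            result.retainAll(device);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: two devices on one platform.
        Set<String> gpu = Set.of("cl_khr_fp64", "cl_khr_gl_sharing");
        Set<String> cpu = Set.of("cl_khr_fp64");
        List<Set<String>> devices = List.of(gpu, cpu);
        System.out.println("any device:  " + union(devices));
        System.out.println("all devices: " + intersection(devices));
    }
}
```

In this model, an extension flag that is true on the platform instance may still be false on a particular device instance, so device-specific code paths would check the device instance.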
-
-
Field Detail
-
clGetPlatformIDs
public final long clGetPlatformIDs
-
clGetPlatformInfo
public final long clGetPlatformInfo
-
clGetDeviceIDs
public final long clGetDeviceIDs
-
clGetDeviceInfo
public final long clGetDeviceInfo
-
clCreateContext
public final long clCreateContext
-
clCreateContextFromType
public final long clCreateContextFromType
-
clRetainContext
public final long clRetainContext
-
clReleaseContext
public final long clReleaseContext
-
clGetContextInfo
public final long clGetContextInfo
-
clCreateCommandQueue
public final long clCreateCommandQueue
-
clRetainCommandQueue
public final long clRetainCommandQueue
-
clReleaseCommandQueue
public final long clReleaseCommandQueue
-
clGetCommandQueueInfo
public final long clGetCommandQueueInfo
-
clCreateBuffer
public final long clCreateBuffer
-
clEnqueueReadBuffer
public final long clEnqueueReadBuffer
-
clEnqueueWriteBuffer
public final long clEnqueueWriteBuffer
-
clEnqueueCopyBuffer
public final long clEnqueueCopyBuffer
-
clEnqueueMapBuffer
public final long clEnqueueMapBuffer
-
clCreateImage2D
public final long clCreateImage2D
-
clCreateImage3D
public final long clCreateImage3D
-
clGetSupportedImageFormats
public final long clGetSupportedImageFormats
-
clEnqueueReadImage
public final long clEnqueueReadImage
-
clEnqueueWriteImage
public final long clEnqueueWriteImage
-
clEnqueueCopyImage
public final long clEnqueueCopyImage
-
clEnqueueCopyImageToBuffer
public final long clEnqueueCopyImageToBuffer
-
clEnqueueCopyBufferToImage
public final long clEnqueueCopyBufferToImage
-
clEnqueueMapImage
public final long clEnqueueMapImage
-
clGetImageInfo
public final long clGetImageInfo
-
clRetainMemObject
public final long clRetainMemObject
-
clReleaseMemObject
public final long clReleaseMemObject
-
clEnqueueUnmapMemObject
public final long clEnqueueUnmapMemObject
-
clGetMemObjectInfo
public final long clGetMemObjectInfo
-
clCreateSampler
public final long clCreateSampler
-
clRetainSampler
public final long clRetainSampler
-
clReleaseSampler
public final long clReleaseSampler
-
clGetSamplerInfo
public final long clGetSamplerInfo
-
clCreateProgramWithSource
public final long clCreateProgramWithSource
-
clCreateProgramWithBinary
public final long clCreateProgramWithBinary
-
clRetainProgram
public final long clRetainProgram
-
clReleaseProgram
public final long clReleaseProgram
-
clBuildProgram
public final long clBuildProgram
-
clUnloadCompiler
public final long clUnloadCompiler
-
clGetProgramInfo
public final long clGetProgramInfo
-
clGetProgramBuildInfo
public final long clGetProgramBuildInfo
-
clCreateKernel
public final long clCreateKernel
-
clCreateKernelsInProgram
public final long clCreateKernelsInProgram
-
clRetainKernel
public final long clRetainKernel
-
clReleaseKernel
public final long clReleaseKernel
-
clSetKernelArg
public final long clSetKernelArg
-
clGetKernelInfo
public final long clGetKernelInfo
-
clGetKernelWorkGroupInfo
public final long clGetKernelWorkGroupInfo
-
clEnqueueNDRangeKernel
public final long clEnqueueNDRangeKernel
-
clEnqueueTask
public final long clEnqueueTask
-
clEnqueueNativeKernel
public final long clEnqueueNativeKernel
-
clWaitForEvents
public final long clWaitForEvents
-
clGetEventInfo
public final long clGetEventInfo
-
clRetainEvent
public final long clRetainEvent
-
clReleaseEvent
public final long clReleaseEvent
-
clEnqueueMarker
public final long clEnqueueMarker
-
clEnqueueBarrier
public final long clEnqueueBarrier
-
clEnqueueWaitForEvents
public final long clEnqueueWaitForEvents
-
clGetEventProfilingInfo
public final long clGetEventProfilingInfo
-
clFlush
public final long clFlush
-
clFinish
public final long clFinish
-
clGetExtensionFunctionAddress
public final long clGetExtensionFunctionAddress
-
clCreateFromGLBuffer
public final long clCreateFromGLBuffer
-
clCreateFromGLTexture2D
public final long clCreateFromGLTexture2D
-
clCreateFromGLTexture3D
public final long clCreateFromGLTexture3D
-
clCreateFromGLRenderbuffer
public final long clCreateFromGLRenderbuffer
-
clGetGLObjectInfo
public final long clGetGLObjectInfo
-
clGetGLTextureInfo
public final long clGetGLTextureInfo
-
clEnqueueAcquireGLObjects
public final long clEnqueueAcquireGLObjects
-
clEnqueueReleaseGLObjects
public final long clEnqueueReleaseGLObjects
-
clCreateSubBuffer
public final long clCreateSubBuffer
-
clSetMemObjectDestructorCallback
public final long clSetMemObjectDestructorCallback
-
clEnqueueReadBufferRect
public final long clEnqueueReadBufferRect
-
clEnqueueWriteBufferRect
public final long clEnqueueWriteBufferRect
-
clEnqueueCopyBufferRect
public final long clEnqueueCopyBufferRect
-
clCreateUserEvent
public final long clCreateUserEvent
-
clSetUserEventStatus
public final long clSetUserEventStatus
-
clSetEventCallback
public final long clSetEventCallback
-
clGetExtensionFunctionAddressForPlatform
public final long clGetExtensionFunctionAddressForPlatform
-
clRetainDevice
public final long clRetainDevice
-
clReleaseDevice
public final long clReleaseDevice
-
clCreateSubDevices
public final long clCreateSubDevices
-
clCreateImage
public final long clCreateImage
-
clCreateProgramWithBuiltInKernels
public final long clCreateProgramWithBuiltInKernels
-
clCompileProgram
public final long clCompileProgram
-
clLinkProgram
public final long clLinkProgram
-
clUnloadPlatformCompiler
public final long clUnloadPlatformCompiler
-
clGetKernelArgInfo
public final long clGetKernelArgInfo
-
clEnqueueFillBuffer
public final long clEnqueueFillBuffer
-
clEnqueueFillImage
public final long clEnqueueFillImage
-
clEnqueueMigrateMemObjects
public final long clEnqueueMigrateMemObjects
-
clEnqueueMarkerWithWaitList
public final long clEnqueueMarkerWithWaitList
-
clEnqueueBarrierWithWaitList
public final long clEnqueueBarrierWithWaitList
-
clCreateFromGLTexture
public final long clCreateFromGLTexture
-
clCreateCommandQueueWithProperties
public final long clCreateCommandQueueWithProperties
-
clCreatePipe
public final long clCreatePipe
-
clGetPipeInfo
public final long clGetPipeInfo
-
clSVMAlloc
public final long clSVMAlloc
-
clSVMFree
public final long clSVMFree
-
clEnqueueSVMFree
public final long clEnqueueSVMFree
-
clEnqueueSVMMemcpy
public final long clEnqueueSVMMemcpy
-
clEnqueueSVMMemFill
public final long clEnqueueSVMMemFill
-
clEnqueueSVMMap
public final long clEnqueueSVMMap
-
clEnqueueSVMUnmap
public final long clEnqueueSVMUnmap
-
clSetKernelArgSVMPointer
public final long clSetKernelArgSVMPointer
-
clSetKernelExecInfo
public final long clSetKernelExecInfo
-
clCreateSamplerWithProperties
public final long clCreateSamplerWithProperties
-
clSetDefaultDeviceCommandQueue
public final long clSetDefaultDeviceCommandQueue
-
clGetDeviceAndHostTimer
public final long clGetDeviceAndHostTimer
-
clGetHostTimer
public final long clGetHostTimer
-
clCreateProgramWithIL
public final long clCreateProgramWithIL
-
clCloneKernel
public final long clCloneKernel
-
clGetKernelSubGroupInfo
public final long clGetKernelSubGroupInfo
-
clEnqueueSVMMigrateMem
public final long clEnqueueSVMMigrateMem
-
clSetProgramReleaseCallback
public final long clSetProgramReleaseCallback
-
clSetProgramSpecializationConstant
public final long clSetProgramSpecializationConstant
-
clTrackLiveObjectsAltera
public final long clTrackLiveObjectsAltera
-
clReportLiveObjectsAltera
public final long clReportLiveObjectsAltera
-
clEnqueueWaitSignalAMD
public final long clEnqueueWaitSignalAMD
-
clEnqueueWriteSignalAMD
public final long clEnqueueWriteSignalAMD
-
clEnqueueMakeBuffersResidentAMD
public final long clEnqueueMakeBuffersResidentAMD
-
clCreateCommandQueueWithPropertiesAPPLE
public final long clCreateCommandQueueWithPropertiesAPPLE
-
clLogMessagesToSystemLogAPPLE
public final long clLogMessagesToSystemLogAPPLE
-
clLogMessagesToStdoutAPPLE
public final long clLogMessagesToStdoutAPPLE
-
clLogMessagesToStderrAPPLE
public final long clLogMessagesToStderrAPPLE
-
clGetGLContextInfoAPPLE
public final long clGetGLContextInfoAPPLE
-
clReleaseDeviceEXT
public final long clReleaseDeviceEXT
-
clRetainDeviceEXT
public final long clRetainDeviceEXT
-
clCreateSubDevicesEXT
public final long clCreateSubDevicesEXT
-
clEnqueueMigrateMemObjectEXT
public final long clEnqueueMigrateMemObjectEXT
-
clCreateAcceleratorINTEL
public final long clCreateAcceleratorINTEL
-
clRetainAcceleratorINTEL
public final long clRetainAcceleratorINTEL
-
clReleaseAcceleratorINTEL
public final long clReleaseAcceleratorINTEL
-
clGetAcceleratorInfoINTEL
public final long clGetAcceleratorInfoINTEL
-
clGetKernelSubGroupInfoKHR
public final long clGetKernelSubGroupInfoKHR
-
clGetDeviceIDsFromVA_APIMediaAdapterINTEL
public final long clGetDeviceIDsFromVA_APIMediaAdapterINTEL
-
clCreateFromVA_APIMediaSurfaceINTEL
public final long clCreateFromVA_APIMediaSurfaceINTEL
-
clEnqueueAcquireVA_APIMediaSurfacesINTEL
public final long clEnqueueAcquireVA_APIMediaSurfacesINTEL
-
clEnqueueReleaseVA_APIMediaSurfacesINTEL
public final long clEnqueueReleaseVA_APIMediaSurfacesINTEL
-
clCreateEventFromEGLSyncKHR
public final long clCreateEventFromEGLSyncKHR
-
clCreateFromEGLImageKHR
public final long clCreateFromEGLImageKHR
-
clEnqueueAcquireEGLObjectsKHR
public final long clEnqueueAcquireEGLObjectsKHR
-
clEnqueueReleaseEGLObjectsKHR
public final long clEnqueueReleaseEGLObjectsKHR
-
clCreateEventFromGLsyncKHR
public final long clCreateEventFromGLsyncKHR
-
clGetGLContextInfoKHR
public final long clGetGLContextInfoKHR
-
clTerminateContextKHR
public final long clTerminateContextKHR
-
clGetDeviceImageInfoQCOM
public final long clGetDeviceImageInfoQCOM
-
OpenCL10
public final boolean OpenCL10
When true, CL10 is supported.
-
OpenCL10GL
public final boolean OpenCL10GL
When true, CL10GL is supported.
-
OpenCL11
public final boolean OpenCL11
When true, CL11 is supported.
-
OpenCL12
public final boolean OpenCL12
When true, CL12 is supported.
-
OpenCL12GL
public final boolean OpenCL12GL
When true, CL12GL is supported.
-
OpenCL20
public final boolean OpenCL20
When true, CL20 is supported.
-
OpenCL21
public final boolean OpenCL21
When true, CL21 is supported.
-
OpenCL22
public final boolean OpenCL22
When true, CL22 is supported.
-
cl_altera_compiler_mode
public final boolean cl_altera_compiler_mode
When true, ALTERACompilerMode is supported.
-
cl_altera_device_temperature
public final boolean cl_altera_device_temperature
When true, ALTERADeviceTemperature is supported.
-
cl_altera_live_object_tracking
public final boolean cl_altera_live_object_tracking
When true, ALTERALiveObjectTracking is supported.
-
cl_amd_bus_addressable_memory
public final boolean cl_amd_bus_addressable_memory
When true, AMDBusAddressableMemory is supported.
-
cl_amd_compile_options
public final boolean cl_amd_compile_options
When true, the amd_compile_options extension is supported. This extension adds the following options, which are not part of the OpenCL specification:
- -g – This is an experimental feature that lets you use the GNU project debugger, GDB, to debug kernels on x86 CPUs running Linux or cygwin/minGW under Windows. This option does not affect the default optimization of the OpenCL code.
- -O0 – Specifies to the compiler not to optimize. This is equivalent to the OpenCL standard option -cl-opt-disable.
- -f[no-]bin-source – Does [not] generate OpenCL source in the .source section. By default, the source is NOT generated.
- -f[no-]bin-llvmir – Does [not] generate LLVM IR in the .llvmir section. By default, LLVM IR IS generated.
- -f[no-]bin-amdil – Does [not] generate AMD IL in the .amdil section. By default, AMD IL is NOT generated.
- -f[no-]bin-exe – Does [not] generate the executable (ISA) in the .text section. By default, the executable IS generated.
- -f[no-]bin-hsail – Does [not] generate HSAIL/BRIG in the binary. By default, HSAIL/BRIG is NOT generated.
To avoid source changes, two environment variables can be used to change CL options at runtime:
- AMD_OCL_BUILD_OPTIONS – Overrides the CL options specified in BuildProgram.
- AMD_OCL_BUILD_OPTIONS_APPEND – Appends options to the options specified in BuildProgram.
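The effect of the two environment variables on the option string can be modelled on the host as follows. This is only an illustrative sketch: the class and method names are hypothetical, the single-space separator and the treatment of unset or empty values are assumptions, and the interaction when both variables are set is deliberately not modelled because the text above does not describe it.

```java
final class AmdBuildOptionsSketch {
    /** Models AMD_OCL_BUILD_OPTIONS: when set, it replaces the options passed to BuildProgram. */
    static String applyOverride(String programOptions, String overrideValue) {
        // Assumption: an unset variable (null) leaves the program's options untouched.
        return overrideValue != null ? overrideValue : programOptions;
    }

    /** Models AMD_OCL_BUILD_OPTIONS_APPEND: when set, it is added after the options passed to BuildProgram. */
    static String applyAppend(String programOptions, String appendValue) {
        if (appendValue == null || appendValue.isEmpty()) {
            return programOptions;
        }
        // Assumption: options are separated by a single space.
        return programOptions.isEmpty() ? appendValue : programOptions + " " + appendValue;
    }

    public static void main(String[] args) {
        String fromProgram = "-cl-mad-enable";
        System.out.println(applyOverride(fromProgram, "-O0 -g"));
        System.out.println(applyAppend(fromProgram, "-fbin-source"));
    }
}
```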
-
cl_amd_device_attribute_query
public final boolean cl_amd_device_attribute_query
When true, AMDDeviceAttributeQuery is supported.
-
cl_amd_device_board_name
public final boolean cl_amd_device_board_name
When true, AMDDeviceBoardName is supported.
-
cl_amd_device_persistent_memory
public final boolean cl_amd_device_persistent_memory
When true, AMDDevicePersistentMemory is supported.
-
cl_amd_device_profiling_timer_offset
public final boolean cl_amd_device_profiling_timer_offset
When true, AMDDeviceProfilingTimerOffset is supported.
-
cl_amd_device_topology
public final boolean cl_amd_device_topology
When true, AMDDeviceTopology is supported.
-
cl_amd_event_callback
public final boolean cl_amd_event_callback
When true, the amd_event_callback extension is supported. This extension provides the ability to register event callbacks for states other than COMPLETE. The full set of event states is allowed: QUEUED, SUBMITTED, and RUNNING.
-
cl_amd_fp64
public final boolean cl_amd_fp64
When true, the amd_fp64 extension is supported. This extension provides a subset of the functionality provided by the cl_khr_fp64 extension. When enabled, the compiler recognizes the double scalar and vector types, compiles expressions involving those types, and accepts calls to all built-in functions enabled by the cl_khr_fp64 extension. However, this extension does not guarantee that all cl_khr_fp64 built-in functions are implemented, and it does not guarantee that the built-in functions that have been implemented would be considered conformant to the cl_khr_fp64 extension.
-
cl_amd_media_ops
public final boolean cl_amd_media_ops
When true, the amd_media_ops extension is supported.
When enabled, the directive adds the following built-in functions to the OpenCL language.
Note: typen denotes an OpenCL scalar type {n = 1} and vector types {n = 4, 8, 16}. Where a description shows only component s0, a similar operation is applied to the other components of the vectors.
Built-in Function: uint amd_pack(float4 src)
Description:
    dst = ((((uint)src.s0) & 0xff))
        + ((((uint)src.s1) & 0xff) << 8)
        + ((((uint)src.s2) & 0xff) << 16)
        + ((((uint)src.s3) & 0xff) << 24)
Built-in Function: floatn amd_unpack3(uintn src)
Description:
    dst.s0 = (float)((src.s0 >> 24) & 0xff)
Built-in Function: floatn amd_unpack2(uintn src)
Description:
    dst.s0 = (float)((src.s0 >> 16) & 0xff)
Built-in Function: floatn amd_unpack1(uintn src)
Description:
    dst.s0 = (float)((src.s0 >> 8) & 0xff)
Built-in Function: floatn amd_unpack0(uintn src)
Description:
    dst.s0 = (float)(src.s0 & 0xff)
Built-in Function: uintn amd_bitalign(uintn src0, uintn src1, uintn src2)
Description:
    dst.s0 = (uint) (((((long)src0.s0) << 32) | (long)src1.s0) >> (src2.s0 & 31))
Built-in Function: uintn amd_bytealign(uintn src0, uintn src1, uintn src2)
Description:
    dst.s0 = (uint) (((((long)src0.s0) << 32) | (long)src1.s0) >> ((src2.s0 & 3)*8))
Built-in Function: uintn amd_lerp(uintn src0, uintn src1, uintn src2)
Description:
    dst.s0 = (((((src0.s0 >> 0) & 0xff) + ((src1.s0 >> 0) & 0xff) + ((src2.s0 >> 0) & 1)) >> 1) << 0)
           + (((((src0.s0 >> 8) & 0xff) + ((src1.s0 >> 8) & 0xff) + ((src2.s0 >> 8) & 1)) >> 1) << 8)
           + (((((src0.s0 >> 16) & 0xff) + ((src1.s0 >> 16) & 0xff) + ((src2.s0 >> 16) & 1)) >> 1) << 16)
           + (((((src0.s0 >> 24) & 0xff) + ((src1.s0 >> 24) & 0xff) + ((src2.s0 >> 24) & 1)) >> 1) << 24);
Built-in Function: uintn amd_sad(uintn src0, uintn src1, uintn src2)
Description:
    dst.s0 = src2.s0
           + abs(((src0.s0 >> 0) & 0xff) - ((src1.s0 >> 0) & 0xff))
           + abs(((src0.s0 >> 8) & 0xff) - ((src1.s0 >> 8) & 0xff))
           + abs(((src0.s0 >> 16) & 0xff) - ((src1.s0 >> 16) & 0xff))
           + abs(((src0.s0 >> 24) & 0xff) - ((src1.s0 >> 24) & 0xff));
Built-in Function: uintn amd_sadhi(uintn src0, uintn src1, uintn src2)
Description:
    dst.s0 = src2.s0
           + (abs(((src0.s0 >> 0) & 0xff) - ((src1.s0 >> 0) & 0xff)) << 16)
           + (abs(((src0.s0 >> 8) & 0xff) - ((src1.s0 >> 8) & 0xff)) << 16)
           + (abs(((src0.s0 >> 16) & 0xff) - ((src1.s0 >> 16) & 0xff)) << 16)
           + (abs(((src0.s0 >> 24) & 0xff) - ((src1.s0 >> 24) & 0xff)) << 16);
Built-in Function: uint amd_sad4(uint4 src0, uint4 src1, uint src2)
Description:
    dst = src2
        + abs(((src0.s0 >> 0) & 0xff) - ((src1.s0 >> 0) & 0xff))
        + abs(((src0.s0 >> 8) & 0xff) - ((src1.s0 >> 8) & 0xff))
        + abs(((src0.s0 >> 16) & 0xff) - ((src1.s0 >> 16) & 0xff))
        + abs(((src0.s0 >> 24) & 0xff) - ((src1.s0 >> 24) & 0xff))
        + abs(((src0.s1 >> 0) & 0xff) - ((src1.s1 >> 0) & 0xff))
        + abs(((src0.s1 >> 8) & 0xff) - ((src1.s1 >> 8) & 0xff))
        + abs(((src0.s1 >> 16) & 0xff) - ((src1.s1 >> 16) & 0xff))
        + abs(((src0.s1 >> 24) & 0xff) - ((src1.s1 >> 24) & 0xff))
        + abs(((src0.s2 >> 0) & 0xff) - ((src1.s2 >> 0) & 0xff))
        + abs(((src0.s2 >> 8) & 0xff) - ((src1.s2 >> 8) & 0xff))
        + abs(((src0.s2 >> 16) & 0xff) - ((src1.s2 >> 16) & 0xff))
        + abs(((src0.s2 >> 24) & 0xff) - ((src1.s2 >> 24) & 0xff))
        + abs(((src0.s3 >> 0) & 0xff) - ((src1.s3 >> 0) & 0xff))
        + abs(((src0.s3 >> 8) & 0xff) - ((src1.s3 >> 8) & 0xff))
        + abs(((src0.s3 >> 16) & 0xff) - ((src1.s3 >> 16) & 0xff))
        + abs(((src0.s3 >> 24) & 0xff) - ((src1.s3 >> 24) & 0xff));
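To make the bit-level descriptions of amd_media_ops easier to check, here is a host-side Java sketch of the scalar (n = 1) forms of a few of these functions. It is a reference model written from the formulas in this entry, not the extension itself: the class and method names are hypothetical, Java int values stand in for OpenCL uint bit patterns, and amdPack assumes inputs in the range [0, 255].

```java
final class AmdMediaOpsReference {
    /** amd_pack: packs the low byte of four converted floats into one 32-bit word. */
    static int amdPack(float s0, float s1, float s2, float s3) {
        // Assumption: each input lies in [0, 255]; Java and OpenCL casts differ outside that range.
        return (((int) s0) & 0xff)
             + ((((int) s1) & 0xff) << 8)
             + ((((int) s2) & 0xff) << 16)
             + ((((int) s3) & 0xff) << 24);
    }

    /** amd_unpack0..amd_unpack3: extracts byte 'index' (0 = lowest) and converts it to float. */
    static float amdUnpack(int src, int index) {
        return (float) ((src >>> (8 * index)) & 0xff);
    }

    /** amd_bitalign: shifts the 64-bit value src0:src1 right by (src2 & 31) bits, keeps the low word. */
    static int amdBitalign(int src0, int src1, int src2) {
        long wide = (((long) src0) << 32) | (src1 & 0xFFFFFFFFL);
        return (int) (wide >>> (src2 & 31));
    }

    /** amd_bytealign: like amd_bitalign, but the shift is (src2 & 3) whole bytes. */
    static int amdBytealign(int src0, int src1, int src2) {
        long wide = (((long) src0) << 32) | (src1 & 0xFFFFFFFFL);
        return (int) (wide >>> ((src2 & 3) * 8));
    }

    /** amd_lerp: per byte, adds src0, src1 and the low bit of the src2 byte, then halves. */
    static int amdLerp(int src0, int src1, int src2) {
        int dst = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            int a = (src0 >>> shift) & 0xff;
            int b = (src1 >>> shift) & 0xff;
            int round = (src2 >>> shift) & 1;
            dst += ((a + b + round) >> 1) << shift;
        }
        return dst;
    }

    /** amd_sad: src2 plus the sum of absolute differences of the four byte pairs. */
    static int amdSad(int src0, int src1, int src2) {
        int dst = src2;
        for (int shift = 0; shift < 32; shift += 8) {
            int a = (src0 >>> shift) & 0xff;
            int b = (src1 >>> shift) & 0xff;
            dst += Math.abs(a - b);
        }
        return dst;
    }

    public static void main(String[] args) {
        System.out.printf("amd_pack      = 0x%08X%n", amdPack(1f, 2f, 3f, 4f));
        System.out.printf("amd_bitalign  = 0x%08X%n", amdBitalign(0x12345678, 0x9ABCDEF0, 8));
        System.out.printf("amd_bytealign = 0x%08X%n", amdBytealign(0x12345678, 0x9ABCDEF0, 1));
        System.out.printf("amd_sad       = %d%n", amdSad(0x01020304, 0x04030201, 10));
    }
}
```

The vector forms would apply the same per-component computation to each element.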
-
cl_amd_media_ops2
public final boolean cl_amd_media_ops2
When true, the amd_media_ops2 extension is supported.
When enabled, the directive adds the following built-in functions to the OpenCL language.
Note: typen denotes an OpenCL scalar type {n = 1} and vector types {n = 2, 4, 8, 16}. Where a description shows only component s0, a similar operation is applied to the other components of the vectors.
Built-in Function: uintn amd_msad(uintn src0, uintn src1, uintn src2)
Description:
    uchar4 src0u8 = as_uchar4(src0.s0);
    uchar4 src1u8 = as_uchar4(src1.s0);
    dst.s0 = src2.s0
           + ((src1u8.s0 == 0) ? 0 : abs(src0u8.s0 - src1u8.s0))
           + ((src1u8.s1 == 0) ? 0 : abs(src0u8.s1 - src1u8.s1))
           + ((src1u8.s2 == 0) ? 0 : abs(src0u8.s2 - src1u8.s2))
           + ((src1u8.s3 == 0) ? 0 : abs(src0u8.s3 - src1u8.s3));
Built-in Function: ulongn amd_qsad(ulongn src0, uintn src1, ulongn src2)
Description:
    uchar8 src0u8 = as_uchar8(src0.s0);
    ushort4 src2u16 = as_ushort4(src2.s0);
    ushort4 dstu16;
    dstu16.s0 = amd_sad(as_uint(src0u8.s0123), src1.s0, src2u16.s0);
    dstu16.s1 = amd_sad(as_uint(src0u8.s1234), src1.s0, src2u16.s1);
    dstu16.s2 = amd_sad(as_uint(src0u8.s2345), src1.s0, src2u16.s2);
    dstu16.s3 = amd_sad(as_uint(src0u8.s3456), src1.s0, src2u16.s3);
    dst.s0 = as_uint2(dstu16);
Built-in Function: ulongn amd_mqsad(ulongn src0, uintn src1, ulongn src2)
Description:
    uchar8 src0u8 = as_uchar8(src0.s0);
    ushort4 src2u16 = as_ushort4(src2.s0);
    ushort4 dstu16;
    dstu16.s0 = amd_msad(as_uint(src0u8.s0123), src1.s0, src2u16.s0);
    dstu16.s1 = amd_msad(as_uint(src0u8.s1234), src1.s0, src2u16.s1);
    dstu16.s2 = amd_msad(as_uint(src0u8.s2345), src1.s0, src2u16.s2);
    dstu16.s3 = amd_msad(as_uint(src0u8.s3456), src1.s0, src2u16.s3);
    dst.s0 = as_uint2(dstu16);
Built-in Function: uintn amd_sadw(uintn src0, uintn src1, uintn src2)
Description:
    ushort2 src0u16 = as_ushort2(src0.s0);
    ushort2 src1u16 = as_ushort2(src1.s0);
    dst.s0 = src2.s0 + abs(src0u16.s0 - src1u16.s0) + abs(src0u16.s1 - src1u16.s1);
Built-in Function: uintn amd_sadd(uintn src0, uintn src1, uintn src2)
Description:
    dst.s0 = src2.s0 + abs(src0.s0 - src1.s0);
Built-in Function: uintn amd_bfm(uintn src0, uintn src1)
Description:
    dst.s0 = ((1 << (src0.s0 & 0x1f)) - 1) << (src1.s0 & 0x1f);
Built-in Function: uintn amd_bfe(uintn src0, uintn src1, uintn src2)
Description:
    NOTE: operator >> below represents a logical right shift.
    offset = src1.s0 & 31;
    width = src2.s0 & 31;
    if (width == 0) dst.s0 = 0;
    else if ((offset + width) < 32) dst.s0 = (src0.s0 << (32 - offset - width)) >> (32 - width);
    else dst.s0 = src0.s0 >> offset;
Built-in Function: intn amd_bfe(intn src0, uintn src1, uintn src2)
Description:
    NOTE: operator >> below represents an arithmetic right shift.
    offset = src1.s0 & 31;
    width = src2.s0 & 31;
    if (width == 0) dst.s0 = 0;
    else if ((offset + width) < 32) dst.s0 = (src0.s0 << (32 - offset - width)) >> (32 - width);
    else dst.s0 = src0.s0 >> offset;
Built-in Function:
    intn amd_median3(intn src0, intn src1, intn src2)
    uintn amd_median3(uintn src0, uintn src1, uintn src2)
    floatn amd_median3(floatn src0, floatn src1, floatn src2)
Description:
    Returns the median of src0, src1, and src2.
Built-in Function:
    intn amd_min3(intn src0, intn src1, intn src2)
    uintn amd_min3(uintn src0, uintn src1, uintn src2)
    floatn amd_min3(floatn src0, floatn src1, floatn src2)
Description:
    Returns the minimum of src0, src1, and src2.
Built-in Function:
    intn amd_max3(intn src0, intn src1, intn src2)
    uintn amd_max3(uintn src0, uintn src1, uintn src2)
    floatn amd_max3(floatn src0, floatn src1, floatn src2)
Description:
    Returns the maximum of src0, src1, and src2.
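The bit-field and masked-SAD functions of amd_media_ops2 can be modelled the same way on the host. The Java sketch below covers the scalar forms of amd_bfm, both amd_bfe variants, the int form of amd_median3, and amd_msad. Names are hypothetical, Java int values stand in for OpenCL 32-bit patterns, and the only difference between the two amd_bfe variants is the logical (>>>) versus arithmetic (>>) right shift, as the notes in this entry state.

```java
final class AmdMediaOps2Reference {
    /** amd_bfm: builds a mask of (src0 & 31) one-bits, shifted left by (src1 & 31). */
    static int amdBfm(int src0, int src1) {
        return ((1 << (src0 & 0x1f)) - 1) << (src1 & 0x1f);
    }

    /** Unsigned amd_bfe: extracts 'width' bits starting at 'offset', zero-extended. */
    static int amdBfeUnsigned(int src0, int src1, int src2) {
        int offset = src1 & 31;
        int width = src2 & 31;
        if (width == 0) {
            return 0;
        }
        if (offset + width < 32) {
            return (src0 << (32 - offset - width)) >>> (32 - width); // logical shift
        }
        return src0 >>> offset;
    }

    /** Signed amd_bfe: same field, but sign-extended through an arithmetic shift. */
    static int amdBfeSigned(int src0, int src1, int src2) {
        int offset = src1 & 31;
        int width = src2 & 31;
        if (width == 0) {
            return 0;
        }
        if (offset + width < 32) {
            return (src0 << (32 - offset - width)) >> (32 - width); // arithmetic shift
        }
        return src0 >> offset;
    }

    /** amd_median3 (int form): the middle value of the three inputs. */
    static int amdMedian3(int a, int b, int c) {
        return Math.max(Math.min(a, b), Math.min(Math.max(a, b), c));
    }

    /** amd_msad: like amd_sad, but byte pairs whose src1 byte is zero are skipped. */
    static int amdMsad(int src0, int src1, int src2) {
        int dst = src2;
        for (int shift = 0; shift < 32; shift += 8) {
            int a = (src0 >>> shift) & 0xff;
            int b = (src1 >>> shift) & 0xff;
            if (b != 0) {
                dst += Math.abs(a - b);
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        System.out.printf("amd_bfm            = 0x%08X%n", amdBfm(4, 8));
        System.out.printf("amd_bfe (unsigned) = %d%n", amdBfeUnsigned(0x00000F00, 8, 4));
        System.out.printf("amd_bfe (signed)   = %d%n", amdBfeSigned(0x00000F00, 8, 4));
        System.out.printf("amd_median3        = %d%n", amdMedian3(3, 1, 2));
    }
}
```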
-
cl_amd_offline_devices
public final boolean cl_amd_offline_devices
When true, AMDOfflineDevices is supported.
-
cl_amd_popcnt
public final boolean cl_amd_popcnt
When true, the amd_popcnt extension is supported. This extension introduces a “population count” function called popcnt. This extension was taken into core OpenCL 1.2, and the function was renamed popcount. The core 1.2 popcount function is identical to the AMD extension popcnt function.
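A population count is the number of set bits in a value. As a host-side illustration only (the class and method names are hypothetical), the Java sketch below counts bits with a simple loop and compares the result with Java's standard Integer.bitCount, which computes the same quantity for a 32-bit value.

```java
final class PopcountSketch {
    /** Counts set bits by repeatedly clearing the lowest set bit. */
    static int popcountLoop(int x) {
        int count = 0;
        while (x != 0) {
            x &= x - 1; // clears the lowest set bit
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        int value = 0xF0F0;
        System.out.println("loop:     " + popcountLoop(value));
        System.out.println("bitCount: " + Integer.bitCount(value));
    }
}
```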
-
cl_amd_predefined_macros
public final boolean cl_amd_predefined_macros
When true, the amd_predefined_macros extension is supported. The following macros are predefined when compiling OpenCL™ C kernels. These macros are defined automatically based on the device for which the code is being compiled.
GPU devices
- __Barts__
- __BeaverCreek__
- __Bheem__
- __Bonaire__
- __Caicos__
- __Capeverde__
- __Carrizo__
- __Cayman__
- __Cedar__
- __Cypress__
- __Devastator__
- __Hainan__
- __Iceland__
- __Juniper__
- __Kalindi__
- __Kauai__
- __Lombok__
- __Loveland__
- __Mullins__
- __Oland__
- __Pitcairn__
- __RV710__
- __RV730__
- __RV740__
- __RV770__
- __RV790__
- __Redwood__
- __Scrapper__
- __Spectre__
- __Spooky__
- __Tahiti__
- __Tonga__
- __Turks__
- __WinterPark__
- __GPU__
CPU devices
- __CPU__
- __X86__
- __X86_64__
Note that __GPU__ or __CPU__ are predefined whenever a GPU or CPU device is the compilation target.
-
cl_amd_printf
public final boolean cl_amd_printf
When true, the amd_printf extension is supported. This extension adds the built-in function
printf(__constant char * restrict format, ...);
This function writes output to the stdout stream associated with the host application. The format string is a null-terminated character sequence composed of zero or more directives:
- ordinary characters (i.e. not %), which are copied directly to the output stream unchanged, and
- conversion specifications, each of which can result in fetching zero or more arguments, converting them, and then writing the final result to the output stream.
The format string must be resolvable at compile time; thus, it cannot be dynamically created by the executing program. (Note that the use of variadic arguments in the built-in printf does not imply its use in other builtins; more importantly, it is not valid to use printf in user-defined functions or kernels.)
The OpenCL C printf closely matches the definition found as part of the C99 standard. Note that conversions introduced in the format string with % are supported with the following guidelines:
- A 32-bit floating point argument is not converted to a 64-bit double, unless the extension cl_khr_fp64 is supported and enabled. This includes the double variants if cl_khr_fp64 is supported and defined in the corresponding compilation unit.
- 64-bit integer types can be printed using %ld / %lx / %lu.
- %lld / %llx / %llu are not supported and reserved for 128-bit integer types (long long).
- All OpenCL vector types can be explicitly passed and printed using the modifier vn, where n can be 2, 3, 4, 8, or 16. This modifier appears before the original conversion specifier for the vector’s component type (for example, to print a float4 %v4f). Since vn is a conversion specifier, it is valid to apply optional flags, such as field width and precision, just as it is when printing the component types. Since a vector is an aggregate type, the comma separator is used between the components: 0:1, … , n-2:n-1.
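The vn rule in the last bullet can be pictured on the host: each component is formatted with the scalar conversion and the results are joined with commas. The Java sketch below is an emulation for illustration only. The class and helper names are hypothetical, it uses Java's String.format rather than the OpenCL printf, and it passes the scalar conversion (for example "%f") instead of parsing a "%v4f" specifier.

```java
import java.util.Locale;
import java.util.StringJoiner;

final class VectorPrintfSketch {
    /** Formats every component with the given scalar conversion and joins them with commas. */
    static String formatVector(String componentFormat, float... components) {
        StringJoiner joined = new StringJoiner(",");
        for (float component : components) {
            // Locale.ROOT keeps '.' as the decimal separator regardless of the host locale.
            joined.add(String.format(Locale.ROOT, componentFormat, component));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Emulates printing a float4 with %v4f.
        System.out.println(formatVector("%f", 1f, 2f, 3f, 4f));
        // Flags such as precision apply to each component, as with %.2v2f... style modifiers.
        System.out.println(formatVector("%.2f", 1.5f, 2.25f));
    }
}
```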
-
cl_amd_vec3
public final boolean cl_amd_vec3
When true, the amd_vec3 extension is supported. This extension adds support for vectors with three elements: float3, short3, char3, etc. This data type was added to OpenCL 1.1 as a core feature.
-
cl_APPLE_biased_fixed_point_image_formats
public final boolean cl_APPLE_biased_fixed_point_image_formats
When true, APPLEBiasedFixedPointImageFormats is supported.
-
cl_APPLE_command_queue_priority
public final boolean cl_APPLE_command_queue_priority
When true, APPLECommandQueuePriority is supported.
-
cl_APPLE_command_queue_select_compute_units
public final boolean cl_APPLE_command_queue_select_compute_units
When true, APPLECommandQueueSelectComputeUnits is supported.
-
cl_APPLE_ContextLoggingFunctions
public final boolean cl_APPLE_ContextLoggingFunctions
When true, APPLEContextLoggingFunctions is supported.
-
cl_APPLE_fixed_alpha_channel_orders
public final boolean cl_APPLE_fixed_alpha_channel_orders
When true, APPLEFixedAlphaChannelOrders is supported.
-
cl_APPLE_fp64_basic_ops
public final boolean cl_APPLE_fp64_basic_ops
When true, APPLE_fp64_basic_ops is supported.
-
cl_APPLE_gl_sharing
public final boolean cl_APPLE_gl_sharing
When true, APPLEGLSharing is supported.
-
cl_APPLE_query_kernel_names
public final boolean cl_APPLE_query_kernel_names
When true, APPLEQueryKernelNames is supported.
-
cl_arm_core_id
public final boolean cl_arm_core_id
When true, the arm_core_id extension is supported. This extension provides a built-in function (uint arm_get_core_id( void )) which returns a unique ID for the compute unit that a work-group is running on. This value is uniform for a work-group.
This value can be used for a core-specific cache or atomic pool where the storage is required to be in global memory and persistent (but not ordered) between work-groups. This does not provide any additional ordering on top of the existing guarantees between work-groups, nor does it provide any guarantee of concurrent execution.
The IDs for the compute units may not be consecutive and applications must make sure they allocate enough memory to accommodate all the compute units present on the device. A device info query allows the application to know the IDs associated with the compute units on a given device.
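Because the IDs may have gaps, a per-core table indexed by core ID has to be sized from the largest ID rather than from the number of compute units. The Java sketch below illustrates that sizing rule. The class name, the method name, and the sample ID list are hypothetical; in a real application the IDs would come from the device info query mentioned above.

```java
final class CoreSlotSketch {
    /** Returns the number of slots needed so that every core ID is a valid index. */
    static int slotsFor(int[] coreIds) {
        int maxId = -1;
        for (int id : coreIds) {
            maxId = Math.max(maxId, id);
        }
        return maxId + 1; // 0 when there are no IDs
    }

    public static void main(String[] args) {
        // Hypothetical device with four compute units whose IDs are not consecutive.
        int[] coreIds = {0, 1, 4, 5};
        System.out.println("compute units: " + coreIds.length);
        System.out.println("slots needed:  " + slotsFor(coreIds));
    }
}
```

Sizing the table by the compute-unit count (4 here) would leave IDs 4 and 5 out of bounds, which is the situation the paragraph above warns about.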
-
cl_arm_printf
public final boolean cl_arm_printf
When true, ARMPrintf is supported.
-
cl_ext_atomic_counters_32
public final boolean cl_ext_atomic_counters_32
When true, EXTAtomicCounters32 is supported.
-
cl_ext_atomic_counters_64
public final boolean cl_ext_atomic_counters_64
When true, EXTAtomicCounters64 is supported.
-
cl_ext_device_fission
public final boolean cl_ext_device_fission
When true, EXTDeviceFission is supported.
-
cl_ext_migrate_memobject
public final boolean cl_ext_migrate_memobject
When true, EXTMigrateMemobject is supported.
-
cl_intel_accelerator
public final boolean cl_intel_accelerator
When true, INTELAccelerator is supported.
-
cl_intel_advanced_motion_estimation
public final boolean cl_intel_advanced_motion_estimation
When true, INTELAdvancedMotionEstimation is supported.
-
cl_intel_device_partition_by_names
public final boolean cl_intel_device_partition_by_names
When true, INTELDevicePartitionByNames is supported.
-
cl_intel_device_side_avc_motion_estimation
public final boolean cl_intel_device_side_avc_motion_estimation
When true, INTELDeviceSideAVCMotionEstimation is supported.
-
cl_intel_driver_diagnostics
public final boolean cl_intel_driver_diagnostics
When true, INTELDriverDiagnostics is supported.
-
cl_intel_egl_image_yuv
public final boolean cl_intel_egl_image_yuv
When true, INTELEGLImageYUV is supported.
-
cl_intel_media_block_io
public final boolean cl_intel_media_block_io
This extension augments the block read/write functionality available in the Intel vendor extensions intel_subgroups and intel_media_block_io by the specification of additional built-in functions to facilitate the reading and writing of flexible 2D regions from images. This API allows for the explicit specification of the width and height of the image regions.
While not required, this extension is most useful when the subgroup size is known at compile-time. The primary use case for this extension is to support the reading of the edge texels (or image elements) of neighboring macro-blocks as described in the Intel vendor extension intel_device_side_avc_motion_estimation. When using the built-in functions from cl_intel_device_side_avc_motion_estimation, the subgroup size is implicitly fixed to 16. In other use cases the subgroup size may be fixed using the intel_required_subgroup_size extension, if needed.
-
cl_intel_motion_estimation
public final boolean cl_intel_motion_estimation
When true, INTELMotionEstimation is supported.
-
cl_intel_packed_yuv
public final boolean cl_intel_packed_yuv
When true, INTELPackedYUV is supported.
-
cl_intel_planar_yuv
public final boolean cl_intel_planar_yuv
When true, INTELPlanarYUV is supported.
-
cl_intel_printf
public final boolean cl_intel_printf
When true, intel_printf is supported.
-
cl_intel_required_subgroup_size
public final boolean cl_intel_required_subgroup_size
When true, INTELRequiredSubgroupSize is supported.
-
cl_intel_simultaneous_sharing
public final boolean cl_intel_simultaneous_sharing
When true, INTELSimultaneousSharing is supported.
-
cl_intel_subgroups
public final boolean cl_intel_subgroups
When true, INTELSubgroups is supported.
-
cl_intel_subgroups_short
public final boolean cl_intel_subgroups_short
The goal of this extension is to allow programmers to improve the performance of applications operating on 16-bit data types by extending the subgroup functions described in the intel_subgroups extension to support 16-bit integer data types (shorts and ushorts). Specifically, the extension:
- Extends the subgroup broadcast function to allow 16-bit integer values to be broadcast from one work item to all other work items in the subgroup.
- Extends the subgroup scan and reduction functions to operate on 16-bit integer data types.
- Extends the Intel subgroup shuffle functions to allow arbitrarily exchanging 16-bit integer values among work items in the subgroup.
- Extends the Intel subgroup block read and write functions to allow reading and writing 16-bit integer data from images and buffers.
Requires OpenCL 1.2 and intel_subgroups.
-
cl_intel_thread_local_exec
public final boolean cl_intel_thread_local_exec
When true,INTELThreadLocalExec
is supported.
-
cl_intel_va_api_media_sharing
public final boolean cl_intel_va_api_media_sharing
When true, INTELVAAPIMediaSharing is supported.
-
cl_khr_3d_image_writes
public final boolean cl_khr_3d_image_writes
When true, the khr_3d_image_writes extension is supported. This extension adds support for kernel writes to 3D images.
-
cl_khr_byte_addressable_store
public final boolean cl_khr_byte_addressable_store
When true, the khr_byte_addressable_store extension is supported. This extension removes the restriction that a kernel program may not write through a pointer (or to array elements) whose type is narrower than 32 bits.
-
cl_khr_depth_images
public final boolean cl_khr_depth_images
When true, KHRDepthImages is supported.
-
cl_khr_device_enqueue_local_arg_types
public final boolean cl_khr_device_enqueue_local_arg_types
When true, the khr_device_enqueue_local_arg_types extension is supported. This extension allows arguments to blocks passed to enqueue_kernel functions to be declared as a pointer to any type (built-in or user-defined) in local memory instead of just local void *.
-
cl_khr_egl_event
public final boolean cl_khr_egl_event
When true, KHREGLEvent is supported.
-
cl_khr_egl_image
public final boolean cl_khr_egl_image
When true, KHREGLImage is supported.
-
cl_khr_fp16
public final boolean cl_khr_fp16
When true, KHRFP16 is supported.
-
cl_khr_fp64
public final boolean cl_khr_fp64
When true, KHRFP64 is supported.
-
cl_khr_gl_depth_images
public final boolean cl_khr_gl_depth_images
When true, KHRGLDepthImages is supported.
-
cl_khr_gl_event
public final boolean cl_khr_gl_event
When true, KHRGLEvent is supported.
-
cl_khr_gl_msaa_sharing
public final boolean cl_khr_gl_msaa_sharing
When true, KHRGLMSAASharing is supported.
-
cl_khr_gl_sharing
public final boolean cl_khr_gl_sharing
When true, KHRGLSharing is supported.
-
cl_khr_global_int32_base_atomics
public final boolean cl_khr_global_int32_base_atomics
When true, the khr_global_int32_base_atomics extension is supported. This extension adds basic atomic operations on 32-bit integers in global memory.
-
cl_khr_global_int32_extended_atomics
public final boolean cl_khr_global_int32_extended_atomics
When true, the khr_global_int32_extended_atomics extension is supported. This extension adds extended atomic operations on 32-bit integers in global memory.
-
cl_khr_icd
public final boolean cl_khr_icd
When true, KHRICD is supported.
-
cl_khr_image2d_from_buffer
public final boolean cl_khr_image2d_from_buffer
When true, KHRImage2DFromBuffer is supported.
-
cl_khr_initialize_memory
public final boolean cl_khr_initialize_memory
When true, KHRInitializeMemory is supported.
-
cl_khr_int64_base_atomics
public final boolean cl_khr_int64_base_atomics
When true, the khr_int64_base_atomics extension is supported. This extension adds basic atomic operations on 64-bit integers in both global and local memory.
-
cl_khr_int64_extended_atomics
public final boolean cl_khr_int64_extended_atomics
When true, the khr_int64_extended_atomics extension is supported. This extension adds extended atomic operations on 64-bit integers in both global and local memory.
-
cl_khr_local_int32_base_atomics
public final boolean cl_khr_local_int32_base_atomics
When true, the khr_local_int32_base_atomics extension is supported. This extension adds basic atomic operations on 32-bit integers in local memory.
-
cl_khr_local_int32_extended_atomics
public final boolean cl_khr_local_int32_extended_atomics
When true, the khr_local_int32_extended_atomics extension is supported. This extension adds extended atomic operations on 32-bit integers in local memory.
-
cl_khr_mipmap_image
public final boolean cl_khr_mipmap_image
When true, KHRMipmapImage is supported.
-
cl_khr_mipmap_image_writes
public final boolean cl_khr_mipmap_image_writes
When true, the khr_mipmap_image_writes extension is supported. This extension adds built-in functions that can be used to write a mip-mapped image in an OpenCL C program.
-
cl_khr_priority_hints
public final boolean cl_khr_priority_hints
When true, KHRPriorityHints is supported.
-
cl_khr_select_fprounding_mode
public final boolean cl_khr_select_fprounding_mode
When true, the khr_select_fprounding_mode extension is supported. This extension adds support for specifying the rounding mode for an instruction or group of instructions in the program source.
The appropriate rounding mode can be specified using #pragma OPENCL SELECT_ROUNDING_MODE rounding-mode in the program source.
The #pragma OPENCL SELECT_ROUNDING_MODE sets the rounding mode for all instructions that follow the pragma in the program source and that operate on floating-point types (scalar or vector types) or produce floating-point values, until the next #pragma OPENCL SELECT_ROUNDING_MODE is encountered. Note that the rounding mode specified for a block of code is known at compile time. Except where otherwise documented, the callee functions do not inherit the rounding mode of the caller function.
If this extension is enabled, the __ROUNDING_MODE__ preprocessor symbol shall be defined to be one of the following according to the current rounding mode:
#define __ROUNDING_MODE__ rte
#define __ROUNDING_MODE__ rtz
#define __ROUNDING_MODE__ rtp
#define __ROUNDING_MODE__ rtn
The default rounding mode is round to nearest even. The built-in math functions, the common functions, and the geometric functions are implemented with the round to nearest even rounding mode.
Various built-in conversions and the vstore_half and vstorea_halfn built-in functions that do not specify a rounding mode inherit the current rounding mode. Conversions from floating-point to integer type always use rtz mode, except where the user specifically asks for another rounding mode.
Notes
The above four rounding modes are defined by IEEE 754. Floating-point calculations may be carried out internally with extra precision and then rounded to fit into the destination type. Round to nearest even is currently the only rounding mode required by the OpenCL specification and is therefore the default rounding mode. In addition, only static selection of rounding mode is supported. Dynamically reconfiguring the rounding modes as specified by the IEEE 754 spec is not a requirement.
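To make the difference between the four IEEE 754 rounding directions (rte, rtz, rtp, rtn) concrete, the following is a minimal host-side sketch in Java. It is an analogy only: it rounds decimal values to integers with java.math.RoundingMode rather than rounding binary floating-point results on a device. The class and method names (RoundingModeSketch, toJavaMode, roundToInteger) are hypothetical and are not part of LWJGL or OpenCL.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public final class RoundingModeSketch {
    private RoundingModeSketch() {}

    /** Maps an OpenCL rounding-mode suffix to the closest java.math.RoundingMode (analogy only). */
    public static RoundingMode toJavaMode(String suffix) {
        switch (suffix) {
            case "rte": return RoundingMode.HALF_EVEN; // round to nearest, ties to even
            case "rtz": return RoundingMode.DOWN;      // round toward zero
            case "rtp": return RoundingMode.CEILING;   // round toward positive infinity
            case "rtn": return RoundingMode.FLOOR;     // round toward negative infinity
            default: throw new IllegalArgumentException("unknown rounding mode: " + suffix);
        }
    }

    /** Rounds a decimal literal to an integer using the mode named by the suffix. */
    public static long roundToInteger(String decimal, String suffix) {
        return new BigDecimal(decimal).setScale(0, toJavaMode(suffix)).longValueExact();
    }

    public static void main(String[] args) {
        // Print how each mode treats a positive and a negative tie.
        for (String mode : new String[] {"rte", "rtz", "rtp", "rtn"}) {
            System.out.println(mode + ": 2.5 -> " + roundToInteger("2.5", mode)
                    + ", -2.5 -> " + roundToInteger("-2.5", mode));
        }
    }
}
```

The modes only disagree on values that are not exactly representable in the destination type, which is why the example uses the ties 2.5 and -2.5.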
-
cl_khr_spir
public final boolean cl_khr_spir
When true, KHRSPIR is supported.
-
cl_khr_subgroup_named_barrier
public final boolean cl_khr_subgroup_named_barrier
When true, KHRSubgroupNamedBarrier is supported.
-
cl_khr_terminate_context
public final boolean cl_khr_terminate_context
When true, KHRTerminateContext is supported.
-
cl_khr_throttle_hints
public final boolean cl_khr_throttle_hints
When true, KHRThrottleHints is supported.
-
cl_nv_compiler_options
public final boolean cl_nv_compiler_options
When true, the nv_compiler_options extension is supported. This extension allows the programmer to pass options to the PTX assembler, allowing greater control over code generation.
-cl-nv-maxrregcount <N>
Passed on to ptxas as --maxrregcount <N>. N is a positive integer. Specifies the maximum number of registers that GPU functions can use. Up to a function-specific limit, a higher value will generally increase the performance of individual GPU threads that execute this function. However, because thread registers are allocated from a global register pool on each GPU, a higher value of this option will also reduce the maximum thread block size, thereby reducing the amount of thread parallelism. Hence, a good maxrregcount value is the result of a trade-off. If this option is not specified, then no maximum is assumed. Otherwise the specified value will be rounded up to the next multiple of 4 registers, up to the GPU-specific maximum of 128 registers.
-cl-nv-opt-level <N>
Passed on to ptxas as --opt-level <N>. N is a positive integer, or 0 (no optimization). Specifies the optimization level. Default value: 3.
-cl-nv-verbose
Passed on to ptxas as --verbose. Enables verbose mode. Output will be reported in the build log (accessible through the callback parameter to clBuildProgram).
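As an illustration, the options above could be assembled on the host only when the capability flag is set. The following is a minimal sketch, not LWJGL API: the class NvCompilerOptionsSketch and its methods are hypothetical, the boolean parameter stands in for the cl_nv_compiler_options field, and the option spelling follows the `-cl-nv-maxrregcount <N>` form given in the text (some drivers may expect a different separator). The rounding helper mirrors the rule stated above and assumes that a value already divisible by 4 is left unchanged.

```java
public final class NvCompilerOptionsSketch {
    private NvCompilerOptionsSketch() {}

    /**
     * Mirrors the documented rule: round the requested limit up to a multiple
     * of 4, capped at 128. Assumption: exact multiples of 4 are kept as-is.
     */
    public static int effectiveMaxRegisters(int requested) {
        if (requested <= 0) {
            throw new IllegalArgumentException("N must be a positive integer");
        }
        int rounded = ((requested + 3) / 4) * 4;
        return Math.min(rounded, 128);
    }

    /**
     * Builds a build-options string. Returns an empty string when the
     * extension is not reported, so no NVIDIA-specific flags are emitted.
     */
    public static String buildOptions(boolean nvCompilerOptionsSupported,
                                      int maxRegCount, int optLevel, boolean verbose) {
        if (!nvCompilerOptionsSupported) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        sb.append("-cl-nv-maxrregcount ").append(maxRegCount);
        sb.append(" -cl-nv-opt-level ").append(optLevel);
        if (verbose) {
            sb.append(" -cl-nv-verbose");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In real code the flag would come from caps.cl_nv_compiler_options.
        System.out.println(buildOptions(true, 30, 3, true));
        System.out.println(effectiveMaxRegisters(30));
    }
}
```

The resulting string would be passed as the options argument of clBuildProgram. Returning an empty string on other platforms keeps the same build call valid everywhere.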
-
cl_nv_device_attribute_query
public final boolean cl_nv_device_attribute_query
When true, NVDeviceAttributeQuery is supported.
-
cl_nv_pragma_unroll
public final boolean cl_nv_pragma_unroll
When true, the nv_pragma_unroll extension is supported.
Overview
This extension extends the OpenCL C language with a hint that allows loops to be unrolled. This pragma must be used for a loop and can be used to specify full unrolling or partial unrolling by a certain amount. This is a hint and the compiler may ignore this pragma for any reason.
Goals
The principal goal of the pragma unroll is to improve the performance of loops via unrolling. Typically this enables other optimizations or improves instruction level parallelism of a thread.
Details
A user may specify that a loop in the source program be unrolled. This is done via a pragma, with the following syntax:
#pragma unroll [unroll-factor]
The pragma unroll may optionally specify an unroll factor. The pragma must be placed immediately before the loop and only applies to that loop.
If an unroll factor is not specified, the compiler will try to unroll the loop completely. If an unroll factor is specified, the compiler will perform partial loop unrolling. The unroll factor, if specified, must be a compile-time non-negative integer constant.
A loop unroll factor of 1 means that the compiler should not unroll the loop.
A complete unroll specification has no effect if the trip count of the loop is not compile-time computable.
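A host program might emit the pragma only when the capability flag is set. The sketch below shows one way to do that. The class UnrollPragmaSketch, its methods, and the sum_rows kernel are hypothetical and exist only for illustration, and the boolean parameter stands in for the cl_nv_pragma_unroll field. A null factor requests full unrolling, matching the optional unroll factor in the syntax above.

```java
public final class UnrollPragmaSketch {
    private UnrollPragmaSketch() {}

    /** Formats the pragma line; a null factor means "no factor", i.e. full unrolling. */
    public static String pragma(Integer factor) {
        if (factor == null) {
            return "#pragma unroll";
        }
        if (factor < 0) {
            throw new IllegalArgumentException("unroll factor must be non-negative");
        }
        return "#pragma unroll " + factor;
    }

    /** Builds OpenCL C source for a hypothetical kernel, adding the pragma only if supported. */
    public static String kernelSource(boolean nvPragmaUnrollSupported, Integer factor) {
        // The pragma must sit immediately before the loop it applies to.
        String hint = nvPragmaUnrollSupported ? "    " + pragma(factor) + "\n" : "";
        return "__kernel void sum_rows(__global const float* in, __global float* out, const int n) {\n"
             + "    const int row = get_global_id(0);\n"
             + "    float acc = 0.0f;\n"
             + hint
             + "    for (int i = 0; i < n; ++i) {\n"
             + "        acc += in[row * n + i];\n"
             + "    }\n"
             + "    out[row] = acc;\n"
             + "}\n";
    }

    public static void main(String[] args) {
        // In real code the flag would come from caps.cl_nv_pragma_unroll.
        System.out.print(kernelSource(true, 4));
    }
}
```

In this kernel the trip count n is a runtime argument, so only a partial factor such as 4 can have an effect. A full-unroll request would be ignored, as noted above for loops whose trip count is not compile-time computable.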
-
cl_qcom_ext_host_ptr
public final boolean cl_qcom_ext_host_ptr
When true, QCOMEXTHostPtr is supported.
-
cl_qcom_ext_host_ptr_iocoherent
public final boolean cl_qcom_ext_host_ptr_iocoherent
When true, QCOMEXTHostPtrIOCoherent is supported.
-
-