CUDA
CUDA is a parallel computing platform and programming model developed by NVIDIA for general computing on GPUs.
The cuDNN library, a GPU-accelerated library for Deep Neural Networks, is also installed.
Documentation
- Introduction to CUDA, from the NVIDIA website
- CUDA documentation from the NVIDIA website
- Introduction to nvprof, the command line profiler, from the NVIDIA website
- CUDA-MEMCHECK, a suite of tools to diagnose functional correctness
- cuDNN documentation, from the NVIDIA website
Usage on Bridges-2
To see which versions of CUDA or cuDNN are available, which one is the default when there is more than one, and some brief help, type:
module spider cuda
module spider cudnn
To use CUDA or cuDNN, include commands like these in your batch script or interactive session to load the modules (note that ‘module load’ is case-sensitive):
module load cuda
module load cudnn
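As a rough illustration of how the load commands might sit in a batch script or interactive session, the sketch below wraps them in a small function and then checks that the compiler is reachable. The function name load_cuda_env is hypothetical, and the sketch assumes that loading the cuda module puts nvcc on your PATH; adjust it to your own setup.

```shell
#!/bin/bash
# Sketch only: load_cuda_env is a hypothetical helper, not a Bridges-2 command.
# Assumption: the `module` command is provided by the system environment.

load_cuda_env() {
    # Fail early with a readable message if the module system is missing
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not available in this shell" >&2
        return 1
    fi

    # Module names are case-sensitive: use lowercase, as shown above
    module load cuda  || return 1
    module load cudnn || return 1

    # Assumption: nvcc is on PATH once the cuda module is loaded.
    # Printing its version confirms which CUDA release is in use.
    nvcc --version
}

load_cuda_env || echo "CUDA environment not loaded"
```

If the final line prints "CUDA environment not loaded", the modules were not loaded in the current shell, and any later compile or run step would likely fail.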
Profiling your code
To profile your CUDA code, use the command line profiler nvprof, which comes with the CUDA Toolkit. More information on nvprof can be found on the NVIDIA web site (see Documentation above).
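A minimal sketch of a profiling run follows. The helper profile_app, the executable my_app, the source file my_kernel.cu, and the log name profile.log are all hypothetical placeholders; the sketch assumes the cuda module is already loaded so that nvprof is on your PATH.

```shell
#!/bin/bash
# Sketch only: profile_app, my_app, my_kernel.cu and profile.log are
# hypothetical names used for illustration.

profile_app() {
    app="$1"
    log="${2:-profile.log}"   # default log file name if none is given

    if [ -z "$app" ]; then
        echo "usage: profile_app <executable> [log-file]" >&2
        return 2
    fi

    # nvprof ships with the CUDA Toolkit; it is missing if no module is loaded
    if ! command -v nvprof >/dev/null 2>&1; then
        echo "nvprof not found; try 'module load cuda' first" >&2
        return 1
    fi

    # --log-file sends the profiler's summary to a file instead of the terminal
    nvprof --log-file "$log" "$app"
}

# Example use (not executed here):
#   nvcc -o my_app my_kernel.cu
#   profile_app ./my_app profile.log
```

Writing the summary to a log file keeps the profiler output separate from the program's own output, which can make batch job logs easier to read.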
Common Errors
Many CUDA errors are caused by using an outdated version. Try loading the latest version of CUDA with
module load cuda
rather than requesting a particular version.
An error like:
The application being profiled received a signal
can indicate that the code being profiled is incorrect. Some things to check are:
- Memory errors. Try cuda-memcheck. More information on CUDA-MEMCHECK can be found on the NVIDIA web site (see Documentation above).
- DeviceReset or exit calls; these can hinder writing profile logs
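The memory check suggested above could be run along these lines. This is a sketch under assumptions: run_memcheck, my_app, and my_kernel.cu are hypothetical names, and the cuda module is assumed to be loaded so that cuda-memcheck is on your PATH.

```shell
#!/bin/bash
# Sketch only: run_memcheck, my_app and my_kernel.cu are hypothetical names.

run_memcheck() {
    app="$1"

    if [ -z "$app" ]; then
        echo "usage: run_memcheck <executable>" >&2
        return 2
    fi

    if ! command -v cuda-memcheck >/dev/null 2>&1; then
        echo "cuda-memcheck not found; try 'module load cuda' first" >&2
        return 1
    fi

    # --error-exitcode makes cuda-memcheck return a nonzero status when it
    # detects errors, so a batch script can react to the result
    cuda-memcheck --error-exitcode 1 "$app"
}

# Example use (not executed here). Building with -g -G adds host and device
# debug information, which helps cuda-memcheck report source locations:
#   nvcc -g -G -o my_app my_kernel.cu
#   run_memcheck ./my_app && echo "no memory errors reported"
```

If the memory check comes back clean and the profiler still reports that the application received a signal, the second item in the list above is the next thing to look at.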