VTune
VTune is a software performance analysis tool provided by Intel for users developing serial, multithreaded or MPI applications. VTune is part of the Intel Parallel Studio.
Documentation
- Intel documentation
- Introduction to VTune
- Tutorials are available at https://software.intel.com/en-us/articles/intel-vtune-amplifier-tutorials
Usage on Bridges-2
VTune is part of the Intel Parallel Studio. To see all the available versions of the Intel Parallel Studio, type
module avail intel
and you can see specifics about all the components by typing
module help intel/nnnn
where nnnn is the version number for the specific module you are interested in.
To use VTune:
- Run VTune in a batch job on Bridges-2 that profiles your code
- Connect to Bridges-2 using an ssh client with X11 forwarding enabled
- Examine the output from VTune
1. Run VTune in a batch job that profiles your code
- Prepare a job script which contains the commands to profile your code with VTune.
- If you want a different version of the Intel Parallel Studio than the default, load the version you want.
module load intel/nnnn
- If necessary, compile your code.
- Profile your code by running amplxe-cl and passing your executable to it. The amplxe-cl command is provided by the intel module.
For a non-MPI code, the command will look like:
amplxe-cl -result-dir dirname -quiet -collect hotspots ./your-executable arguments-to-your-executable
If you are profiling an MPI code, use mpirun to execute amplxe-cl, where X is the number of MPI processes:
mpirun -np X amplxe-cl -result-dir dirname -quiet -collect hotspots ./your-executable arguments-to-your-executable
In either case, dirname is any name you choose. VTune stores its data in a new directory or directories named dirname, creating one directory for each node you are using.
The profiling data will be stored in a file with the same name as the directory and the extension ‘amplxe’.
- Submit your batch script to Bridges-2 with the sbatch command, and be sure to use the PERF flag to enable profiling. Substitute the partition name and the name of your job script for partition and job-script.
sbatch -p partition -C PERF job-script
See the Running Jobs section of the Bridges-2 User Guide for more information about partitions, running batch jobs, and options to the sbatch command.
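The steps above can be combined into a single job script. The following is a minimal sketch, not an official template: the file name vtune-job.sh, the result directory name vtune-results, and the #SBATCH values are illustrative assumptions, and nnnn must be replaced by a real version number. The snippet writes the script to a file so that it can be inspected before it is submitted.

```shell
# Write a sample job script to vtune-job.sh (hypothetical file name).
cat > vtune-job.sh <<'EOF'
#!/bin/bash
#SBATCH -N 1
#SBATCH -t 00:30:00
# The node count and walltime above are example values; adjust them to your job.

# Load the Intel module; replace nnnn with the version you want.
module load intel/nnnn

# Profile a non-MPI executable. Results go to a new directory named vtune-results.
amplxe-cl -result-dir vtune-results -quiet -collect hotspots ./your-executable arguments-to-your-executable

# For an MPI code, launch amplxe-cl through mpirun instead
# (X is the number of MPI processes):
# mpirun -np X amplxe-cl -result-dir vtune-results -quiet -collect hotspots ./your-executable arguments-to-your-executable
EOF
```

The resulting file could then be submitted with sbatch -p partition -C PERF vtune-job.sh.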
2. Connect to Bridges-2 with X11 forwarding enabled
In order to use the GUI supplied with the VTune client to examine the output, you must be connected to Bridges-2 using ssh with X11 forwarding enabled. This is important – otherwise, you will not be able to use the GUI. If you are logged in to Bridges-2 without X11 forwarding enabled, log out and back in with X11 enabled.
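As an illustration, a login with X11 forwarding and a quick check that it took effect might look like the sketch below. The hostname bridges2.psc.edu and the function name check_x11 are assumptions made for this example, and username is a placeholder. A non-empty DISPLAY variable usually indicates that forwarding is active, although it does not guarantee that a window can actually be opened.

```shell
# On your local machine, log in with X11 forwarding enabled
# (username is a placeholder; the hostname is an assumption):
#   ssh -X username@bridges2.psc.edu

# On Bridges-2, after logging in: report whether DISPLAY is set.
# check_x11 is a hypothetical helper name used only in this example.
check_x11() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X11 forwarding is active (DISPLAY=$DISPLAY)"
    else
        echo "DISPLAY is not set; log out and reconnect with ssh -X"
        return 1
    fi
}
```

After defining the function, run check_x11 in the login session before starting the GUI.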
3. Examine the profiling information
- Start an interactive session with the interact command.
interact -n X
where X is the number of cores that you want to use. This will request resources in the RM-small partition for 60 minutes. You can override these defaults by using other options to the interact command.
- Load the intel version that you want.
module load intel/nnnn
- Move to the new directory created by amplxe-cl
cd dirname
- Start the GUI and open the .amplxe file.
amplxe-gui dirname.amplxe
The GUI has many options. Information on using it is available at https://software.intel.com/en-us/node/543997
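Before starting the GUI, it can help to confirm where the result files ended up, especially for a multi-node run that produced several result directories. The following is a minimal sketch, assuming the result directories sit directly under the directory you pass in (the current directory by default). The function name list_vtune_results is hypothetical.

```shell
# List every VTune result file (*.amplxe) located exactly one directory
# level below the starting directory, in sorted order.
list_vtune_results() {
    find "${1:-.}" -mindepth 2 -maxdepth 2 -type f -name '*.amplxe' | sort
}
```

Running list_vtune_results from the directory where the job ran prints the path of each .amplxe file, which can then be passed to amplxe-gui.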
More information
The tutorials referenced above will take you step-by-step through analyzing code performance. Sample codes are included.