diff options
author | Angelos Mouzakitis <a.mouzakitis@virtualopensystems.com> | 2023-10-10 14:33:42 +0000 |
---|---|---|
committer | Angelos Mouzakitis <a.mouzakitis@virtualopensystems.com> | 2023-10-10 14:33:42 +0000 |
commit | af1a266670d040d2f4083ff309d732d648afba2a (patch) | |
tree | 2fc46203448ddcc6f81546d379abfaeb323575e9 /meson/docs/markdown/Cuda-module.md | |
parent | e02cda008591317b1625707ff8e115a4841aa889 (diff) |
Change-Id: Iaf8d18082d3991dec7c0ebbea540f092188eb4ec
Diffstat (limited to 'meson/docs/markdown/Cuda-module.md')
-rw-r--r-- | meson/docs/markdown/Cuda-module.md | 186 |
1 files changed, 186 insertions, 0 deletions
diff --git a/meson/docs/markdown/Cuda-module.md b/meson/docs/markdown/Cuda-module.md new file mode 100644 index 000000000..24a607a72 --- /dev/null +++ b/meson/docs/markdown/Cuda-module.md @@ -0,0 +1,186 @@ +--- +short-description: CUDA module +authors: + - name: Olexa Bilaniuk + years: [2019] + has-copyright: false +... + +# Unstable CUDA Module +_Since: 0.50.0_ + +This module provides helper functionality related to the CUDA Toolkit and +building code using it. + + +**Note**: this module is unstable. It is only provided as a technology preview. +Its API may change in arbitrary ways between releases or it might be removed +from Meson altogether. + + +## Importing the module + +The module may be imported as follows: + +``` meson +cuda = import('unstable-cuda') +``` + +It offers several useful functions that are enumerated below. + + +## Functions + +### `nvcc_arch_flags()` +_Since: 0.50.0_ + +``` meson +cuda.nvcc_arch_flags(cuda_version_string, ..., + detected: string_or_array) +``` + +Returns a list of `-gencode` flags that should be passed to `cuda_args:` in +order to compile a "fat binary" for the architectures/compute capabilities +enumerated in the positional argument(s). The flags shall be acceptable to +an NVCC with CUDA Toolkit version string `cuda_version_string`. + +A set of architectures and/or compute capabilities may be specified by: + +- The single positional argument `'All'`, `'Common'` or `'Auto'` +- As (an array of) + - Architecture names (`'Kepler'`, `'Maxwell+Tegra'`, `'Turing'`) and/or + - Compute capabilities (`'3.0'`, `'3.5'`, `'5.3'`, `'7.5'`) + +A suffix of `+PTX` requests PTX code generation for the given architecture. +A compute capability given as `A.B(X.Y)` requests PTX generation for an older +virtual architecture `X.Y` before binary generation for a newer architecture +`A.B`. + +Multiple architectures and compute capabilities may be passed in using + +- Multiple positional arguments +- Lists of strings +- Space (` `), comma (`,`) or semicolon (`;`)-separated strings + +The single-word architectural sets `'All'`, `'Common'` or `'Auto'` +cannot be mixed with architecture names or compute capabilities. Their +interpretation is: + +| Name | Compute Capability | +|-------------------|--------------------| +| `'All'` | All CCs supported by given NVCC compiler. | +| `'Common'` | Relatively common CCs supported by given NVCC compiler. Generally excludes Tegra and Tesla devices. | +| `'Auto'` | The CCs provided by the `detected:` keyword, filtered for support by given NVCC compiler. | + +The supported architecture names and their corresponding compute capabilities +are: + +| Name | Compute Capability | +|-------------------|--------------------| +| `'Fermi'` | 2.0, 2.1(2.0) | +| `'Kepler'` | 3.0, 3.5 | +| `'Kepler+Tegra'` | 3.2 | +| `'Kepler+Tesla'` | 3.7 | +| `'Maxwell'` | 5.0, 5.2 | +| `'Maxwell+Tegra'` | 5.3 | +| `'Pascal'` | 6.0, 6.1 | +| `'Pascal+Tegra'` | 6.2 | +| `'Volta'` | 7.0 | +| `'Xavier'` | 7.2 | +| `'Turing'` | 7.5 | +| `'Ampere'` | 8.0, 8.6 | + + +Examples: + + cuda.nvcc_arch_flags('10.0', '3.0', '3.5', '5.0+PTX') + cuda.nvcc_arch_flags('10.0', ['3.0', '3.5', '5.0+PTX']) + cuda.nvcc_arch_flags('10.0', [['3.0', '3.5'], '5.0+PTX']) + cuda.nvcc_arch_flags('10.0', '3.0 3.5 5.0+PTX') + cuda.nvcc_arch_flags('10.0', '3.0,3.5,5.0+PTX') + cuda.nvcc_arch_flags('10.0', '3.0;3.5;5.0+PTX') + cuda.nvcc_arch_flags('10.0', 'Kepler 5.0+PTX') + # Returns ['-gencode', 'arch=compute_30,code=sm_30', + # '-gencode', 'arch=compute_35,code=sm_35', + # '-gencode', 'arch=compute_50,code=sm_50', + # '-gencode', 'arch=compute_50,code=compute_50'] + + cuda.nvcc_arch_flags('10.0', '3.5(3.0)') + # Returns ['-gencode', 'arch=compute_30,code=sm_35'] + + cuda.nvcc_arch_flags('8.0', 'Common') + # Returns ['-gencode', 'arch=compute_30,code=sm_30', + # '-gencode', 'arch=compute_35,code=sm_35', + # '-gencode', 'arch=compute_50,code=sm_50', + # '-gencode', 'arch=compute_52,code=sm_52', + # '-gencode', 'arch=compute_60,code=sm_60', + # '-gencode', 'arch=compute_61,code=sm_61', + # '-gencode', 'arch=compute_61,code=compute_61'] + + cuda.nvcc_arch_flags('9.2', 'Auto', detected: '6.0 6.0 6.0 6.0') + cuda.nvcc_arch_flags('9.2', 'Auto', detected: ['6.0', '6.0', '6.0', '6.0']) + # Returns ['-gencode', 'arch=compute_60,code=sm_60'] + + cuda.nvcc_arch_flags(nvcc, 'All') + # Returns ['-gencode', 'arch=compute_20,code=sm_20', + # '-gencode', 'arch=compute_20,code=sm_21', + # '-gencode', 'arch=compute_30,code=sm_30', + # '-gencode', 'arch=compute_32,code=sm_32', + # '-gencode', 'arch=compute_35,code=sm_35', + # '-gencode', 'arch=compute_37,code=sm_37', + # '-gencode', 'arch=compute_50,code=sm_50', # nvcc.version() < 7.0 + # '-gencode', 'arch=compute_52,code=sm_52', + # '-gencode', 'arch=compute_53,code=sm_53', # nvcc.version() >= 7.0 + # '-gencode', 'arch=compute_60,code=sm_60', + # '-gencode', 'arch=compute_61,code=sm_61', # nvcc.version() >= 8.0 + # '-gencode', 'arch=compute_70,code=sm_70', + # '-gencode', 'arch=compute_72,code=sm_72', # nvcc.version() >= 9.0 + # '-gencode', 'arch=compute_75,code=sm_75'] # nvcc.version() >= 10.0 + +_Note:_ This function is intended to closely replicate CMake's FindCUDA module +function `CUDA_SELECT_NVCC_ARCH_FLAGS(out_variable, [list of CUDA compute architectures])` + + + +### `nvcc_arch_readable()` +_Since: 0.50.0_ + +``` meson +cuda.nvcc_arch_readable(cuda_version_string, ..., + detected: string_or_array) +``` + +Has precisely the same interface as [`nvcc_arch_flags()`](#nvcc_arch_flags), +but rather than returning a list of flags, it returns a "readable" list of +architectures that will be compiled for. The output of this function is solely +intended for informative message printing. + + archs = '3.0 3.5 5.0+PTX' + readable = cuda.nvcc_arch_readable('10.0', archs) + message('Building for architectures ' + ' '.join(readable)) + +This will print + + Message: Building for architectures sm30 sm35 sm50 compute50 + +_Note:_ This function is intended to closely replicate CMake's +FindCUDA module function `CUDA_SELECT_NVCC_ARCH_FLAGS(out_variable, +[list of CUDA compute architectures])` + + + +### `min_driver_version()` +_Since: 0.50.0_ + +``` meson +cuda.min_driver_version(cuda_version_string) +``` + +Returns the minimum NVIDIA proprietary driver version required, on the +host system, by kernels compiled with a CUDA Toolkit with the given +version string. + +The output of this function is generally intended for informative +message printing, but could be used for assertions or to conditionally +enable features known to exist within the minimum NVIDIA driver +required. |