aboutsummaryrefslogtreecommitdiffstats
path: root/meson/docs/markdown/Cuda-module.md
diff options
context:
space:
mode:
authorAngelos Mouzakitis <a.mouzakitis@virtualopensystems.com>2023-10-10 14:33:42 +0000
committerAngelos Mouzakitis <a.mouzakitis@virtualopensystems.com>2023-10-10 14:33:42 +0000
commitaf1a266670d040d2f4083ff309d732d648afba2a (patch)
tree2fc46203448ddcc6f81546d379abfaeb323575e9 /meson/docs/markdown/Cuda-module.md
parente02cda008591317b1625707ff8e115a4841aa889 (diff)
Add submodule dependency filesHEADmaster
Change-Id: Iaf8d18082d3991dec7c0ebbea540f092188eb4ec
Diffstat (limited to 'meson/docs/markdown/Cuda-module.md')
-rw-r--r--meson/docs/markdown/Cuda-module.md186
1 files changed, 186 insertions, 0 deletions
diff --git a/meson/docs/markdown/Cuda-module.md b/meson/docs/markdown/Cuda-module.md
new file mode 100644
index 000000000..24a607a72
--- /dev/null
+++ b/meson/docs/markdown/Cuda-module.md
@@ -0,0 +1,186 @@
+---
+short-description: CUDA module
+authors:
+ - name: Olexa Bilaniuk
+ years: [2019]
+ has-copyright: false
+...
+
+# Unstable CUDA Module
+_Since: 0.50.0_
+
+This module provides helper functionality related to the CUDA Toolkit and
+building code using it.
+
+
+**Note**: this module is unstable. It is only provided as a technology preview.
+Its API may change in arbitrary ways between releases or it might be removed
+from Meson altogether.
+
+
+## Importing the module
+
+The module may be imported as follows:
+
+``` meson
+cuda = import('unstable-cuda')
+```
+
+It offers several useful functions that are enumerated below.
+
+
+## Functions
+
+### `nvcc_arch_flags()`
+_Since: 0.50.0_
+
+``` meson
+cuda.nvcc_arch_flags(cuda_version_string, ...,
+ detected: string_or_array)
+```
+
+Returns a list of `-gencode` flags that should be passed to `cuda_args:` in
+order to compile a "fat binary" for the architectures/compute capabilities
+enumerated in the positional argument(s). The flags shall be acceptable to
+an NVCC with CUDA Toolkit version string `cuda_version_string`.
+
+A set of architectures and/or compute capabilities may be specified by:
+
+- The single positional argument `'All'`, `'Common'` or `'Auto'`
+- As (an array of)
+ - Architecture names (`'Kepler'`, `'Maxwell+Tegra'`, `'Turing'`) and/or
+ - Compute capabilities (`'3.0'`, `'3.5'`, `'5.3'`, `'7.5'`)
+
+A suffix of `+PTX` requests PTX code generation for the given architecture.
+A compute capability given as `A.B(X.Y)` requests PTX generation for an older
+virtual architecture `X.Y` before binary generation for a newer architecture
+`A.B`.
+
+Multiple architectures and compute capabilities may be passed in using
+
+- Multiple positional arguments
+- Lists of strings
+- Space (` `), comma (`,`) or semicolon (`;`)-separated strings
+
+The single-word architectural sets `'All'`, `'Common'` or `'Auto'`
+cannot be mixed with architecture names or compute capabilities. Their
+interpretation is:
+
+| Name | Compute Capability |
+|-------------------|--------------------|
+| `'All'` | All CCs supported by given NVCC compiler. |
+| `'Common'` | Relatively common CCs supported by given NVCC compiler. Generally excludes Tegra and Tesla devices. |
+| `'Auto'` | The CCs provided by the `detected:` keyword, filtered for support by given NVCC compiler. |
+
+The supported architecture names and their corresponding compute capabilities
+are:
+
+| Name | Compute Capability |
+|-------------------|--------------------|
+| `'Fermi'` | 2.0, 2.1(2.0) |
+| `'Kepler'` | 3.0, 3.5 |
+| `'Kepler+Tegra'` | 3.2 |
+| `'Kepler+Tesla'` | 3.7 |
+| `'Maxwell'` | 5.0, 5.2 |
+| `'Maxwell+Tegra'` | 5.3 |
+| `'Pascal'` | 6.0, 6.1 |
+| `'Pascal+Tegra'` | 6.2 |
+| `'Volta'` | 7.0 |
+| `'Xavier'` | 7.2 |
+| `'Turing'` | 7.5 |
+| `'Ampere'` | 8.0, 8.6 |
+
+
+Examples:
+
+ cuda.nvcc_arch_flags('10.0', '3.0', '3.5', '5.0+PTX')
+ cuda.nvcc_arch_flags('10.0', ['3.0', '3.5', '5.0+PTX'])
+ cuda.nvcc_arch_flags('10.0', [['3.0', '3.5'], '5.0+PTX'])
+ cuda.nvcc_arch_flags('10.0', '3.0 3.5 5.0+PTX')
+ cuda.nvcc_arch_flags('10.0', '3.0,3.5,5.0+PTX')
+ cuda.nvcc_arch_flags('10.0', '3.0;3.5;5.0+PTX')
+ cuda.nvcc_arch_flags('10.0', 'Kepler 5.0+PTX')
+ # Returns ['-gencode', 'arch=compute_30,code=sm_30',
+ # '-gencode', 'arch=compute_35,code=sm_35',
+ # '-gencode', 'arch=compute_50,code=sm_50',
+ # '-gencode', 'arch=compute_50,code=compute_50']
+
+ cuda.nvcc_arch_flags('10.0', '3.5(3.0)')
+ # Returns ['-gencode', 'arch=compute_30,code=sm_35']
+
+ cuda.nvcc_arch_flags('8.0', 'Common')
+ # Returns ['-gencode', 'arch=compute_30,code=sm_30',
+ # '-gencode', 'arch=compute_35,code=sm_35',
+ # '-gencode', 'arch=compute_50,code=sm_50',
+ # '-gencode', 'arch=compute_52,code=sm_52',
+ # '-gencode', 'arch=compute_60,code=sm_60',
+ # '-gencode', 'arch=compute_61,code=sm_61',
+ # '-gencode', 'arch=compute_61,code=compute_61']
+
+ cuda.nvcc_arch_flags('9.2', 'Auto', detected: '6.0 6.0 6.0 6.0')
+ cuda.nvcc_arch_flags('9.2', 'Auto', detected: ['6.0', '6.0', '6.0', '6.0'])
+ # Returns ['-gencode', 'arch=compute_60,code=sm_60']
+
+ cuda.nvcc_arch_flags(nvcc, 'All')
+ # Returns ['-gencode', 'arch=compute_20,code=sm_20',
+ # '-gencode', 'arch=compute_20,code=sm_21',
+ # '-gencode', 'arch=compute_30,code=sm_30',
+ # '-gencode', 'arch=compute_32,code=sm_32',
+ # '-gencode', 'arch=compute_35,code=sm_35',
+ # '-gencode', 'arch=compute_37,code=sm_37',
+ # '-gencode', 'arch=compute_50,code=sm_50', # nvcc.version() < 7.0
+ # '-gencode', 'arch=compute_52,code=sm_52',
+ # '-gencode', 'arch=compute_53,code=sm_53', # nvcc.version() >= 7.0
+ # '-gencode', 'arch=compute_60,code=sm_60',
+ # '-gencode', 'arch=compute_61,code=sm_61', # nvcc.version() >= 8.0
+ # '-gencode', 'arch=compute_70,code=sm_70',
+ # '-gencode', 'arch=compute_72,code=sm_72', # nvcc.version() >= 9.0
+ # '-gencode', 'arch=compute_75,code=sm_75'] # nvcc.version() >= 10.0
+
+_Note:_ This function is intended to closely replicate CMake's FindCUDA module
+function `CUDA_SELECT_NVCC_ARCH_FLAGS(out_variable, [list of CUDA compute architectures])`
+
+
+
+### `nvcc_arch_readable()`
+_Since: 0.50.0_
+
+``` meson
+cuda.nvcc_arch_readable(cuda_version_string, ...,
+ detected: string_or_array)
+```
+
+Has precisely the same interface as [`nvcc_arch_flags()`](#nvcc_arch_flags),
+but rather than returning a list of flags, it returns a "readable" list of
+architectures that will be compiled for. The output of this function is solely
+intended for informative message printing.
+
+ archs = '3.0 3.5 5.0+PTX'
+ readable = cuda.nvcc_arch_readable('10.0', archs)
+ message('Building for architectures ' + ' '.join(readable))
+
+This will print
+
+ Message: Building for architectures sm30 sm35 sm50 compute50
+
+_Note:_ This function is intended to closely replicate CMake's
+FindCUDA module function `CUDA_SELECT_NVCC_ARCH_FLAGS(out_variable,
+[list of CUDA compute architectures])`
+
+
+
+### `min_driver_version()`
+_Since: 0.50.0_
+
+``` meson
+cuda.min_driver_version(cuda_version_string)
+```
+
+Returns the minimum NVIDIA proprietary driver version required, on the
+host system, by kernels compiled with a CUDA Toolkit with the given
+version string.
+
+The output of this function is generally intended for informative
+message printing, but could be used for assertions or to conditionally
+enable features known to exist within the minimum NVIDIA driver
+required.