Add submodule dependency files (HEAD, master)

Change-Id: Iaf8d18082d3991dec7c0ebbea540f092188eb4ec
author: Angelos Mouzakitis <a.mouzakitis@virtualopensystems.com> 2023-10-10 14:33:42 +0000
committer: Angelos Mouzakitis <a.mouzakitis@virtualopensystems.com> 2023-10-10 14:33:42 +0000
commit: af1a266670d040d2f4083ff309d732d648afba2a (patch)
tree: 2fc46203448ddcc6f81546d379abfaeb323575e9 /meson/docs/markdown/Cuda-module.md
parent: e02cda008591317b1625707ff8e115a4841aa889 (diff)
1 files changed, 186 insertions, 0 deletions
diff --git a/meson/docs/markdown/Cuda-module.md b/meson/docs/markdown/Cuda-module.md
new file mode 100644
index 000000000..24a607a72
--- /dev/null
+++ b/meson/docs/markdown/Cuda-module.md
@@ -0,0 +1,186 @@
+---
+short-description: CUDA module
+authors:
+    - name: Olexa Bilaniuk
+      years: [2019]
+      has-copyright: false
+...
+
+# Unstable CUDA Module
+_Since: 0.50.0_
+
+This module provides helper functionality related to the CUDA Toolkit and
+building code using it.
+
+
+**Note**: this module is unstable. It is only provided as a technology preview.
+Its API may change in arbitrary ways between releases or it might be removed
+from Meson altogether.
+
+
+## Importing the module
+
+The module may be imported as follows:
+
+``` meson
+cuda = import('unstable-cuda')
+```
+
+It offers several useful functions that are enumerated below.
+
+
+## Functions
+
+### `nvcc_arch_flags()`
+_Since: 0.50.0_
+
+``` meson
+cuda.nvcc_arch_flags(cuda_version_string, ...,
+                     detected: string_or_array)
+```
+
+Returns a list of `-gencode` flags that should be passed to `cuda_args:` in
+order to compile a "fat binary" for the architectures/compute capabilities
+enumerated in the positional argument(s). The flags shall be acceptable to
+an NVCC with CUDA Toolkit version string `cuda_version_string`.
+
+A set of architectures and/or compute capabilities may be specified as either:
+
+- The single positional argument `'All'`, `'Common'` or `'Auto'`, or
+- One or more (optionally in an array) of:
+  - Architecture names (`'Kepler'`, `'Maxwell+Tegra'`, `'Turing'`) and/or
+  - Compute capabilities (`'3.0'`, `'3.5'`, `'5.3'`, `'7.5'`)
+
+A suffix of `+PTX` requests PTX code generation for the given architecture.
+A compute capability given as `A.B(X.Y)` requests PTX generation for an older
+virtual architecture `X.Y` before binary generation for a newer architecture
+`A.B`.
+
+Multiple architectures and compute capabilities may be passed in using any of:
+
+- Multiple positional arguments
+- Lists of strings
+- Space (` `), comma (`,`) or semicolon (`;`)-separated strings
+
+The single-word architectural sets `'All'`, `'Common'` or `'Auto'`
+cannot be mixed with architecture names or compute capabilities. Their
+interpretation is:
+
+| Name              | Compute Capability |
+|-------------------|--------------------|
+| `'All'`           | All CCs supported by given NVCC compiler. |
+| `'Common'`        | Relatively common CCs supported by given NVCC compiler. Generally excludes Tegra and Tesla devices. |
+| `'Auto'`          | The CCs provided by the `detected:` keyword, filtered for support by given NVCC compiler. |
+
+The supported architecture names and their corresponding compute capabilities
+are:
+
+| Name              | Compute Capability |
+|-------------------|--------------------|
+| `'Fermi'`         | 2.0, 2.1(2.0)      |
+| `'Kepler'`        | 3.0, 3.5           |
+| `'Kepler+Tegra'`  | 3.2                |
+| `'Kepler+Tesla'`  | 3.7                |
+| `'Maxwell'`       | 5.0, 5.2           |
+| `'Maxwell+Tegra'` | 5.3                |
+| `'Pascal'`        | 6.0, 6.1           |
+| `'Pascal+Tegra'`  | 6.2                |
+| `'Volta'`         | 7.0                |
+| `'Xavier'`        | 7.2                |
+| `'Turing'`        | 7.5                |
+| `'Ampere'`        | 8.0, 8.6           |
+
+
+Examples:
+
+    cuda.nvcc_arch_flags('10.0', '3.0', '3.5', '5.0+PTX')
+    cuda.nvcc_arch_flags('10.0', ['3.0', '3.5', '5.0+PTX'])
+    cuda.nvcc_arch_flags('10.0', [['3.0', '3.5'], '5.0+PTX'])
+    cuda.nvcc_arch_flags('10.0', '3.0 3.5 5.0+PTX')
+    cuda.nvcc_arch_flags('10.0', '3.0,3.5,5.0+PTX')
+    cuda.nvcc_arch_flags('10.0', '3.0;3.5;5.0+PTX')
+    cuda.nvcc_arch_flags('10.0', 'Kepler 5.0+PTX')
+    # Returns ['-gencode', 'arch=compute_30,code=sm_30',
+    #          '-gencode', 'arch=compute_35,code=sm_35',
+    #          '-gencode', 'arch=compute_50,code=sm_50',
+    #          '-gencode', 'arch=compute_50,code=compute_50']
+
+    cuda.nvcc_arch_flags('10.0', '3.5(3.0)')
+    # Returns ['-gencode', 'arch=compute_30,code=sm_35']
+
+    cuda.nvcc_arch_flags('8.0', 'Common')
+    # Returns ['-gencode', 'arch=compute_30,code=sm_30',
+    #          '-gencode', 'arch=compute_35,code=sm_35',
+    #          '-gencode', 'arch=compute_50,code=sm_50',
+    #          '-gencode', 'arch=compute_52,code=sm_52',
+    #          '-gencode', 'arch=compute_60,code=sm_60',
+    #          '-gencode', 'arch=compute_61,code=sm_61',
+    #          '-gencode', 'arch=compute_61,code=compute_61']
+
+    cuda.nvcc_arch_flags('9.2', 'Auto', detected: '6.0 6.0 6.0 6.0')
+    cuda.nvcc_arch_flags('9.2', 'Auto', detected: ['6.0', '6.0', '6.0', '6.0'])
+    # Returns ['-gencode', 'arch=compute_60,code=sm_60']
+
+    cuda.nvcc_arch_flags(nvcc, 'All')
+    # Returns ['-gencode', 'arch=compute_20,code=sm_20',
+    #          '-gencode', 'arch=compute_20,code=sm_21',
+    #          '-gencode', 'arch=compute_30,code=sm_30',
+    #          '-gencode', 'arch=compute_32,code=sm_32',
+    #          '-gencode', 'arch=compute_35,code=sm_35',
+    #          '-gencode', 'arch=compute_37,code=sm_37',
+    #          '-gencode', 'arch=compute_50,code=sm_50', # nvcc.version()  <  7.0
+    #          '-gencode', 'arch=compute_52,code=sm_52',
+    #          '-gencode', 'arch=compute_53,code=sm_53', # nvcc.version() >=  7.0
+    #          '-gencode', 'arch=compute_60,code=sm_60',
+    #          '-gencode', 'arch=compute_61,code=sm_61', # nvcc.version() >=  8.0
+    #          '-gencode', 'arch=compute_70,code=sm_70',
+    #          '-gencode', 'arch=compute_72,code=sm_72', # nvcc.version() >=  9.0
+    #          '-gencode', 'arch=compute_75,code=sm_75'] # nvcc.version() >= 10.0
+
+_Note:_ This function is intended to closely replicate CMake's FindCUDA module
+function `CUDA_SELECT_NVCC_ARCH_FLAGS(out_variable, [list of CUDA compute architectures])`.
+
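+As a rough sketch of how the returned flags might be wired into a build, the
+snippet below passes them to a target through `cuda_args:`. The project name
+`demo` and the source file `main.cu` are hypothetical, and the toolkit version
+is assumed to come from the detected CUDA compiler rather than a hard-coded
+string:
+
+``` meson
+project('demo', 'cuda')
+
+cuda = import('unstable-cuda')
+nvcc = meson.get_compiler('cuda')
+
+# Request the 'Common' set, filtered for the detected toolkit version.
+arch_flags = cuda.nvcc_arch_flags(nvcc.version(), 'Common')
+
+# Hand the -gencode flags to the target (hypothetical name and source).
+executable('demo', 'main.cu', cuda_args: arch_flags)
+```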
+
+
+### `nvcc_arch_readable()`
+_Since: 0.50.0_
+
+``` meson
+cuda.nvcc_arch_readable(cuda_version_string, ...,
+                        detected: string_or_array)
+```
+
+Has precisely the same interface as [`nvcc_arch_flags()`](#nvcc_arch_flags),
+but rather than returning a list of flags, it returns a "readable" list of
+architectures that will be compiled for. The output of this function is solely
+intended for informative message printing.
+
+    archs    = '3.0 3.5 5.0+PTX'
+    readable = cuda.nvcc_arch_readable('10.0', archs)
+    message('Building for architectures ' + ' '.join(readable))
+
+This will print
+
+    Message: Building for architectures sm30 sm35 sm50 compute50
+
+_Note:_ This function is intended to closely replicate CMake's
+FindCUDA module function `CUDA_SELECT_NVCC_ARCH_FLAGS(out_variable,
+[list of CUDA compute architectures])`.
+
+
+
+### `min_driver_version()`
+_Since: 0.50.0_
+
+``` meson
+cuda.min_driver_version(cuda_version_string)
+```
+
+Returns the minimum NVIDIA proprietary driver version required, on the
+host system, by kernels compiled with a CUDA Toolkit with the given
+version string.
+
+The output of this function is generally intended for informative
+message printing, but could be used for assertions or to conditionally
+enable features known to exist within the minimum NVIDIA driver
+required.
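+
+A minimal sketch of the informative-message use, assuming the CUDA compiler
+has been detected with `meson.get_compiler('cuda')`; the message wording is
+illustrative only:
+
+``` meson
+cuda = import('unstable-cuda')
+nvcc = meson.get_compiler('cuda')
+
+# Minimum driver version needed by kernels built with this toolkit.
+driver_version = cuda.min_driver_version(nvcc.version())
+
+message('CUDA Toolkit ' + nvcc.version() +
+        ' requires NVIDIA driver ' + driver_version + ' or newer')
+```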