Diffstat (limited to 'meson/docs/markdown/Simd-module.md')
-rw-r--r-- | meson/docs/markdown/Simd-module.md | 72 |
1 files changed, 72 insertions, 0 deletions
diff --git a/meson/docs/markdown/Simd-module.md b/meson/docs/markdown/Simd-module.md
new file mode 100644
index 000000000..29f3e952d
--- /dev/null
+++ b/meson/docs/markdown/Simd-module.md
@@ -0,0 +1,72 @@
+# Unstable SIMD module
+
+This module provides helper functionality to build code with SIMD instructions.
+Available since 0.42.0.
+
+**Note**: this module is unstable. It is only provided as a technology
+preview. Its API may change in arbitrary ways between releases or it
+might be removed from Meson altogether.
+
+## Usage
+
+This module is designed for the use case where you have an algorithm
+with one or more SIMD implementations and you choose which one to use
+at runtime.
+
+The module provides one method, `check`, which is used like this:
+
+    rval = simd.check('mysimds',
+      mmx : 'simd_mmx.c',
+      sse : 'simd_sse.c',
+      sse2 : 'simd_sse2.c',
+      sse3 : 'simd_sse3.c',
+      ssse3 : 'simd_ssse3.c',
+      sse41 : 'simd_sse41.c',
+      sse42 : 'simd_sse42.c',
+      avx : 'simd_avx.c',
+      avx2 : 'simd_avx2.c',
+      neon : 'simd_neon.c',
+      compiler : cc)
+
+Here the individual files contain the accelerated versions of the
+functions in question. The `compiler` keyword argument takes the
+compiler you are going to use to compile them. The function returns an
+array with two values. The first value is a list of libraries that
+contain the compiled code. Any SIMD code that the compiler can't
+compile (for example, Neon instructions on an x86 machine) is
+ignored. You should pass this value to the desired target using
+`link_with`. The second value is a `configuration_data` object that
+has an entry for each instruction set that was supported. For example,
+if the compiler supports SSE2 instructions, then the object has
+`HAVE_SSE2` set to 1.
+
+Generating code to detect the proper instruction set at runtime is
+straightforward. First you create a header with the configuration
+object and then a chooser function that looks like this:
+
+    void (*fptr)(type_of_function_here) = NULL;
+
+    #if HAVE_NEON
+    if(fptr == NULL && neon_available()) {
+        fptr = neon_accelerated_function;
+    }
+    #endif
+    #if HAVE_AVX2
+    if(fptr == NULL && avx2_available()) {
+        fptr = avx2_accelerated_function;
+    }
+    #endif
+
+    ...
+
+    if(fptr == NULL) {
+        fptr = default_function;
+    }
+
+Each source file provides two functions, the `xxx_available` function
+to query whether the CPU currently in use supports the instruction set
+and `xxx_accelerated_function` that is the corresponding accelerated
+implementation.
+
+At the end of this function the function pointer points to the fastest
+available implementation and can be invoked to do the computation.
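
As a reading aid for the chooser pattern in the added document, here is a minimal, self-contained sketch in C. Everything in it is an assumption made for illustration: the `HAVE_*` macros are hard-coded instead of coming from the generated configuration header, the operation (summing floats) and the names `sum_fn`, `default_sum`, `avx2_accelerated_sum`, `choose_sum` and `sum_floats` are hypothetical, and `avx2_available` is a stub that always reports success rather than querying the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: in a real build these macros come from the header generated
 * from the configuration object. They are hard-coded so the sketch compiles
 * on its own. */
#define HAVE_NEON 0
#define HAVE_AVX2 1

/* Hypothetical signature shared by every implementation. */
typedef float (*sum_fn)(const float *values, size_t count);

/* Portable fallback, used when no accelerated version is usable. */
static float default_sum(const float *values, size_t count) {
    float total = 0.0f;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

#if HAVE_AVX2
/* Stand-ins for what a file such as simd_avx2.c would provide. A real
 * version would query the CPU and use intrinsics; these stubs do neither. */
static int avx2_available(void) {
    return 1; /* stub: pretend the running CPU supports AVX2 */
}

static float avx2_accelerated_sum(const float *values, size_t count) {
    return default_sum(values, count); /* stub: no real SIMD code here */
}
#endif

/* Chooser: try each compiled-in implementation, then fall back. */
static sum_fn choose_sum(void) {
    sum_fn fptr = NULL;
#if HAVE_NEON
    if (fptr == NULL && neon_available()) {
        fptr = neon_accelerated_sum;
    }
#endif
#if HAVE_AVX2
    if (fptr == NULL && avx2_available()) {
        fptr = avx2_accelerated_sum;
    }
#endif
    if (fptr == NULL) {
        fptr = default_sum;
    }
    return fptr;
}

/* Public entry point: resolve the implementation once, then reuse it. */
static float sum_floats(const float *values, size_t count) {
    static sum_fn impl = NULL;
    if (impl == NULL) {
        impl = choose_sum();
    }
    assert(impl != NULL);
    return impl(values, count);
}
```

A caller would only ever use `sum_floats(data, n)`. Caching the pointer in a function-local static keeps the detection cost to the first call, although this simple form is not thread-safe, so a real program would likely run the chooser once during startup.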