CLBlast 1.6.2
CLBlast version 1.6.2. Changes since previous release (version 1.6.1):
- Fix a bug in the pre-processor that would cause issues on Arm GPUs
- Fix DLL install directory in MinGW
- Modifications to the Python bindings (pyclblast)
  - Convert float scalar values to cl_half for fp16 routines
  - Amax/amin, max/min routines accept unsigned integer buffers for index
  - Switch to pyproject.toml file for installing Python bindings
  - Build Python bindings using CMake, adding Windows support
- Generator script now always uses LF line endings, independent of the platform
- Added tuned parameters for many devices (see doc/tuning.md)
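
For readers unfamiliar with the fp16 scalar conversion mentioned above: OpenCL's cl_half is a 16-bit unsigned integer holding an IEEE 754 half-precision bit pattern, so a Python float has to be repacked before it can be passed as a half scalar. The sketch below shows the idea using only the standard library. The function names are hypothetical and this is not necessarily how pyclblast implements the conversion.

```python
import struct


def float_to_half_bits(value: float) -> int:
    """Return the IEEE 754 half-precision bit pattern of `value` as a 16-bit unsigned int.

    Hypothetical helper for illustration only; not part of the pyclblast API.
    """
    # "e" packs a half-precision float; "H" reads the same two bytes back
    # as an unsigned 16-bit integer, which matches the layout of cl_half.
    return struct.unpack("<H", struct.pack("<e", value))[0]


def half_bits_to_float(bits: int) -> float:
    """Inverse of float_to_half_bits: decode a 16-bit pattern into a Python float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


if __name__ == "__main__":
    alpha = 1.0
    bits = float_to_half_bits(alpha)
    print(f"{alpha} -> 0x{bits:04X}")
    print(f"0x{bits:04X} -> {half_bits_to_float(bits)}")
```

Values that are not exactly representable in half precision are rounded during packing, so a round trip through these helpers may not return the original float.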