Update Changelog for 0.3.25 (#4314)
* Update Changelog.txt for 0.3.25
martin-frbg authored Nov 12, 2023
1 parent fa61596 commit c245c12
Showing 1 changed file with 46 additions and 0 deletions.
46 changes: 46 additions & 0 deletions Changelog.txt
@@ -1,4 +1,50 @@
OpenBLAS ChangeLog
====================================================================
Version 0.3.25
12-Nov-2023

general:
- improved the error message shown on exceeding the maximum thread count
- improved the code to add supplementary thread buffers in case of overflow
- fixed a potential division by zero in ?ROTG
- improved the ?MATCOPY functions to accept zero-sized rows or columns
- corrected empty prototypes in function declarations
- cleaned up unused declarations in the f2c-converted versions of the LAPACK sources
- fixed compilation with the Cray CCE Compiler suite
- improved link line rewriting to avoid mixed libgomp/libomp builds with clang&gfortran
- worked around OPENMP builds with LLVM14's libomp hanging on FreeBSD
- improved the Makefiles to require less option duplication on "make install"
- imported the following changes from the upcoming release 3.12 of Reference-LAPACK:
  - deprecate utility functions ?GELQS and ?GEQRS (LAPACK PR 900)
  - apply rounding up to workspace calculations done in floating point (LAPACK PR 904)
  - avoid overflow in STGEX2/DTGEX2 (LAPACK PR 907)
  - fix accumulation in ?LASSQ (LAPACK PR 909)
  - fix handling of NaN values in ?GECON (LAPACK PR 926)
  - avoid overflow in CBDSQR/ZBDSQR (LAPACK PR 927)
  - fix poor vector orthogonalizations in ?ORBDB5/?UNBDB5 (LAPACK PR 928 & 930)

x86-64:
- fixed compile-time autodetection of AMD Ryzen3 and Ryzen4 cpus
- fixed capability-based fallback selection for unknown cpus in DYNAMIC_ARCH
- added AVX512 optimizations for ?ASUM on Sapphire Rapids and Cooper Lake

ARM64:
- fixed building on Apple with homebrew gcc
- fixed building with XCODE 15
- fixed building on A64FX and Cortex A710/X1/X2
- increased the default buffer size for recent ARM server cpus

POWER:
- fixed building with the IBM xlf 16.1.1 compiler
- fixed building with IBM XL C
- added support for DYNAMIC_ARCH builds with clang
- fixed union declaration in the BFLOAT16 test case
- enable optimizations for the AIX assembler on POWER10

LOONGARCH64:
- added an optimized SGEMV kernel
- added an optimized DTRSM kernel

====================================================================
Version 0.3.24
03-Sep-2023