Copyright: Maximilian Behr
License: The software is licensed under under MIT. See LICENSE
for details.
cuPolar
is a CUDA
library implementing Newton and Halley's method for the Polar Decomposition of a nonsingular square matrix
cuPolar
supports real and complex, single and double precision matrices.
int cupolar_sNewtonBufferSize(const int n, size_t *d_bufferSize, size_t *h_bufferSize);
int cupolar_sNewton(const int n, const float *A, void *d_buffer, void *h_buffer, float *Q, float *H);
int cupolar_sHalleyBufferSize(const int n, size_t *d_bufferSize, size_t *h_bufferSize);
int cupolar_sHalley(const int n, const float *A, void *d_buffer, void *h_buffer, float *Q, float *H);
int cupolar_dNewtonBufferSize(const int n, size_t *d_bufferSize, size_t *h_bufferSize);
int cupolar_dNewton(const int n, const double *A, void *d_buffer, void *h_buffer, double *Q, double *H);
int cupolar_dHalleyBufferSize(const int n, size_t *d_bufferSize, size_t *h_bufferSize);
int cupolar_dHalley(const int n, const double *A, void *d_buffer, void *h_buffer, double *Q, double *H);
int cupolar_cNewtonBufferSize(const int n, size_t *d_bufferSize, size_t *h_bufferSize);
int cupolar_cNewton(const int n, const cuComplex *A, void *d_buffer, void *h_buffer, cuComplex *Q, cuComplex *H);
int cupolar_cHalleyBufferSize(const int n, size_t *d_bufferSize, size_t *h_bufferSize);
int cupolar_cHalley(const int n, const cuComplex *A, void *d_buffer, void *h_buffer, cuComplex *Q, cuComplex *H);
int cupolar_zNewtonBufferSize(const int n, size_t *d_bufferSize, size_t *h_bufferSize);
int cupolar_zNewton(const int n, const cuDoubleComplex *A, void *d_buffer, void *h_buffer, cuDoubleComplex *Q, cuDoubleComplex *H);
int cupolar_zHalleyBufferSize(const int n, size_t *d_bufferSize, size_t *h_bufferSize);
int cupolar_zHalley(const int n, const cuDoubleComplex *A, void *d_buffer, void *h_buffer, cuDoubleComplex *Q, cuDoubleComplex *H);
cuPolar
implements the Newton's method with scaling as well as Halley's method with scaling for the approximation of the polar decomposition.
The matrix
See also Algorithm 3.1
Byers, R., & Xu, H. (2008). A new scaling for Newton's iteration for the polar decomposition and its backward stability. SIAM Journal on Matrix Analysis and Applications, 30(2), 822-843.
and Equations 3.14-3.16
Nakatsukasa, Y., Bai, Z., & Gygi, F. (2010). Optimizing Halley's iteration for computing the matrix polar decomposition. SIAM Journal on Matrix Analysis and Applications, 31(5), 2700-2720.
Prerequisites:
CMake >= 3.23
CUDA >= 11.4.2
mkdir build && cd build
cmake ..
make
make install
We provide examples for all supported matrix formats:
File | Data |
---|---|
example_cupolar_sNewton.cu |
real, single precision matrix |
example_cupolar_dNewton.cu |
real, double precision matrix |
example_cupolar_cNewton.cu |
complex, single precision matrix |
example_cupolar_zNewton.cu |
complex, double precision matrix |
example_cupolar_sHalley.cu |
real, single precision matrix |
example_cupolar_dHalley.cu |
real, double precision matrix |
example_cupolar_cHalley.cu |
complex, single precision matrix |
example_cupolar_zHalley.cu |
complex, double precision matrix |