BLAS_GEMM

The BLAS_GEMM procedure updates an existing matrix by adding a multiple of the product of two other matrices, according to the following matrix operation:

M = alpha * op(A) * op(B) + beta * M

where alpha and beta are scalar factors, A and B are input matrices, M is the matrix being updated (the C argument in the Syntax section), and op(X) is one of X, X^T (transpose), or X^H (complex conjugate transpose). Here the * operator denotes conventional mathematical matrix multiplication (rows of the first matrix times columns of the second), which is the same as the ## operator in IDL.
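The operation above can be checked against ordinary array notation. The following is a minimal sketch, assuming real single-precision inputs; the variable names alpha, beta, and expected are illustrative only.

```idl
; Small test matrices: A has 3 columns and 2 rows, B has 2 columns and 3 rows.
A = FINDGEN(3, 2)
B = FINDGEN(2, 3) + 1.0
alpha = 2.0
beta = 0.5

; M must already exist with the dimensions of A ## B (2 columns, 2 rows).
M = FLTARR(2, 2) + 1.0

; Reference result built with ordinary array notation.
expected = alpha * (A ## B) + beta * M

; In-place update of M with BLAS_GEMM.
BLAS_GEMM, M, A, B, ALPHA=alpha, BETA=beta

; Largest difference between the two results; should be at or near zero.
PRINT, MAX(ABS(M - expected))
```

The reference expression allocates temporaries for the product and the scaled copy of M, whereas BLAS_GEMM writes directly into M.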

BLAS_GEMM is based on the following BLAS routines:

Output Type	BLAS Routine
Float	sgemm
Double	dgemm
Complex	cgemm
Double complex	zgemm

Note: On x86_64 machines, BLAS_GEMM uses MKL, so it supports only output of type float, double, complex, or dcomplex. The input arguments can be integer types, in which case they are converted to the type of the output argument C before being multiplied.

Note: On arm64 Mac, BLAS_GEMM uses Apple's Accelerate framework.

BLAS_GEMM can be faster and use less memory than the usual IDL array notation (e.g., M += A ## B) for updating existing arrays.

The # Operator vs. MATRIX_MULTIPLY vs. BLAS_GEMM

The following table illustrates how various operations are performed using the # operator versus the MATRIX_MULTIPLY function and BLAS_GEMM procedure:

# Operator	MATRIX_MULTIPLY Function	BLAS_GEMM Procedure
M = A # B	M = matrix_multiply(A, B)	blas_gemm, M, B, A
M = transpose(A) # B	M = matrix_multiply(A, B, /atranspose)	blas_gemm, M, B, A, btranspose=1
M = A # transpose(B)	M = matrix_multiply(A, B, /btranspose)	blas_gemm, M, B, A, atranspose=1
M = transpose(A) # transpose(B)	M = matrix_multiply(A, B, /atrans, /btrans)	blas_gemm, M, B, A, atrans=1, btrans=1
M = conj(transpose(A)) # B	M = matrix_multiply(conj(A), B, /atranspose)	blas_gemm, M, B, A, btranspose=-1
M = A # conj(transpose(B))	M = matrix_multiply(A, conj(B), /btranspose)	blas_gemm, M, B, A, atranspose=-1
M = conj(transpose(A)) # transpose(B)	M = matrix_multiply(conj(A), B, /atrans, /btrans)	blas_gemm, M, B, A, atrans=1, btrans=-1
M = transpose(A) # conj(transpose(B))	M = matrix_multiply(A, conj(B), /atrans, /btrans)	blas_gemm, M, B, A, atrans=-1, btrans=1
M = conj(transpose(A)) # conj(transpose(B))	M = matrix_multiply(conj(A), conj(B), /atrans, /btrans)	blas_gemm, M, B, A, atrans=-1, btrans=-1

Note: BLAS_GEMM can also be used in place of the ## operator. For example, M = A ## B is equivalent to BLAS_GEMM, M, A, B, and M = A ## TRANSPOSE(B) is equivalent to BLAS_GEMM, M, A, B, /BTRANSPOSE.
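The first row of the table can be verified with a short sketch such as the following. It assumes small real matrices whose products are exactly representable in single precision; the variable name expected is illustrative.

```idl
A = FINDGEN(3, 2)          ; 3 columns, 2 rows
B = FINDGEN(2, 4)          ; 2 columns, 4 rows

; A # B has the columns of A and the rows of B: 3 columns, 4 rows.
expected = A # B

; Same product with BLAS_GEMM: note that B and A swap places,
; because A # B is the same as B ## A.
M = FLTARR(3, 4)
BLAS_GEMM, M, B, A

; Largest difference between the two results; should be at or near zero.
PRINT, MAX(ABS(M - expected))
```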

Syntax

BLAS_GEMM, C, A, B [, ALPHA=value] [, BETA=value] [, ATRANSPOSE=value] [, BTRANSPOSE=value]

Arguments

C

The array to be updated. C must be of float, double, complex, or dcomplex type. BLAS_GEMM does not change the size and type of C.

A

The first matrix operand in the matrix product. A may be any array that IDL can convert to the type of C. BLAS_GEMM does not change A, but A will be internally converted to the type of C before multiplication.

B

The second matrix operand in the matrix product. B may be any array that IDL can convert to the type of C. BLAS_GEMM does not change B, but B will be internally converted to the type of C before multiplication.

Note: After ATRANSPOSE and BTRANSPOSE are applied, the dimensions of A and B must be compatible with each other for matrix multiplication, and the dimensions of their product must match those of the output argument C.
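One way to size C correctly is to derive its dimensions from A and B. The following is a minimal sketch for the /ATRANSPOSE case, assuming two-dimensional real inputs; dimsA and dimsB are illustrative names.

```idl
A = RANDOMU(seed, 4, 5)        ; 4 columns, 5 rows
B = RANDOMU(seed, 4, 5)        ; 4 columns, 5 rows

dimsA = SIZE(A, /DIMENSIONS)   ; [columns, rows] of A
dimsB = SIZE(B, /DIMENSIONS)   ; [columns, rows] of B

; With /ATRANSPOSE, op(A) has dimsA[1] columns and dimsA[0] rows.
; The ## product needs the columns of op(A) to equal the rows of B.
IF dimsA[1] NE dimsB[1] THEN MESSAGE, 'Inner dimensions do not match.'

; The result has the columns of B and the rows of op(A).
C = FLTARR(dimsB[0], dimsA[0])
BLAS_GEMM, C, A, B, /ATRANSPOSE
HELP, C
```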

Keywords

ALPHA

An optional scaling factor applied to the matrix product of op(A) and op(B). ALPHA may be any scalar that IDL can convert to the type of C. BLAS_GEMM does not change ALPHA. If ALPHA is not specified, then a default value of 1.0 is used.

ATRANSPOSE

Set this keyword to a positive value to multiply using the transpose of A. Set this keyword to a negative value to multiply using the complex conjugate transpose of A.

Note: If C is of type float or double, then any non-zero value for ATRANSPOSE has the same effect as an ordinary transpose of A.
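For complex data, the sign of ATRANSPOSE matters. The following sketch, assuming single-precision complex inputs with illustrative values, compares ATRANSPOSE=-1 against the explicit conjugate transpose.

```idl
; Complex test matrices with 2 columns and 3 rows each.
A = COMPLEX(FINDGEN(2, 3), FINDGEN(2, 3) + 1.0)
B = COMPLEX(FINDGEN(2, 3) + 2.0, -FINDGEN(2, 3))

; op(A) = A^H has 3 columns and 2 rows, so A^H ## B has 2 columns and 2 rows.
C = COMPLEXARR(2, 2)
BLAS_GEMM, C, A, B, ATRANSPOSE=-1

; Reference result built with ordinary array notation.
expected = CONJ(TRANSPOSE(A)) ## B

; Largest difference between the two results; should be at or near zero.
PRINT, MAX(ABS(C - expected))
```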

BETA

An optional scaling factor to be multiplied with C before adding the matrix product of A and B. BETA may be any scalar that IDL can convert to the type of C. BLAS_GEMM does not change BETA. If BETA is not specified, then a default value of 0.0 is used.
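A typical use of BETA is accumulating several products into one array. The sketch below assumes it is saved in a .pro file and compiled; the procedure name blas_gemm_accumulate_demo is hypothetical.

```idl
; Hypothetical demo procedure; the name is illustrative only.
PRO blas_gemm_accumulate_demo
  C = FLTARR(3, 3)          ; running sum, 3 columns and 3 rows
  expected = FLTARR(3, 3)   ; reference built with array notation

  FOR i = 0, 3 DO BEGIN
    A = RANDOMU(seed, 4, 3)   ; 4 columns, 3 rows
    B = RANDOMU(seed, 3, 4)   ; 3 columns, 4 rows

    ; BETA=1.0 keeps the current contents of C and adds A ## B to them.
    BLAS_GEMM, C, A, B, BETA=1.0
    expected += A ## B
  ENDFOR

  ; Largest difference; expected to be on the order of float rounding error.
  PRINT, MAX(ABS(C - expected))
END
```

Omitting BETA here would apply the default of 0.0, so each pass would overwrite C instead of adding to it.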

BTRANSPOSE

Set this keyword to a positive value to multiply using the transpose of B. Set this keyword to a negative value to multiply using the complex conjugate transpose of B.

Note: If C is of type float or double, then any non-zero value for BTRANSPOSE has the same effect as an ordinary transpose of B.

Thread Pool Keywords

This routine is written to make use of IDL’s thread pool, which can increase execution speed on systems with multiple CPUs. The values stored in the !CPU system variable control whether IDL uses the thread pool for a given computation. In addition, you can use the thread pool keywords TPOOL_MAX_ELTS, TPOOL_MIN_ELTS, and TPOOL_NOTHREAD to override the defaults established by !CPU for a single invocation of this routine. See Thread Pool Keywords for details.

Note: The thread pool keywords have no effect on Mac, where the underlying BLAS library always uses multiple cores.

Examples

The following examples show how to use the BLAS_GEMM procedure to perform matrix multiplication and to add a matrix product to an existing array.

Create a multidimensional array:

A = RANDOMU(1, 4, 5)

Print A:

PRINT, A

IDL prints:

0.417022 0.997185 0.720325 0.932557

0.000114381 0.128124 0.302333 0.999041

0.146756 0.236089 0.0923386 0.396581

0.186260 0.387911 0.345561 0.669746

0.396767 0.935539 0.538817 0.846311

Create another multidimensional array:

B = RANDOMU(2, 4, 5)

Print B:

PRINT, B

IDL prints:

0.435995 0.185082 0.0259262 0.931541

0.549662 0.947731 0.435322 0.484749

0.420368 0.320536 0.330335 0.154427

0.204649 0.698863 0.619271 0.119951

0.299655 0.485176 0.266827 0.632738

Create an array to store the product of A and B:

C = FLTARR(4, 4)

Print C:

PRINT, C

IDL prints:

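0.000000 0.000000 0.000000 0.000000

0.000000 0.000000 0.000000 0.000000

0.000000 0.000000 0.000000 0.000000

0.000000 0.000000 0.000000 0.000000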

Set C equal to the product of the transpose of A and B (i.e., C = TRANSPOSE(A) ## B):

BLAS_GEMM, C, A, B, /ATRANSPOSE

Print C:

PRINT, C

IDL prints:

0.400585 0.447005 0.280554 0.684583

0.964161 1.10666 0.649466 1.66597

0.751232 0.952367 0.538557 1.21421

1.51310 2.12521 1.23066 2.03007

Copy C into a temporary variable:

CPrime = C

Add the product of the transpose of A and B to C (i.e., C += TRANSPOSE(A) ## B):

BLAS_GEMM, C, A, B, /ATRANSPOSE, BETA=1.0

Print C:

PRINT, C

IDL prints:

0.801170 0.894009 0.561108 1.36917

1.92832 2.21332 1.29893 3.33193

1.50246 1.90473 1.07711 2.42841

3.02620 4.25042 2.46132 4.06014

Show that C is now twice CPrime:

PRINT, C / CPrime

IDL prints:

2.00000 2.00000 2.00000 2.00000

2.00000 2.00000 2.00000 2.00000

2.00000 2.00000 2.00000 2.00000

2.00000 2.00000 2.00000 2.00000

Version History

8.8	Introduced
