3

I am trying to write an implementation of an algorithm for the simultaneous diagonalization of two matrices (which are assumed to be simultaneously diagonalizable). However, the algorithm does not seem to converge. The algorithm is described in SIAM J. Matrix Anal. Appl. 14, 927 (1993).

Here is the first part of my code to set up a test case:

import numpy as np
import numpy.linalg as lin
from scipy.optimize import minimize

N = 3
# Unitary example matrix
X = np.array([
    [-0.54717736-0.43779416j,  0.26046313+0.11082439j, 0.56151027-0.33692186j],
    [-0.33452046-0.37890784j, -0.40907097-0.70730291j, -0.15344477+0.23100467j],
    [-0.31253864-0.39468687j,  0.05342909+0.49940543j, -0.70062586+0.05835082j]
])
# Generate eigenvalues
LA = np.diag(np.arange(0, N))
LB = np.diag(np.arange(N, 2*N))
# Generate simultaneously diagonalizable matrices
A = X @ LA @ np.conj(X).T
B = X @ LB @ np.conj(X).T

This should generate two 3x3 matrices which are simultaneously diagonalizable, since they are constructed this way via X. The following code block then defines a few helper functions:

def off2(A, B):
    """Defines the distance from the matrices from
    their diagonal form.
    """
    C = np.abs(A) ** 2 + np.abs(B) ** 2
    diag_idx = np.diag_indices(N)
    C[diag_idx] = 0
    return np.sum(C)

def Rijcs(i, j, c, s):
    """Function R(i, j, c, s) from the paper, see
    Eq. (1) therein. Used for plane rotations in
    the plane ij.
    """
    res = np.eye(N, dtype=complex)
    res[i, i] = c
    res[i, j] = -np.conj(s)
    res[j, i] = s
    res[j, j] = np.conj(c)
    return res


def cs(theta, phi):
    """Parametrization for c and s."""
    c = np.cos(theta)
    s = np.exp(1j * phi) * np.sin(theta)
    return c, s

With these definitions, the algorithm can be implemented:

tol = 1e-10

Q = np.eye(N, dtype=complex)

while True:
    off = off2(A, B)
    # Print statement for debugging purposes
    print(off)
    
    # Terminate if the result is converged
    if off <= tol * (lin.norm(A, "fro") + lin.norm(B, "fro")):
        break

    for i in range(N):
        for j in range(i + 1, N):

            def fij(c, s):
                aij = A[i, j]
                aji = A[j, i]
                aii = A[i, i]
                ajj = A[j, j]

                bij = B[i, j]
                bji = B[j, i]
                bii = B[i, i]
                bjj = B[j, j]

                x = np.array(
                    [
                        [np.conj(aij), np.conj(aii - ajj), -np.conj(aji)],
                        [aji,                 (aii - ajj), -aij         ],
                        [np.conj(bij), np.conj(bii - bjj), -np.conj(bji)],
                        [bji,                 (bii - bjj), -bij         ]
                    ]
                )
                y = np.array(
                    [
                        [c ** 2],
                        [c * s],
                        [s ** 2]
                    ]
                )

                return lin.norm(x @ y, 2)

            # 5
            result = minimize(
                lambda x: fij(*cs(x[0], x[1])),
                x0=(0, 0),
                bounds=(
                    (-0.25 * np.pi, 0.25 * np.pi),
                    (-np.pi, np.pi)
                ),
            )
            theta, phi = result['x']
            c, s = cs(theta, phi)

            # 6
            R = Rijcs(i, j, c, s)

            # 7
            Q = Q @ R
            A = np.conj(R).T @ A @ R
            B = np.conj(R).T @ B @ R

As you can observe from the print statement, the "distance" of A and B from diagonal form does not really converge. Instead, the values printed range from 0.5 up to 3 and oscillate up and down. Is there a bug in this code and if so, where exactly is it?

schade96
  • 119
  • 1
  • 3
  • 7

0 Answers0