
Add Moore-Penrose Inverse #468

Open
Aweptimum opened this issue Mar 23, 2023 · 12 comments

Comments

@Aweptimum
Contributor

Adding the generalized pseudo-inverse is on the docket somewhere, I think (mentioned in #260).

I've actually implemented it in a fork using the SVD method of construction (pretty easy, actually; see the sketch after the list below).

Here are the changes I think it would introduce, but I wanted to check that I'm not missing something before submitting a PR:

  1. Add pseudoInverse property + adder/haser/getter to MatrixCatalog
  2. Add pseudoInverse method to NumericMatrix
  3. Add pseudoInverse test in MatrixOperationsTest that re-uses the regular inverse dataset + tests non-square matrices
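
For reference, the SVD construction itself is short: given A = USVᵀ, the pseudoinverse is A⁺ = VS⁺Uᵀ, where S⁺ is S transposed with each nonzero singular value replaced by its reciprocal. Here is a minimal sketch of what the NumericMatrix method could look like, assuming the decomposition is reachable as $A->svd() with U/S/V accessors and that map()/multiply()/transpose() behave as their names suggest (the tolerance is illustrative):

use MathPHP\LinearAlgebra\NumericMatrix;

// Moore-Penrose pseudoinverse via SVD: A⁺ = V S⁺ Uᵀ.
function pseudoInverse(NumericMatrix $A, float $tolerance = 1e-12): NumericMatrix
{
    $svd = $A->svd();  // A = U S Vᵀ

    // S is rectangular diagonal, so mapping over every element only
    // touches the singular values along its diagonal.
    $sPlus = $svd->S->transpose()->map(
        fn ($s) => abs($s) > $tolerance ? 1 / $s : 0
    );

    return $svd->V->multiply($sPlus)->multiply($svd->U->transpose());
}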

Mainly, I noticed NumericDiagonalMatrix has a special inverse method and was wondering if I should add the following to it:

class NumericDiagonalMatrix
{
    ...
    public function pseudoInverse(): NumericMatrix
    {
        // A diagonal matrix's pseudoinverse is its inverse whenever all
        // diagonal entries are nonzero (i.e. the matrix is invertible).
        return $this->inverse();
    }
    ...
}

For posterity.

But I'm probably missing other things as well.

@Aweptimum
Contributor Author

Strike that: it's not currently working for rectangular matrices, but I may have found a bug in the SVD implementation.
I tested using this matrix: wolfram
However, my pseudoInverse failed.

Examining the matrices, this is the output from MathPHP's SVD decomposition of the above matrix:

// U:
[[1, 0, 0],
[0, 1, 0],
[0, 0, 1]];

// V:
[[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]];

// S:
[[0, 1, 0, 0],
[1, 0, 0, 0],
[0, 0, 1, 0]];

// D:
[0, 0, 1];

But, according to wolfram alpha, it should look like this:

// U:
[[0,0,1],
[0,1,0],
[1,0,0]];

// V:
[[0,1,0,0],
[0,0,1,0],
[1,0,0,0],
[0,0,0,1]];

// S:
[[1,0,0,0],
[0,1,0,0],
[0,0,1,0]];

// D:
[1,1,1];

I haven't delved into SVD in ~5 years, so I might be misunderstanding something. But to me, it seems the current SVD implementation might not work for rectangular matrices.
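
For what it's worth, both factorizations reconstruct M when multiplied back together; the difference is that only wolfram's S is diagonal. A quick check of the wolfram factors, assuming MatrixFactory::create() and isEqual():

use MathPHP\LinearAlgebra\MatrixFactory;

$M = MatrixFactory::create([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0]]);

// MathPHP's factors give U = I₃ and V = I₄, so USVᵀ = S = M: the product
// reconstructs M, but only because S itself is a (non-diagonal) copy of M.
$U = MatrixFactory::create([[0, 0, 1], [0, 1, 0], [1, 0, 0]]);
$S = MatrixFactory::create([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]);
$V = MatrixFactory::create([[0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1]]);

var_dump($U->multiply($S)->multiply($V->transpose())->isEqual($M)); // bool(true)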

@markrogoyski
Owner

Hi @Aweptimum,

Thanks for your interest in MathPHP and considering adding additional matrix functionality.

For the SVD, if I recall correctly, the only thing that matters is S: it needs to be a rectangular diagonal matrix with non-increasing values. I think there are multiple valid solutions for the other matrices, and they will work out to the right answer as long as S is correct, if I'm not mistaken.

I put in the matrix into the SVDTest.php unit test like this:

            [
                [
                    [0, 1, 0, 0],
                    [1, 0, 0, 0],
                    [0, 0, 1, 0],
                ],
                [
                    'S' => [
                        [1, 0, 0],
                        [0, 1, 0],
                        [0, 0, 1],
                    ],
                ],
            ],

And the tests did in fact fail, both on S matching the expected diagonal matrix and on the property of S being diagonal.

@Beakerboy Any thoughts or context you can provide here?

For reference, here is this SVD computed in R and SciPy.

R:

> A = rbind(c(0, 1, 0, 0), c(1, 0, 0, 0), c(0, 0, 1, 0))
> A
     [,1] [,2] [,3] [,4]
[1,]    0    1    0    0
[2,]    1    0    0    0
[3,]    0    0    1    0
> 
> svd(A)
$d
[1] 1 1 1

$u
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1

$v
     [,1] [,2] [,3]
[1,]    0    1    0
[2,]    1    0    0
[3,]    0    0    1
[4,]    0    0    0

Python:

> import scipy.linalg
>
> A = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0]]
>
> U, s, Vh = scipy.linalg.svd(A)
>
> U
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
>
> s
array([1., 1., 1.])
>
> Vh
array([[ 0.,  1., -0.,  0.],
       [ 1.,  0., -0.,  0.],
       [ 0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.]])

@Aweptimum
Contributor Author

Aweptimum commented Mar 23, 2023

Thanks for the tests, Mark! That's good to know.

I wonder why wolfram's U is flipped compared to R and Python though 🤔

@Beakerboy
Contributor

It’s been a long time since I coded that, and I remember having a lot of problems due to the fact that multiple solutions exist. I think one of the big differences is that columns can be swapped, so I may have sorted them to ensure tests would be predictable. In this case it looks like everything is orthogonal unit vectors, so this might be a weird edge case.

@Aweptimum
Contributor Author

Aweptimum commented Mar 24, 2023

I did repeat my pseudoinverse test by changing one element to a 1:

[
    [0, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
];

And it returned the correct result:

[
    [-1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [0, 0, 0],
];
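
One way to sanity-check a result like this is via the four Penrose conditions, which characterize the pseudoinverse uniquely. A quick sketch, assuming MatrixFactory::create(), isEqual(), and isSymmetric():

use MathPHP\LinearAlgebra\MatrixFactory;

$A     = MatrixFactory::create([[0, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 0]]);
$Aplus = MatrixFactory::create([[-1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 0, 0]]);

var_dump($A->multiply($Aplus)->multiply($A)->isEqual($A));          // AA⁺A = A
var_dump($Aplus->multiply($A)->multiply($Aplus)->isEqual($Aplus));  // A⁺AA⁺ = A⁺
var_dump($A->multiply($Aplus)->isSymmetric());                      // (AA⁺)ᵀ = AA⁺
var_dump($Aplus->multiply($A)->isSymmetric());                      // (A⁺A)ᵀ = A⁺A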

Do you guys feel like the SVD implementation should be changed/fixed? I've spent some time trying to understand how the three matrices are calculated. V and U seem pretty straightforward, but I can't find a definitive method for S, so I'm afraid I'm not much help.

If not, should I just go ahead with Moore-Penrose and avoid using orthogonal matrices in the tests?

@Beakerboy
Contributor

If SVD is wrong, it has to be fixed. The problem is likely with the eigenvector calculation. The eigenvectors are the things that can be in any order and had to be forcibly organized. SVD also imposes a non-negative constraint on the result. The calculation of S is straightforward (assuming its inputs are correct). I’m putting my money on one of U and V being wrong due to the complexity of eigenvector calculations.

@Aweptimum
Contributor Author

You seem to be right, but it's hard to tell since R, Python, and Wolfram have different ideas on how to represent their output - none of the V matrices are identical. It seems Python returns the Hermitian of V(?), and R's V matrix is rectangular. U is a 3x3 identity in 3 of the 4 cases (Wolfram's is reflected).

I just tried testing with the problem matrix's MᵀM, which is:

[
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]

And then asked wolfram for its eigenvectors + values.

v1 = [0, 0, 1, 0]
v2 = [0, 1, 0, 0]
v3 = [1, 0, 0, 0]
v4 = [0, 0, 0, 1]

The values are (λ1, λ2, λ3, λ4) = (1, 1, 1, 0).
I then added the matrix and the outputs to the data sets for the eigenvalue + eigenvector tests.
The eigenvalue test got it right (using the Jacobi method), but the eigenvector test did not; it returned an identity matrix instead of this:

[
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]

So it does seem to be a problem with the eigenvectors method; if I'm understanding correctly, that result should match the V matrix wolfram spat out, and it doesn't.
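
That said, the identity columns MathPHP returns are still valid eigenvectors on their own: λ = 1 is repeated three times, so any basis of its 3-dimensional eigenspace satisfies Av = v. For example, assuming vectorMultiply():

use MathPHP\LinearAlgebra\MatrixFactory;
use MathPHP\LinearAlgebra\Vector;

$MtM = MatrixFactory::create([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]);

// wolfram's v1 and MathPHP's first identity column both come back unchanged:
$v_wolfram = new Vector([0, 0, 1, 0]);
$v_mathphp = new Vector([1, 0, 0, 0]);
var_dump($MtM->vectorMultiply($v_wolfram)->getVector()); // [0, 0, 1, 0]
var_dump($MtM->vectorMultiply($v_mathphp)->getVector()); // [1, 0, 0, 0]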

@Beakerboy
Contributor

If you take an identity matrix and swap rows or columns… that has a special name, right? I wonder if the eigenvector method is swapping columns in one case but not the other, such that when the result is used to calculate S, we wind up with something non-diagonal… this is purely off the top of my head. I don’t have time to SSH into my cloud server and run tests.

@Aweptimum
Contributor Author

Aweptimum commented Apr 3, 2023

I think it's a Permutation Matrix.
If you take a matrix A and a permutation matrix P, the product PA will rearrange the rows of A, while AP will rearrange the columns. Which sounds a bit related.
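
A tiny example of both products, assuming MatrixFactory::create():

use MathPHP\LinearAlgebra\MatrixFactory;

// P is the identity with its first two rows swapped
$P = MatrixFactory::create([[0, 1, 0], [1, 0, 0], [0, 0, 1]]);
$A = MatrixFactory::create([[1, 2, 3], [4, 5, 6], [7, 8, 9]]);

$PA = $P->multiply($A); // [[4, 5, 6], [1, 2, 3], [7, 8, 9]] (rows 1 and 2 swapped)
$AP = $A->multiply($P); // [[2, 1, 3], [5, 4, 6], [8, 7, 9]] (columns 1 and 2 swapped)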

If you want me to try testing for you, just tell me what to do and I'll try my best.

@Beakerboy
Contributor

Beakerboy commented Apr 3, 2023

What I was getting at is: when I spent 5 minutes refreshing myself with the code, I saw that there were multiple calls to the eigenvector function. These functions swap columns on occasion. I was thinking that if one call included a swap and one did not, multiplying them together may produce a permutation matrix instead of an identity matrix. The diagnostic process would be to look at each calculation step and see where the failure occurs. If the eigenvectors are correct, why does multiplying them not produce the expected output? Is it a problem with the eigenvectors, the multiplication, or the general principle that multiplying them is the correct procedure?

@Aweptimum
Contributor Author

After messing with the eigenvector + SVD code, I'm thinking the failure is just a result of the ordering of the eigenvectors. In the case of repeated eigenvalues, there is no determinate order for the eigenvectors in U and V.

The eigenvalues of MᵀM are (1, 1, 1, 0), so the eigenvector corresponding to 0 will always be the last column, while the other 3 can go in more or less any order in the V matrix. The same is true of MMᵀ for U, which has eigenvalues of (1, 1, 1). So in MathPHP, U and V turn out to be identity matrices because that's just the order in which the eigenvector solutions are returned. S then has to be non-diagonal to satisfy A = USVᵀ.

However, it seems completely legal to shuffle U and/or V such that S becomes diagonal, as long as you keep the singular values of S aligned with the corresponding columns of U and V. I'm not quite sure how to go about that yet, but all that's to say: I don't think there's anything really wrong with the eigenvector implementation. I did have to create a "factory" helper (Eigenvalues::eigenvalues($method)) that picks the proper eigenvalue solution method, though. The current eigenvector method defaults to solving the eigenvalues using the closedFormPolynomialRootMethod, which did error in my case.

Hopefully I can figure out how to deal with U and V this week.

@Aweptimum
Contributor Author

Aweptimum commented Apr 26, 2023

I got an answer!

If S is non-diagonal, it needs to be multiplied by a permutation matrix P to make it diagonal. Pre-multiplying (PA) swaps rows; post-multiplying (AP) swaps columns.

If the permutation swaps the rows of S (PS), then the columns of U need to be swapped to compensate, so we post-multiply U with P⁻¹:
M = (UP⁻¹)(PS)Vᵀ
Then we have:
U′ = UP⁻¹, S′ = PS, and V′ = V

If the permutation swaps the columns of S (SP), then the rows of Vᵀ need to be swapped, so we pre-multiply Vᵀ with P⁻¹:
M = U(SP)(P⁻¹Vᵀ)
Then we have:
U′ = U, S′ = SP, and (Vᵀ)′ = P⁻¹Vᵀ

Either one is valid, and it all works out nicely. I'll work on adding some logic in the SVD method to permute S row-wise (so U′ = UP⁻¹) such that S becomes diagonal while preserving the descending order of the singular values.
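
Roughly what I have in mind, as a hypothetical sketch (it assumes each row of S has at most one nonzero entry and leaves out the descending-order bookkeeping):

use MathPHP\LinearAlgebra\MatrixFactory;
use MathPHP\LinearAlgebra\NumericMatrix;

// Build a permutation matrix P such that PS is diagonal, and return
// [U′, S′] = [UP⁻¹, PS]; V is untouched.
function permuteToDiagonal(NumericMatrix $U, NumericMatrix $S): array
{
    $s    = $S->getMatrix();
    $m    = $S->getM();
    $dest = []; // $dest[$i] = row of PS where row $i of S should land

    foreach ($s as $i => $row) {
        foreach ($row as $j => $value) {
            if ($value != 0 && $j < $m) {
                $dest[$i] = $j; // a nonzero in column $j belongs in row $j
                break;
            }
        }
    }

    // All-zero rows take whatever slots remain.
    $free = array_values(array_diff(range(0, $m - 1), $dest));
    foreach ($s as $i => $row) {
        if (!isset($dest[$i])) {
            $dest[$i] = array_shift($free);
        }
    }

    // P moves row $i to row $dest[$i], i.e. P[$dest[$i]][$i] = 1.
    $p = array_fill(0, $m, array_fill(0, $m, 0));
    foreach ($dest as $i => $j) {
        $p[$j][$i] = 1;
    }
    $P = MatrixFactory::create($p);

    // P is orthogonal, so P⁻¹ = Pᵀ.
    return [$U->multiply($P->transpose()), $P->multiply($S)];
}

On the example above this yields U′ = Pᵀ and a diagonal S′ = PS, and U′S′Vᵀ still reconstructs M.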
