Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change docstring for psd_wrap to be clear about undefined behavior #2362

Open
jhk23 opened this issue Feb 26, 2024 · 5 comments
Open

Change docstring for psd_wrap to be clear about undefined behavior #2362

jhk23 opened this issue Feb 26, 2024 · 5 comments
Labels
documentation help wanted We would appreciate contributions for this issue! A CVXPY member will provide guidance.

Comments

@jhk23
Copy link

jhk23 commented Feb 26, 2024

Maintainer note. Skip to #2362 (comment) to see the conclusion we came to about this GitHub Issue.

Original report below.


Describe the bug
A small optimization problem is solved with 'optimal' status while a constraint for quadratic form of matrix M is violated massively: its value at the optimum is 14 while its upper bound is 4. This is obviously above any reasonable feasibility tolerance. This behaviour is observed with several optimizers: ECOS, SCS and CLARABEL, and others have not been tested.

The matrix M is NOT positive semidefinite, which violates a convexity condition for constraints in convex optimization. I thought that psd_wrap() handles this case by making sure the solution does not imply a negative quadratic form of M. However, it seems to simply suppress the error message 'Your problem does not follow DCP rules'. Importantly, this suppression should not lead to solutions that breach constraints.

To Reproduce
Find the data attached.
data.csv
matrix.csv

###############
### Imports ###
###############
import pandas, cvxpy
from cvxpy.atoms.affine.wraps import psd_wrap


############
### Data ###
############
data = pandas.read_csv('data.csv', index_col=0)
matrix = pandas.read_csv('matrix.csv', index_col=0)
x = data['x'].values
n = len(x)
s = data['s'].values
M = matrix.values


#####################
### Optimization ####
#####################

# CVXPY variables
y_cp = cvxpy.Variable(shape=n)
y_tilde_cp = y_cp - x
total_y_cp = y_cp.T @ s
M_cp = psd_wrap(M) # M is not PSD, but assume it is PSD
M_qf_cp = cvxpy.quad_form(y_tilde_cp, M_cp)
M_qf_scaled_cp = 10_000 * M_qf_cp # rescale so that the violation of upper bound (0.0004) should trigger at any reasonable feastol 

# CVXPY constraints
constraints = [
    cvxpy.sum(y_cp) == 1,
    y_cp >= 0,
    y_cp <= 1,
    M_qf_scaled_cp <= 4
]

# CVXPY problem
p = cvxpy.Problem(
    objective=cvxpy.Maximize(total_y_cp),
    constraints=constraints
)
info = {}
for solver in ['ECOS', 'SCS', 'CLARABEL']:
    _ = p.solve(solver=solver, verbose=False)

    # Solution info
    info[solver] = {
        'Violation of the constraint of QF of M (scaled)': constraints[-1].residual,
        'Quadratic form of M (scaled)': M_qf_scaled_cp.value,
        'Objective value': total_y_cp.value,
        'Soluton status': p.status
    }

print(
    pandas.DataFrame(info)
)

Expected behavior
If an optimal solution is found, violation of any constraint, including the constraint for the quadratic form, must be below feastol. If a solver cannot reach an optimal solution because a matrix is not PSD, the failure message should be rendered, stating that a non-PSD matrix is the reason for the optimization failure.

Output

UserWarning: Forming a nonconvex expression quad_form(x, indefinite).
  warnings.warn("Forming a nonconvex expression quad_form(x, indefinite).")
                                                     ECOS        SCS   CLARABEL
Violation of the constraint of QF of M (scaled)  10.14652  10.164375  10.146525
Quadratic form of M (scaled)                     14.14652  14.164375  14.146525
Objective value                                  9.988587   9.988587   9.988587
Soluton status                                    optimal    optimal    optimal

Version

  • OS: Windows 10
  • CVXPY Version: 1.4.0

Additional context
If I adjust the matrix M slightly to make it PSD, the solver finds the solution that does not violate the constraint for the quadratic form of the original M.

@phschiele
Copy link
Collaborator

Hi @jhk23! Thanks for the write up and for making the example reproducible!

As documented, the psd_wrap() simply

Asserts that a square matrix is PSD.

This might be useful in several cases:

  1. You already know your matrix is PSD (either by construction, e.g., a covariance matrix, or because you checked before). The PSD check can be expensive, so in these cases it may be worth skipping the check via psd_wrap.
  2. You know the solver of your choice can deal with small negative eigenvalues that are greater in magnitude than the CVXPY tolerance.

If you pass a matrix that is in fact not PSD, you get undefined behaviour. The solver might be able to handle small violations, but if the matrix is far from PSD, basically anything goes.

Notice that you event got a warning:

UserWarning: Forming a nonconvex expression quad_form(x, indefinite).
  warnings.warn("Forming a nonconvex expression quad_form(x, indefinite).")

@jhk23
Copy link
Author

jhk23 commented Feb 26, 2024

Hi @phschiele, thank you for a prompt response!

  1. Given that psd_wrap skips the PSD check, I believe the function's docstring is misleading. I think it should state "Skips PSD assertion", do you agree?
  2. Regarding your point 2, are you saying that if solver tolerance is epsilon, and the smallest negative eigenvalue is smaller than the epsilon in absolute terms, a solver should handle the problem?
  3. Could you please advice on the best way to quantify how far a matrix is from being PSD? I am thinking of a minimum eigenvalue or an average of negative eigenvalues, but I do not have big expertise in this topic, so any advice would be appreciated.
  4. It looks like your approach here is "garbage in, garbage out", but I wonder if it makes sense to overwrite solution status when a violation is so huge? Alternatively, warn about potential issues that may arise when using non-PSD matrices. I do not think that the current warning is sufficient, it just states what can already be known to an average user.

Thanks!

@phschiele
Copy link
Collaborator

phschiele commented Feb 26, 2024

Hi @phschiele, thank you for a prompt response!

  1. Given that psd_wrap skips the PSD check, I believe the function's docstring is misleading. I think it should state "Skips PSD assertion", do you agree?

I agree that "asserts" could be interpreted as "this atom asserts that the provided matrix is PSD". We have a keyword argument to quad_form called assume_PSD which I think is clearer.
We could change the docstring to
"Assumes that a square matrix is PSD to skip validations. Passing non-PSD matrices is undefined behaviour."

  1. Regarding your point 2, are you saying that if solver tolerance is epsilon, and the smallest negative eigenvalue is smaller than the epsilon in absolute terms, a solver should handle the problem?

Yes, should, but the larger the violation, the higher the chances that it won't work.
For example, see the description of the behaviour in the OSQP docs.

  1. Could you please advice on the best way to quantify how far a matrix is from being PSD? I am thinking of a minimum eigenvalue or an average of negative eigenvalues, but I do not have big expertise in this topic, so any advice would be appreciated.

This is how we compute the residual in CVXPY:

def residual(self) -> Optional[np.ndarray]:
"""
For each cone, returns:
||(t,X) - proj(t,X)||
with
proj(t,X) = (t,X) if t >= ||x||
0.5*(t/||x|| + 1)(||x||,x) if -||x|| < t < ||x||
0 if t <= -||x||
References:
https://docs.mosek.com/modeling-cookbook/practical.html#distance-to-a-cone
https://math.stackexchange.com/questions/2509986/projection-onto-the-second-order-cone
"""

  1. It looks like your approach here is "garbage in, garbage out", but I wonder if it makes sense to overwrite solution status when a violation is so huge? Alternatively, warn about potential issues that may arise when using non-PSD matrices. I do not think that the current warning is sufficient, it just states what can already be known to an average user.

The atom presents a promise by the user. If it is not upheld, violations may follow from this. So I would say if someone actively decides to skip validations, then it is also implied that the returned solution must be treated more carefully. I do think the suggested docstring makes this more clear.

Thanks!

@jhk23
Copy link
Author

jhk23 commented Feb 26, 2024

Thanks for your answers.

  1. Could you please advice on the best way to quantify how far a matrix is from being PSD? I am thinking of a minimum eigenvalue or an average of negative eigenvalues, but I do not have big expertise in this topic, so any advice would be appreciated.

This is how we compute the residual in CVXPY:

def residual(self) -> Optional[np.ndarray]:
"""
For each cone, returns:
||(t,X) - proj(t,X)||
with
proj(t,X) = (t,X) if t >= ||x||
0.5*(t/||x|| + 1)(||x||,x) if -||x|| < t < ||x||
0 if t <= -||x||
References:
https://docs.mosek.com/modeling-cookbook/practical.html#distance-to-a-cone
https://math.stackexchange.com/questions/2509986/projection-onto-the-second-order-cone
"""

I meant to ask if you know SOTA methods to quantify how far a matrix is from being PSD, so that I can have prior understanding of how likely the convex solver is to fail.

Also, are you aware of any convex optimization solver that requires matrices to be PSD only in the feasible region, not globally?

@rileyjmurray
Copy link
Collaborator

rileyjmurray commented Apr 1, 2024

Action items

We should change the docstring of psd_wrap to

"Assumes that a square matrix is PSD to skip validations. Passing non-PSD matrices is undefined behavior."

Other _wrap functions should probably have similar changes to their docstrings.

Answering remaining questions from @jhk23

The matter of quantifying the distance from a given matrix to the set of PSD matrices is very well understood. In full generality (i.e., independent of the metric you used for your vector space) it could be cast as a convex optimization problem.

In practical situations (when the metric is induced by the spectral norm, Frobenius norm, or other "unitarily invariant" norm) the distance can be computed by finding an eigendecomposition. The following code shows how that can be done.

import cvxpy as cp
import scipy.linalg as la

def split_into_psd_nsd(M : np.ndarray, TOL: float = 1e-14):
    """
    Given a real-symmetric or complex-Hermitian matrix M, compute a positive semidefinite
    matrix M_pos and a negative semidefinite matrix M_neg so that M = M_pos + M_neg.

    If "norm(...)" computes a unitarily invariant norm of a matrix, then the distance from M
    to the set of PSD matrices is dist = norm(M_neg).
    """
    n = M.shape[0]
    if la.norm( M - M.T.conj(), 'fro' ) > TOL * np.sqrt(n):
        raise ValueError('Input matrix must be real-symmetric or Hermitian.')

    eigvals, eigvecs = la.eigh(M)

    pos = eigvals >= 0
    num_pos = np.count_nonzero(pos)
    if num_pos == n:
        return M, np.zeros((n, n))

    neg = eigvals < 0
    num_neg = np.count_nonzero(neg)
    if num_neg == n:
        return np.zeros((n, n)), M
 
    psd_eigvecs = eigvecs[:, pos].reshape((n, num_pos))
    nsd_eigvecs = eigvecs[:, neg].reshape((n, num_neg))

    psd_part = psd_eigvecs @ np.diag(eigvals[pos]) @ psd_eigvecs.T.conj()
    nsd_part = nsd_eigvecs @ np.diag(eigvals[neg]) @ nsd_eigvecs.T.conj()
    
    return psd_part, nsd_part

As for your question:

Also, are you aware of any convex optimization solver that requires matrices to be PSD only in the feasible region, not globally?

The short answer is no. The longer answer is that it depends on what you mean to only be PSD in a feasible region.

The usual definition of being PSD is that a matrix $M$ is real-symmetric (or Hermitian) and satisfies $x^T M x \geq 0 \forall x \in \mathbb{R}^n$. We can extend this definition to talk about a matrix being PSD on a linear subspace $L$ by asking that $x^T M x \geq 0 \forall x \in L$. Letting $P$ denote the orthogonal projector onto $L$, this is just asking if $P^T M P$ is PSD.

However, if you talk about being PSD over sets other than linear subspaces then things get very nasty. For example, suppose we want to talk about being PSD over the nonnegative orthant $S_n = \{ x \in \mathbb{R}^n, ~ x \geq 0\}$. The set $C_n = \{ M \in \mathbb{R}^{n \times n} : M = M^T , ~ x^T M x \geq 0 \forall x \in S_n \}$ is called the cone of copositive matrices of order $n$. It turns out that checking if a matrix belongs to $C_n$ is an NP-hard problem. So there is basically no hope of having a convex optimization algorithm that would only require a matrix to be PSD on the nonnegative orthant.

@rileyjmurray rileyjmurray changed the title Massive constraint violation when a matrix is not PSD Change docstring for psd_wrap function to be clear about undefined behavior Apr 1, 2024
@rileyjmurray rileyjmurray changed the title Change docstring for psd_wrap function to be clear about undefined behavior Change docstring for psd_wrap to be clear about undefined behavior Apr 1, 2024
@rileyjmurray rileyjmurray added documentation help wanted We would appreciate contributions for this issue! A CVXPY member will provide guidance. labels Apr 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation help wanted We would appreciate contributions for this issue! A CVXPY member will provide guidance.
Projects
None yet
Development

No branches or pull requests

3 participants