Regression for Complex Numbers #9053

Closed
nplass opened this issue Nov 3, 2023 · 5 comments

Comments

@nplass

nplass commented Nov 3, 2023

Support for complex numbers for linear regression

In vibration engineering, complex numbers are a big deal, since vibration at a given frequency can be represented by them.
Moreover, linear regression is often needed to fit a vibration-system model to observed excitation and response. This is used especially for field balancing.

However, statsmodels refuses to work with complex numbers. They get cast/coerced to float numbers (real part only) at several locations in the code, which leads to invalid results.
Mathematically, however, using C (complex) instead of R (real) would not make any difference to the formulas of linear regression, at least as long as a proper metric / abs() is defined. Thus, changing statsmodels to also work with complex numbers would probably require only minor changes.

As a side note: I am well aware that any complex number can be represented by two real numbers. However, transforming a complex regression problem into a real/float regression problem of double dimension does NOT yield the same solution. In short, the fitting matrix then has twice as many degrees of freedom/parameters: a one-dimensional complex fit has only TWO parameters (real and imaginary part of the linear factor), while the transformed 2x2 real-domain problem has FOUR (the 4 real values of a 2x2 matrix). So complex problems cannot simply be converted to regression in real space.
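To illustrate the parameter-count argument, here is a minimal sketch in plain Python. The function names `fit_complex` and `fit_real_2x2` and the data are made up for this illustration and are not part of statsmodels. The data deliberately contain a term in `conj(x)`, which a single complex factor cannot represent but an unconstrained 2x2 real matrix can.

```python
def fit_complex(xs, ys):
    """Least-squares complex factor b minimizing sum |y - b*x|^2 (2 real parameters)."""
    num = sum(x.conjugate() * y for x, y in zip(xs, ys))
    den = sum(abs(x) ** 2 for x in xs)
    return num / den


def fit_real_2x2(xs, ys):
    """Unconstrained least-squares 2x2 real matrix M (4 real parameters)
    with [Re y, Im y] ~ M @ [Re x, Im x]."""
    sxx = [[0.0, 0.0], [0.0, 0.0]]  # sum of x x^T
    syx = [[0.0, 0.0], [0.0, 0.0]]  # sum of y x^T
    for x, y in zip(xs, ys):
        xv = (x.real, x.imag)
        yv = (y.real, y.imag)
        for i in range(2):
            for j in range(2):
                sxx[i][j] += xv[i] * xv[j]
                syx[i][j] += yv[i] * xv[j]
    det = sxx[0][0] * sxx[1][1] - sxx[0][1] * sxx[1][0]
    inv = [[sxx[1][1] / det, -sxx[0][1] / det],
           [-sxx[1][0] / det, sxx[0][0] / det]]
    # M = (sum y x^T) (sum x x^T)^-1
    return [[sum(syx[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Illustrative data: y = b*x + d*conj(x) with b = 2+1j, d = 0.5
xs = [1 + 0j, 0 + 1j, 1 + 1j, 2 - 1j]
ys = [(2 + 1j) * x + 0.5 * x.conjugate() for x in xs]

b_hat = fit_complex(xs, ys)
m = fit_real_2x2(xs, ys)

# A complex factor a + ib acts on (Re x, Im x) as [[a, -b], [b, a]].
# The unconstrained real fit is not restricted to that pattern.
print("complex fit:", b_hat)
print("real 2x2 fit:", m)
```

The real fit reproduces these data exactly with a matrix whose diagonal entries differ, so it is not the real representation of any single complex factor. The complex fit is restricted to the two-parameter form and returns a different, compromise solution.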

Finally, I would ask for a feature that lets statsmodels also work seamlessly with standard Python complex numbers.

@josef-pkt
Member

duplicate of #3528

It's not easy. The main work is to figure out how all the statistics are defined for the complex number case.
E.g. even simple statistics like variance and root mean squared error might need options (abs or no abs).
See comments around scipy/scipy#16189 (comment)
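One possible reading of the "abs or no abs" choice, sketched in plain Python: with `abs` the second central moment is real and non-negative (usually called the variance of a complex variable), without it the result is itself complex (often called the pseudo-variance). The function names and data below are illustrative only and do not reflect an existing statsmodels API.

```python
def variance_abs(zs):
    """Mean of |z - mean|^2: real and non-negative."""
    m = sum(zs) / len(zs)
    return sum(abs(z - m) ** 2 for z in zs) / len(zs)


def variance_no_abs(zs):
    """Mean of (z - mean)^2: complex in general (pseudo-variance)."""
    m = sum(zs) / len(zs)
    return sum((z - m) * (z - m) for z in zs) / len(zs)


zs = [1 + 2j, -1 - 2j]  # illustrative sample with mean 0

v_abs = variance_abs(zs)
v_no_abs = variance_no_abs(zs)
print(v_abs, v_no_abs)
```

For this sample the two definitions disagree even in kind: one is a positive real number, the other a complex number with negative real part.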

Internally in statsmodels many models that require nonlinear optimization are defined for complex numbers. This is implemented to support complex step derivatives.
However, functions like abs destroy the behavior that is needed for complex step derivatives, i.e. the math operations for complex numbers are defined differently from what complex valued statistics might need.
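A minimal sketch of why `abs` and complex step derivatives conflict, in plain Python. The helper `complex_step_derivative` is a hypothetical name for this illustration, not the statsmodels implementation. The method approximates f'(x) by Im(f(x + ih)) / h, which only works if f propagates the imaginary part analytically.

```python
def complex_step_derivative(f, x, h=1e-20):
    """Approximate f'(x) for real x as Im(f(x + i*h)) / h."""
    return f(complex(x, h)).imag / h


def cube(z):
    # Analytic in z: the imaginary perturbation carries through.
    return z * z * z


def cube_of_abs(z):
    # abs() returns a real float, so the imaginary perturbation is lost.
    return abs(z) ** 3


d_analytic = complex_step_derivative(cube, 2.0)
d_with_abs = complex_step_derivative(cube_of_abs, 2.0)
print(d_analytic, d_with_abs)
```

Both functions agree on the positive real axis and have derivative 12 at x = 2, but only the analytic version yields that value. The version with `abs` returns zero, because the modulus used for complex-valued statistics is not what complex step differentiation needs.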

I would review and merge a PR for, for example, an OLSComplex model, but I don't have the time to figure this out on my own and go through the literature in detail, such as the references in #3528.

Do you know other packages that implement regression for complex variables that could be used for unit tests?

@nplass
Author

nplass commented Nov 4, 2023

Dear Josef,

OK, reading your post, I now understand why it is not that easy. I have to admit that I am only a user of these great modules, so I do not understand the internal problems too well.

A reduced, simplified model such as "OLSComplex" sounds good to me.

However, I do not have ready-made unit tests at hand... As a starting point, though: all unit tests for real-number models should pretty much still work if every dependent and independent variable is multiplied by a fixed complex number.
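The suggested test idea can be sketched for the simplest case, a single slope without intercept, in plain Python. The helper `fit_slope`, the data, and the constant `c` are made up for illustration. If y is regressed on x, then regressing c*y on c*x for a fixed non-zero complex c should give the same slope.

```python
def fit_slope(xs, ys):
    """Least-squares slope b minimizing sum |y - b*x|^2; works for real or complex data."""
    num = sum(complex(x).conjugate() * y for x, y in zip(xs, ys))
    den = sum(abs(x) ** 2 for x in xs)
    return num / den


# Illustrative real-valued data
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

c = 0.6 + 0.8j  # fixed complex factor (hypothetical choice)
xs_c = [c * x for x in xs]
ys_c = [c * y for y in ys]

b_real = fit_slope(xs, ys)
b_complex = fit_slope(xs_c, ys_c)
print(b_real, b_complex)
```

This only checks the parameter estimate. Whether standard errors and other statistics should be invariant under such a rotation is exactly the kind of definition question raised earlier in the thread.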

@nplass
Author

nplass commented Nov 20, 2023

I was just surprised to see that at least the OLS model works fine for simple regression on complex numbers, as long as you don't call results.summary. (Only when "summary" is called do lots of errors due to complex->real coercion appear.)
For me, this does the trick. Admins/experts: feel free to close the issue, at least as far as I am concerned.
Thanks.

@josef-pkt
Member

Parameter estimation works, mainly because the linear algebra automatically handles it.

But the other results, like standard errors and inferential statistics such as p-values, are not appropriate.
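As a rough illustration of why the point estimates come out right, here is a plain-Python sketch of least squares with an intercept, solved through the normal equations with a conjugate transpose (X^H X beta = X^H y). The function `ols_complex` and the data are hypothetical. It mirrors the underlying algebra only, not how statsmodels computes its fit, and it makes no attempt at standard errors or p-values.

```python
def ols_complex(xs, ys):
    """Fit y ~ a + b*x over complex a, b via X^H X beta = X^H y, with X = [1, x]."""
    n = len(xs)
    s_x = sum(xs)                                   # sum of x
    s_xx = sum(abs(x) ** 2 for x in xs)             # sum of conj(x) * x
    s_y = sum(ys)                                   # first entry of X^H y
    s_xy = sum(x.conjugate() * y for x, y in zip(xs, ys))  # second entry of X^H y
    # X^H X = [[n, s_x], [conj(s_x), s_xx]]; its determinant is real
    det = n * s_xx - abs(s_x) ** 2
    a = (s_xx * s_y - s_x * s_xy) / det
    b = (n * s_xy - s_x.conjugate() * s_y) / det
    return a, b


# Illustrative noise-free data generated from a = 1-2j, b = 0.5+3j
xs = [1 + 1j, 2 - 1j, -1 + 2j, 0.5j]
ys = [(1 - 2j) + (0.5 + 3j) * x for x in xs]

a_hat, b_hat = ols_complex(xs, ys)
print(a_hat, b_hat)
```

The only change from the real case is the conjugate in the transpose, which complex-aware linear algebra routines apply on their own.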

@josef-pkt
Member

josef-pkt commented Feb 9, 2024

I'm closing this as duplicate.
The main open issue remains #3528, now joined by #9064 for more general statistics for complex-valued random variables,
and the preliminary PR #9094
(after I did quite a bit of reading).
