New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(polys): fix multi factorisation over QQ<a> #26515
Conversation
✅ Hi, I am the SymPy bot. I'm here to help you write a release notes entry. Please read the guide on how to write release notes. Your release notes are in good order. Here is what the release notes will look like:
This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.13. Click here to see the pull request description that was parsed.
Update The release notes on the wiki have been updated. |
6d43c6a
to
6f7f0ec
Compare
The two doctests that are failing are fine. They can just be updated. The output of See the Trager 1976 reference noted in the docstring for explanation of what this function is doing. |
Benchmark results from GitHub Actions Lower numbers are good, higher numbers are bad. A ratio less than 1 Significantly changed benchmark results (PR vs master) Significantly changed benchmark results (master vs previous release) | Change | Before [2487dbb5] | After [a73da947] | Ratio | Benchmark (Parameter) |
|----------|----------------------|---------------------|---------|----------------------------------------------------------------------|
| - | 68.4±0.6ms | 44.4±0.4ms | 0.65 | integrate.TimeIntegrationRisch02.time_doit(10) |
| - | 68.4±0.5ms | 44.0±0.6ms | 0.64 | integrate.TimeIntegrationRisch02.time_doit_risch(10) |
| + | 18.6±0.1μs | 30.0±0.3μs | 1.61 | integrate.TimeIntegrationRisch03.time_doit(1) |
| - | 5.39±0.06ms | 2.85±0.01ms | 0.53 | logic.LogicSuite.time_load_file |
| - | 74.4±0.4ms | 29.5±0.3ms | 0.4 | polys.TimeGCD_GaussInt.time_op(1, 'dense') |
| - | 26.0±0.1ms | 17.0±0.03ms | 0.66 | polys.TimeGCD_GaussInt.time_op(1, 'expr') |
| - | 74.6±0.2ms | 28.9±0.1ms | 0.39 | polys.TimeGCD_GaussInt.time_op(1, 'sparse') |
| - | 257±2ms | 126±0.5ms | 0.49 | polys.TimeGCD_GaussInt.time_op(2, 'dense') |
| - | 260±3ms | 126±0.9ms | 0.49 | polys.TimeGCD_GaussInt.time_op(2, 'sparse') |
| - | 669±6ms | 373±1ms | 0.56 | polys.TimeGCD_GaussInt.time_op(3, 'dense') |
| - | 668±6ms | 374±2ms | 0.56 | polys.TimeGCD_GaussInt.time_op(3, 'sparse') |
| - | 493±2μs | 287±2μs | 0.58 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(1, 'dense') |
| - | 1.79±0.01ms | 1.05±0ms | 0.59 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(2, 'dense') |
| - | 5.82±0.02ms | 3.09±0.03ms | 0.53 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 451±4μs | 229±4μs | 0.51 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(1, 'dense') |
| - | 1.50±0.02ms | 678±6μs | 0.45 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(2, 'dense') |
| - | 4.91±0.03ms | 1.66±0.01ms | 0.34 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 371±1μs | 213±2μs | 0.58 | polys.TimeGCD_SparseGCDHighDegree.time_op(1, 'dense') |
| - | 2.43±0.02ms | 1.23±0.01ms | 0.5 | polys.TimeGCD_SparseGCDHighDegree.time_op(3, 'dense') |
| - | 9.99±0.06ms | 4.33±0.02ms | 0.43 | polys.TimeGCD_SparseGCDHighDegree.time_op(5, 'dense') |
| - | 361±3μs | 172±1μs | 0.48 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(1, 'dense') |
| - | 2.49±0.01ms | 906±10μs | 0.36 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(3, 'dense') |
| - | 9.56±0.02ms | 2.69±0ms | 0.28 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(5, 'dense') |
| - | 1.05±0.01ms | 432±3μs | 0.41 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 1.73±0.01ms | 554±40μs | 0.32 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'sparse') |
| - | 5.89±0.03ms | 1.81±0ms | 0.31 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'dense') |
| - | 8.42±0.04ms | 1.51±0.01ms | 0.18 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'sparse') |
| - | 285±2μs | 66.0±0.5μs | 0.23 | polys.TimePREM_QuadraticNonMonicGCD.time_op(1, 'sparse') |
| - | 3.57±0.02ms | 395±3μs | 0.11 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 3.99±0.03ms | 281±2μs | 0.07 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'sparse') |
| - | 7.04±0.09ms | 1.28±0.01ms | 0.18 | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'dense') |
| - | 8.60±0.02ms | 845±5μs | 0.1 | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'sparse') |
| - | 5.04±0.05ms | 3.04±0.02ms | 0.6 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(2, 'sparse') |
| - | 12.0±0.03ms | 6.64±0.03ms | 0.55 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'dense') |
| - | 22.2±0.07ms | 9.11±0.09ms | 0.41 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'sparse') |
| - | 5.31±0.01ms | 879±4μs | 0.17 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(1, 'sparse') |
| - | 12.6±0.05ms | 7.13±0.03ms | 0.56 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(2, 'sparse') |
| - | 102±1ms | 26.1±0.1ms | 0.25 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'dense') |
| - | 168±0.8ms | 54.3±0.1ms | 0.32 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'sparse') |
| - | 171±0.5μs | 112±0.5μs | 0.65 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'dense') |
| - | 357±1μs | 220±0.8μs | 0.62 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'sparse') |
| - | 4.24±0.04ms | 849±3μs | 0.2 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'dense') |
| - | 5.28±0.02ms | 385±1μs | 0.07 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'sparse') |
| - | 19.8±0.1ms | 2.79±0.01ms | 0.14 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'dense') |
| - | 22.9±0.4ms | 624±2μs | 0.03 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'sparse') |
| - | 479±2μs | 140±1μs | 0.29 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(1, 'sparse') |
| - | 4.74±0.03ms | 629±1μs | 0.13 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'dense') |
| - | 5.34±0.02ms | 143±0.3μs | 0.03 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'sparse') |
| - | 13.1±0.1ms | 1.30±0ms | 0.1 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'dense') |
| - | 14.3±0.2ms | 146±0.7μs | 0.01 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'sparse') |
| - | 133±0.6μs | 74.5±0.5μs | 0.56 | solve.TimeMatrixOperations.time_rref(3, 0) |
| - | 252±0.7μs | 88.4±0.7μs | 0.35 | solve.TimeMatrixOperations.time_rref(4, 0) |
| - | 24.0±0.05ms | 10.1±0.01ms | 0.42 | solve.TimeSolveLinSys189x49.time_solve_lin_sys |
| - | 28.1±0.2ms | 15.5±0.06ms | 0.55 | solve.TimeSparseSystem.time_linsolve_Aaug(20) |
| - | 54.2±0.1ms | 25.1±0.07ms | 0.46 | solve.TimeSparseSystem.time_linsolve_Aaug(30) |
| - | 28.1±0.4ms | 15.3±0.02ms | 0.55 | solve.TimeSparseSystem.time_linsolve_Ab(20) |
| - | 54.7±0.2ms | 25.0±0.2ms | 0.46 | solve.TimeSparseSystem.time_linsolve_Ab(30) |
Full benchmark results can be found as artifacts in GitHub Actions |
This is needed in general for multivariate polynomials rather than just returning a single integer shift.
This is ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still reviewing, but this looks good so far...
The narrative really helped in understanding all the changes that were made, their motivation, etc.
A couple of edits for the release notes:
"Previously incorrect results were returned in some cases." -->
"Previously, incorrect results were returned in some cases."
"functions are no changed to return a list of" -->
"functions are now changed to return a list of"
Fixed. |
It would be better to use a dedicated algorithm for multivariate | ||
polynomials instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair if you don't have particular candidate algorithms in mind, but if you do have some references, might be good to provide them as a starting point for future developers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I had found a good reference describing a simple algorithm for this then I would have implemented it.
The original Yun76 paper did describe a multivariate algorithm as well:
https://dl.acm.org/doi/pdf/10.1145/800205.806320
However that algorithm is more complicated because it depends on the EZGCD algorithm which is not implemented in sympy. I am not sure that it is in general worth implementing EZGCD because I believe it is superseded by other GCD algorithms.
Yun also refers to previous papers that have multivariate algorithms involving Hensel lifting that I assume are also more complicated.
It seems common in textbooks, theses etc to describe Yun's univariate algorithm but not mention the multivariate case.
More recent papers generally often describe the case of multivariate polynomials in positive characteristic e.g.
On square-free factorization of multivariate over a finite field
Laurent Bernardin 1997
Bernardin compares the algorithm for multivariate polynomials in positive characteristic with Yun's multivariate algorithm in characteristic zero. Bernardin's algorithm is claimed to be simpler and have other advantages. It looks a lot like Yun's univariate algorithm but it is for positive characteristic.
I can't seem to find a simple description of the multivariate characteristic zero case. I think possibly the reason for that is because it is "trivial" somehow like maybe Bernardin's algorithm can be trivially adapted to characteristic zero. It doesn't seem trivial to me though...
Bernardin's algorithm requires computing a
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to keep thinking about this one, but don't have any good ideas right now.
Anyway, this thread can serve as a good starting point for future work.
This looks good to me. |
Thanks for the review! |
References to other Issues or PRs
Fixes gh-26497
Brief description of what is fixed or changed
There are various things here.
dmp_sqf_p
to check properly that multivariate polynomials are square free over the ground domain. This is also fixesPoly.is_sqf
.dmp_sqf_norm
to use shifts in multiple variables._dmp_sqf_norm_shifts
to try to find efficient multivariate shifts.sqf_norm
functions now return a shift list rather than a single shift integer. This is to compute a shift likef(x1-s1*a, x2-s2*a, ...)
.dmp_shift
andPoly.shift_list
for multivariate polynomial shifts.dmp_norm
toPolyRing
.norm
toPolyElement
.The most important part is the fix for gh-26497 which is that
dmp_ext_factor
and hencefactor
overZZ_I
orQQ_I
gave incorrect results for some multivariate polynomials. The bug is caused bydmp_sqf_norm
choosing a polynomial that is not square-free which is in turn caused bydmp_sqf_p
incorrectly reporting some polynomials as square-free.The implementation of
dmp_sqf_p
on master treats a multivariate polynomial as if it were a univariate polynomial in its leading variable. This means that it only considers a polynomial to be square-free if it has squared factors of positive degree in the leading variable. Hence:This is not the sense of "square-free" that is needed for
sqf_norm
i.e. a polynomial should only be square free if there are no squared factors having positive degree in any variable. Potentially there is some value in having two different notions of square-free but realistically the notion that we generally want is that of being square free in any variable so that is what I have changed it to.Fixing
dmp_sqf_p
means thatdmp_sqf_norm
does not return an incorrect result with a norm that is not square-free. However thendmp_sqf_norm
goes into an infinite loop. This is because it tries to shift the polynomial likef(x,y,z,...) -> f(x-1,y,z,...) -> f(x-2,y,z,...)
. This is sequence of shifts is not guaranteed to result in a square-free norm for a multivariate polynomial. Instead we can usef(x,y,z,...) -> f(x-1,y-1,z-1,...) -> f(x-2,y-2,z-2,...)
. Shifting all variables like this works and guarantees to find a square-free norm but can be very slow for some polynomials and caused some tests to hang.A more efficient sequence of shifts requires that we use different shifts for each variable e.g. we try
f(x,y,z)
,f(x-1,y,z)
,f(x,y-1,z)
and so on before shifting all variables. This usually results in finding a shift such that the norm can be computed just as efficiently as before. The problem though is that this means thats
returned fromdmp_sqf_norm
needs to be a list of integer shifts rather than a single integer.Changing
dmp_sqf_norm
and alsoPoly.sqf_norm
andsqf_norm
to return a list of integer shifts is not backwards compatible. However I doubt that anyone is using these functions for anything. As far as I know their only real purpose is as an internal routine in Trager's algorithm for factorisation (dup_ext_factor/dmp_ext_factor
). I don't see why anyone would be using these functions directly rather than e.g.factor
so I think it should be okay to change the return type.In turn I then added
dmp_shift
andPoly.shift_list
so that the list of shifts can be used directly.I then added various other things including tests, docs, etc as a general cleanup.
Other comments
Release Notes
factor
on an expression with more than one symbol and the complex unitI
.is_sqf
was fixed. Previously some multivariate polynomials were incorrectly reported as being square-free.sqf_norm
function andPoly.sqf_norm
functions are now changed to return a list of integer shifts instead of a single integer shift. This is not a backwards compatible change but is needed because in general in the multivariate case different shifts are needed for each variable.shift_list
method is added toPoly
and other polynomial types for computing shifts of multivariate polynomials.