En route to Polynomial
Design decisions and progress while writing the Polynomial
(in chronological order)
Developed during GSoC 2015 under Sumith and Shivam's project.
Inputs received from Ondřej Čertík and Francesco Biscani.
- Planned to implement our own
hashtable
andInteger
in SymEngine so as to have a fast Polynomial module, Piranha has both of it's own and the speed that which we have to match.
- Sumith started work on
UnivariatePolynomial
class, got merged eventually.
Remarks
- Uses
std::map
instead ofstd::unordered_map
(or evenpiranha::hash_set
) which is why it is slow.
- Supports only univariate polynomial, can be deprecated once multivariate is implemented.
-
mul_poly
uses a very naive algorithm so doeseval
andeval_bit
- Uses
- Decided to work on the
poly_mul
already present in rings.cpp (here), try to speed up expand2b. Once we hit the speed in lower level, then wrap into aPolynomial
class.
Todos to reach there were:
- Use
piranha::integer
- Pack exponents of the polynomial instead of using a vector for them.
- Use
- To begin,
expand2b
averaged94.6ms
at start, I'll refer to the benchmark results in my machine so that you get an idea of speedups and slowdowns.
- Before packing we used
vector
of exponents tompz_class
unordered_map
, one more option instead of vector wasvalarray
, but didn't lead anywhere, instead brought slowdown.expand2b
averaged around112ms.
There were very few instances were the syntactic sugar came handy. We are assuming that bottleneck is in memory allocation time, valarray will probably not bring much over vector.
- We used
piranha::integer
instead ofmpz_class
which brought speedup but not at all close to Piranha. Hence we moved from using containerstd::unordered_map
topiranha::hash_set
too
Following were the problems faced and how they were resolved
- As
piranha::hash_set
is syntactically similar tostd::unordered_set
, each element is a pair. Here, we were unable to usestd::pair
because of immutability. Hence,m_pair
was written.
- We noticed that sparsity of
hash_set
was less, this was a real slowdown. Due to the fact ofpower of two
packing of exponents were used, hence the hashing was not good and poor sparsity. Then we decided to use Piranha's packingpiranha::encode
. This uses packing byprimes
. Two advantages; one,negative
exponents can also be packed, two, sparsity improved, hence speed. This is when we decided to hard-depend on Piranha for the Polynomial module.
- Once both
piranha::integer
andpiranha::hash_set
were used the speed came down to23ms
, that is 4X from the initial94ms
. But we were still far from Piranha's13.5ms
.
- After investigating we realised that
DNDEBUG
was not passed as a compiler flag leading toDebug
build of Piranha, once this was fixed, we reached an average of14.3ms
. Yay
- As
- One problem still remained. It was noticed after all of this was done
expand2b
now showed averages of114ms
, a slowdown. So wasexpand2
. Usinggit bisect
this was cornered to include ofpiranha.hpp
header. The problem being this inclusion includedthread_pool
header of Piranha, which declared a global thread variable, which turned off some compiler optimizations. But this is not necessary for us as we use only single threaded functions. This was resolved by amending Piranha, the classthread_pool
was templatized i.e. replacedclass thread_pool: private detail::thread_pool_base<>
withtemplate <typename = void> class thread_pool_: private detail::thread_pool_base<>
.
All the speed reports of benchmarks can be found here.
Good to start the wrapping of Polynomial
class.
Polynomials are like a 3D space --- one axis is dense/sparse, another axis is the list of generators, and another axis are the coefficient type (ZZ, QQ or a polynomial, which by itself has a 3D space of options). Probably one can find an application where each of the choices will be the fastest. However, SymPy and Piranha experience seems to suggest that using a sparse and a non-recursive (i.e. providing the full list of generators and just use ZZ, QQ or EX) polynomial is very fast and should provide a competitive solution for most applications. This motivates our design below.
There is a class Ring<ZZ>
(integers), Ring<QQ>
(rational) and Ring<EX>
(any kind of SymEngine expression) for storing the coefficient type as well as generators. A Polynomial
class then stores something like RCP<const Ring<ZZ>>
inside it to provide the type of coefficients as well as the list of generators. poly_mul
then checks the pointer to the Ring
class to ensure it is the exact same instance, otherwise it raises an exception (i.e. the generators and the coefficient type must be the same, otherwise it raises an exception).
Example 1: a polynomial 2*x*y^2 + y^4
can be represented as a Polynomial instance with a hashtable {(1, 2): 2, (0, 4): 1}
and a pointer to a Ring<ZZ>(x, y)
(i.e. with generators x
and y
).
Example 2: a polynomial 2/3*x*y^2 + y^4
can be represented as a Polynomial instance with a hashtable {(1, 2): 2/3, (0, 4): 1}
and a pointer to a Ring<QQ>(x, y)
.
Example 3: a polynomial sin(a)*2/3*x*y^2 + y^4
can be represented as a Polynomial instance with a hashtable {(1, 2, 1): 2/3, (0, 4, 0): 1}
and a pointer to a Ring<QQ>(x, y, sin(a))
.
Example 4: a polynomial a*x
can be represented as a Polynomial instance with a hashtable {(1, 1): 1}
and a pointer to a Ring<ZZ>(x, a)
. Another way to represent it is as {(1): a}
with a Ring<EX>(x)
, but that will be slower, because the general expressions EX as coefficients are slower than ZZ.
Example 5: a polynomial 2*x*y^2 + y^(-4)
can be represented as a Polynomial instance with a hashtable {(1, 2): 2, (0, -4): 1}
and a pointer to a Ring<ZZ>(x, y)
(i.e. with generators x
and y
).
Example 6: a polynomial 2*x*y^(3/2) + y^(-1/4)
can be represented as a Polynomial instance with a hashtable {(1, 3/2): 2, (0, -1/4): 1}
and a pointer to a Ring<ZZ, QQ>(x, y)
. The first template (ZZ
) is the coefficient type, the second template is the exponents type (QQ
).
The exponents are signed integers (i.e. positive or negative) by default. They can be specified with the second template to Ring
to be rational.
Instead of using ZZ
, QQ
, EX
, we can instead use the C++ type itself, for example piranha::integer
instead of ZZ
, piranha::rational
instead of QQ
and SymEngine::Expression
instead of EX.
The ring series is implemented just like in SymPy, e.g.:
rs_mul(const Polynomial &a, const Polynomial &b, Polynomial &c, Generator x)
inside it calls poly_mul
which checks that a
and b
have the same Ring, otherwise it raises an exception.
-
Taylor series are polynomials with positive exponents and the ring
Ring<ZZ>
,Ring<QQ>
,Ring<EX>
(i.e. it depends on the coefficient types, but the exponents are always ZZ, the default). -
Laurent series are polynomials with positive or negative exponents and the ring
Ring<ZZ>
,Ring<QQ>
,Ring<EX>
(i.e. it depends on the coefficient types, but the exponents are always ZZ, the default). -
Puiseux series are polynomials with positive or negative exponents and the ring
Ring<ZZ, QQ>
,Ring<QQ, QQ>
,Ring<EX, QQ>
(i.e. it depends on the coefficient types, but the exponents are always QQ).
While an initial implementation can ignore series precision (the "order term", O) it is always present (infinite if not stated).