Skip to content

En route to Polynomial

Ralf Stephan edited this page Oct 29, 2015 · 21 revisions

Design decisions and progress while writing the Polynomial
(in chronological order)

Developed during GSoC 2015 under Sumith and Shivam's project.
Inputs received from Ondřej Čertík and Francesco Biscani.

  • Planned to implement our own hashtable and Integer in SymEngine so as to have a fast Polynomial module, Piranha has both of it's own and the speed that which we have to match.
  • Sumith started work on UnivariatePolynomial class, got merged eventually.
    Remarks
    • Uses std::map instead of std::unordered_map (or even piranha::hash_set) which is why it is slow.
    • Supports only univariate polynomial, can be deprecated once multivariate is implemented.
    • mul_poly uses a very naive algorithm so does eval and eval_bit
  • Decided to work on the poly_mul already present in rings.cpp (here), try to speed up expand2b. Once we hit the speed in lower level, then wrap into a Polynomial class.
    Todos to reach there were:
    • Use piranha::integer
    • Pack exponents of the polynomial instead of using a vector for them.
  • To begin, expand2b averaged 94.6ms at start, I'll refer to the benchmark results in my machine so that you get an idea of speedups and slowdowns.
  • Before packing we used vector of exponents to mpz_class unordered_map, one more option instead of vector was valarray, but didn't lead anywhere, instead brought slowdown. expand2b averaged around 112ms. There were very few instances were the syntactic sugar came handy. We are assuming that bottleneck is in memory allocation time, valarray will probably not bring much over vector.
  • We used piranha::integer instead of mpz_class which brought speedup but not at all close to Piranha. Hence we moved from using container std::unordered_map to piranha::hash_set too
    Following were the problems faced and how they were resolved
    • As piranha::hash_set is syntactically similar to std::unordered_set, each element is a pair. Here, we were unable to use std::pair because of immutability. Hence, m_pair was written.
    • We noticed that sparsity of hash_set was less, this was a real slowdown. Due to the fact of power of two packing of exponents were used, hence the hashing was not good and poor sparsity. Then we decided to use Piranha's packing piranha::encode. This uses packing by primes. Two advantages; one, negative exponents can also be packed, two, sparsity improved, hence speed. This is when we decided to hard-depend on Piranha for the Polynomial module.
    • Once both piranha::integer and piranha::hash_set were used the speed came down to 23ms, that is 4X from the initial 94ms. But we were still far from Piranha's 13.5ms.
    • After investigating we realised that DNDEBUG was not passed as a compiler flag leading to Debug build of Piranha, once this was fixed, we reached an average of 14.3ms. Yay
  • One problem still remained. It was noticed after all of this was done expand2b now showed averages of 114ms, a slowdown. So was expand2. Using git bisect this was cornered to include of piranha.hpp header. The problem being this inclusion included thread_pool header of Piranha, which declared a global thread variable, which turned off some compiler optimizations. But this is not necessary for us as we use only single threaded functions. This was resolved by amending Piranha, the class thread_pool was templatized i.e. replaced class thread_pool: private detail::thread_pool_base<> with template <typename = void> class thread_pool_: private detail::thread_pool_base<>.

All the speed reports of benchmarks can be found here.
Good to start the wrapping of Polynomial class.

Design

Polynomials are like a 3D space --- one axis is dense/sparse, another axis is the list of generators, and another axis are the coefficient type (ZZ, QQ or a polynomial, which by itself has a 3D space of options). Probably one can find an application where each of the choices will be the fastest. However, SymPy and Piranha experience seems to suggest that using a sparse and a non-recursive (i.e. providing the full list of generators and just use ZZ, QQ or EX) polynomial is very fast and should provide a competitive solution for most applications. This motivates our design below.

There is a class Ring<ZZ> (integers), Ring<QQ> (rational) and Ring<EX> (any kind of SymEngine expression) for storing the coefficient type as well as generators. A Polynomial class then stores something like RCP<const Ring<ZZ>> inside it to provide the type of coefficients as well as the list of generators. poly_mul then checks the pointer to the Ring class to ensure it is the exact same instance, otherwise it raises an exception (i.e. the generators and the coefficient type must be the same, otherwise it raises an exception).

Example 1: a polynomial 2*x*y^2 + y^4 can be represented as a Polynomial instance with a hashtable {(1, 2): 2, (0, 4): 1} and a pointer to a Ring<ZZ>(x, y) (i.e. with generators x and y).

Example 2: a polynomial 2/3*x*y^2 + y^4 can be represented as a Polynomial instance with a hashtable {(1, 2): 2/3, (0, 4): 1} and a pointer to a Ring<QQ>(x, y).

Example 3: a polynomial sin(a)*2/3*x*y^2 + y^4 can be represented as a Polynomial instance with a hashtable {(1, 2, 1): 2/3, (0, 4, 0): 1} and a pointer to a Ring<QQ>(x, y, sin(a)).

Example 4: a polynomial a*x can be represented as a Polynomial instance with a hashtable {(1, 1): 1} and a pointer to a Ring<ZZ>(x, a). Another way to represent it is as {(1): a} with a Ring<EX>(x), but that will be slower, because the general expressions EX as coefficients are slower than ZZ.

Example 5: a polynomial 2*x*y^2 + y^(-4) can be represented as a Polynomial instance with a hashtable {(1, 2): 2, (0, -4): 1} and a pointer to a Ring<ZZ>(x, y) (i.e. with generators x and y).

Example 6: a polynomial 2*x*y^(3/2) + y^(-1/4) can be represented as a Polynomial instance with a hashtable {(1, 3/2): 2, (0, -1/4): 1} and a pointer to a Ring<ZZ, QQ>(x, y). The first template (ZZ) is the coefficient type, the second template is the exponents type (QQ).

The exponents are signed integers (i.e. positive or negative) by default. They can be specified with the second template to Ring to be rational.

Instead of using ZZ, QQ, EX, we can instead use the C++ type itself, for example piranha::integer instead of ZZ, piranha::rational instead of QQ and SymEngine::Expression instead of EX.

Ring Series

The ring series is implemented just like in SymPy, e.g.:

rs_mul(const Polynomial &a, const Polynomial &b, Polynomial &c, Generator x)

inside it calls poly_mul which checks that a and b have the same Ring, otherwise it raises an exception.

  • Taylor series are polynomials with positive exponents and the ring Ring<ZZ>, Ring<QQ>, Ring<EX> (i.e. it depends on the coefficient types, but the exponents are always ZZ, the default).

  • Laurent series are polynomials with positive or negative exponents and the ring Ring<ZZ>, Ring<QQ>, Ring<EX> (i.e. it depends on the coefficient types, but the exponents are always ZZ, the default).

  • Puiseux series are polynomials with positive or negative exponents and the ring Ring<ZZ, QQ>, Ring<QQ, QQ>, Ring<EX, QQ> (i.e. it depends on the coefficient types, but the exponents are always QQ).

While an initial implementation can ignore series precision (the "order term", O) it is always present (infinite if not stated).