
Some facts

  • the linalg API has saturated and doesn't change much these days. This means we mostly need the types of calls that are already in there: a number of dot products, linear solves, and factorizations (see the sketch after this list)
  • compile time and memory consumption of the linalg modules are now a problem
  • Maintenance is a problem as the namespace/macro files are very big (changes are costly in labor)
  • GPU features of linalg are not used in algorithms
  • eigen3 is the only backend
  • autodiff won't work with the current system
  • due to the lack of expression trees at compile or runtime, no heterogeneous computing devices can be exploited
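
To make that call surface concrete, here is a rough sketch of those three categories written directly against Eigen (purely illustrative; Shogun's linalg wrappers have their own names and signatures):

```cpp
#include <Eigen/Dense>
#include <iostream>

int main() {
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(4, 4);
    Eigen::VectorXd b = Eigen::VectorXd::Random(4);

    // dot products / matrix-vector products
    double d = b.dot(b);
    Eigen::VectorXd y = A * b;

    // linear solves
    Eigen::VectorXd x = A.fullPivLu().solve(b);

    // factorizations, e.g. Cholesky of a symmetric positive definite matrix
    Eigen::MatrixXd S = A * A.transpose() + Eigen::MatrixXd::Identity(4, 4);
    Eigen::LLT<Eigen::MatrixXd> llt(S);
    Eigen::MatrixXd L = llt.matrixL();

    std::cout << d << " " << y.norm() << " " << x.norm() << " " << L(0, 0) << "\n";
    return 0;
}
```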

Pro eigen

  • simplicity, less code, outsourcing
  • linalg has many issues even though we tried to write something that lasts. Maybe we shouldn't do this ourselves? This in particular includes the GPU/CPU/mix stuff
  • eigen3 is stable and here to stay, e.g. it is a dependency of tensorflow; back then eigen3 was way more niche than it is now
  • compilation of linalg is slow and memory intensive, to the point of crashing some machines
  • eigen's DSL is much cleaner than our linalg API, which is quite cumbersome since it is not OOP, especially when chaining non-trivial dot products (transpose flags have to be passed as function arguments); see the first sketch after this list
  • a solution for GPU/CPU/etc. will probably be built on top of eigen (by somebody else), see e.g. SYCL
  • we like compilers, and shogun modules are usually fixed, so why not do compile-time-optimized linear algebra
  • Eigen has minimal autodiff support built in (for scalars), see Gil's patch and the second sketch after this list. We could probably extend this to vector-valued expressions
    • Even though it is an unsupported module, we could write our own version of AutoDiffScalar and add more of the functionality that exists in Stan
    • the problem with StanMath is that it's very slow when reusing the same gradient calculation, as we do here. On the other hand, Stan supports a lot of reverse-mode AD functionality.
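
To make the DSL point concrete, here is a hedged sketch of chaining A^T * B * x. The flag-based matrix_prod below is an invented stand-in for a non-OOP linalg-style call, not Shogun's actual signature; the Eigen version reads like the math and builds a single lazy expression:

```cpp
#include <Eigen/Dense>
#include <iostream>

// Invented flag-based helper in the spirit of a non-OOP linalg namespace;
// the name and signature are assumptions for this sketch, not Shogun's actual API.
Eigen::MatrixXd matrix_prod(const Eigen::MatrixXd& A, const Eigen::MatrixXd& B,
                            bool transpose_A, bool transpose_B) {
    const Eigen::MatrixXd lhs = transpose_A ? Eigen::MatrixXd(A.transpose()) : A;
    const Eigen::MatrixXd rhs = transpose_B ? Eigen::MatrixXd(B.transpose()) : B;
    return lhs * rhs;
}

int main() {
    Eigen::MatrixXd A = Eigen::MatrixXd::Random(3, 4);
    Eigen::MatrixXd B = Eigen::MatrixXd::Random(3, 4);
    Eigen::VectorXd x = Eigen::VectorXd::Random(4);

    // Flag-based chaining of A^T * B * x: booleans for transposes,
    // explicit temporaries between calls.
    Eigen::MatrixXd AtB = matrix_prod(A, B, /*transpose_A=*/true, /*transpose_B=*/false);
    Eigen::VectorXd r1 = matrix_prod(AtB, x, false, false);

    // Eigen's DSL: reads like the math and builds one lazy expression.
    Eigen::VectorXd r2 = A.transpose() * B * x;

    std::cout << (r1 - r2).norm() << "\n";  // ~0
    return 0;
}
```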
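
On the autodiff point, a minimal sketch of Eigen's unsupported AutoDiffScalar module (forward mode, scalar-valued); the function f below is an arbitrary example chosen for illustration:

```cpp
#include <unsupported/Eigen/AutoDiff>
#include <iostream>

int main() {
    // Forward-mode AD: each scalar carries a value plus a derivative vector.
    using ADScalar = Eigen::AutoDiffScalar<Eigen::Vector2d>;

    // Two inputs, seeded with unit derivative vectors: (value, #inputs, index).
    ADScalar x(1.5, 2, 0);
    ADScalar y(0.5, 2, 1);

    // f(x, y) = x*y + sin(x) -- an arbitrary example function.
    ADScalar f = x * y + sin(x);

    std::cout << "f            = " << f.value() << "\n";
    std::cout << "df/dx, df/dy = " << f.derivatives().transpose() << "\n";  // [y + cos(x), x]
    return 0;
}
```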

Con eigen

  • We had eigen to replace lapack, then we built linalg. The point was not to depend on a single lib, but to have something that can be easily interchanged (even at runtime). There is a danger that tomorrow a new lib comes along and we have to refactor everything again.
    • The question is whether this is feasible, as we would have to make assumptions about how a future lib will work and anticipate designs ... very difficult
    • An example of how difficult this anticipation is, is our GPU stuff ... which we never even made work.
    • With unlimited manpower we could build the thing we want. But we don't have that.
    • compile time increases, but the workload would be spread across the translation units (as opposed to the eigen linalg situation now)
    • plugins would at least allow compiling only the algorithms that are wanted

The optimal solution

  • Expression trees built at runtime that are JIT'ed (tensorflow XLA JIT style); a minimal tree-building sketch follows after this list

    • allows for easy autodiff
    • should be as fast as compiled (potentially faster)
    • writing this ourselves is nuts, do frameworks for this exist? (something like this)
    • need to refactor all algos
  • Expression trees built at compile time, relying on the compiler to optimize/distribute; see the expression-template sketch after this list

    • should allow for autodiff (that's what eigen's autodiff does)
    • still fast
    • fits shogun better, as our models are fixed at compile time -> less heavy refactoring, if any
    • easier to implement and integrates well with the Eigen lazy evaluation pattern
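
To make the first option concrete, here is a minimal, hedged sketch of a runtime expression tree; the node set is invented for illustration. A real system would hand such a graph to a JIT backend (XLA-style) instead of interpreting it with a tree walk as done here:

```cpp
#include <iostream>
#include <memory>

// Expression nodes assembled at runtime; evaluation is a simple tree walk.
struct Expr {
    virtual ~Expr() = default;
    virtual double eval() const = 0;
};
using ExprPtr = std::shared_ptr<Expr>;

struct Constant : Expr {
    double value;
    explicit Constant(double v) : value(v) {}
    double eval() const override { return value; }
};

struct Add : Expr {
    ExprPtr lhs, rhs;
    Add(ExprPtr l, ExprPtr r) : lhs(std::move(l)), rhs(std::move(r)) {}
    double eval() const override { return lhs->eval() + rhs->eval(); }
};

struct Mul : Expr {
    ExprPtr lhs, rhs;
    Mul(ExprPtr l, ExprPtr r) : lhs(std::move(l)), rhs(std::move(r)) {}
    double eval() const override { return lhs->eval() * rhs->eval(); }
};

int main() {
    // (2 + 3) * 4, assembled at runtime; a JIT backend would compile this
    // graph instead of walking it.
    ExprPtr e = std::make_shared<Mul>(
        std::make_shared<Add>(std::make_shared<Constant>(2.0),
                              std::make_shared<Constant>(3.0)),
        std::make_shared<Constant>(4.0));
    std::cout << e->eval() << "\n";  // prints 20
    return 0;
}
```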
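
And a correspondingly minimal sketch of the second option: expression trees built at compile time via expression templates, the CRTP pattern behind Eigen's lazy evaluation. The whole tree lives in the type, so the compiler can inline and fuse the evaluation; the node names are again just illustrative:

```cpp
#include <iostream>

// CRTP base: every expression knows its concrete type at compile time.
template <typename Derived>
struct Expr {
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

struct Constant : Expr<Constant> {
    double value;
    explicit Constant(double v) : value(v) {}
    double eval() const { return value; }
};

// Children are stored by value so temporaries in compound expressions stay valid.
template <typename L, typename R>
struct Add : Expr<Add<L, R>> {
    L lhs; R rhs;
    Add(const L& l, const R& r) : lhs(l), rhs(r) {}
    double eval() const { return lhs.eval() + rhs.eval(); }
};

template <typename L, typename R>
struct Mul : Expr<Mul<L, R>> {
    L lhs; R rhs;
    Mul(const L& l, const R& r) : lhs(l), rhs(r) {}
    double eval() const { return lhs.eval() * rhs.eval(); }
};

template <typename L, typename R>
Add<L, R> operator+(const Expr<L>& l, const Expr<R>& r) { return {l.self(), r.self()}; }

template <typename L, typename R>
Mul<L, R> operator*(const Expr<L>& l, const Expr<R>& r) { return {l.self(), r.self()}; }

int main() {
    Constant a(2.0), b(3.0), c(4.0);
    // The type of e encodes the whole tree: Add<Constant, Mul<Constant, Constant>>,
    // so evaluation can be inlined and fused by the compiler.
    auto e = a + b * c;
    std::cout << e.eval() << "\n";  // prints 14
    return 0;
}
```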