eigen and linalg

Some facts

The linalg API has saturated and rarely changes these days. This means we mostly need the kinds of calls that are already in there: a number of dot products, linear solves, factorizations
Compile time and memory of linalg modules are now a problem
Maintenance is a problem as the namespace/macro files are very big (changes are costly in labor)
GPU features of linalg are not used in algorithms
eigen3 is the only backend
autodiff won't work with the current system
due to the lack of expression trees at compile or runtime, no heterogeneous computing devices can be exploited

simplicity, less code, outsourcing
linalg has many issues even though we tried to write something that lasts. Maybe we shouldn't do this ourselves? This in particular includes GPU/CPU/mix stuff
eigen3 is stable and here to stay, e.g. it is a dependency of tensorflow; back then eigen3 was way more niche than it is now
slow and memory intensive compilation of linalg, which can make some computers crash
eigen's DSL is cleaner than our linalg API, which is quite cumbersome as it is not OOP, especially when doing non-trivial dot product chaining (transpose flags need to be passed as function arguments)
A solution for GPU/CPU/etc will probably be built on top of eigen (by somebody else), see e.g. sycl
We like compilers, and shogun modules are usually fixed, so why not do compile-time optimized linear algebra?
Eigen has minimal autodiff support built in (for scalars), see Gil's patch. We could probably extend this to vector valued expressions
- Even though it is unsupported we could write our own version of AutoDiffScalar and add more functionality that exists in Stan
- the problem with StanMath is that it's very slow when reusing the same gradient calculation, like we do here. On the other hand, Stan supports a lot of reverse mode AD functionality.
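
To make the autodiff point a bit more concrete, here is a minimal sketch of forward-mode differentiation on scalars, which is the general idea behind Eigen's AutoDiffScalar. `Dual`, `objective` and `differentiate_objective` are hypothetical names made up for this note; this is not Eigen's or Stan's API, and it only tracks a single input variable.

```cpp
#include <cmath>

// Hypothetical minimal forward-mode AD scalar (NOT Eigen's AutoDiffScalar):
// carries a function value together with its derivative w.r.t. one input.
struct Dual
{
    double value; // f(x)
    double deriv; // df/dx
};

inline Dual operator+(Dual a, Dual b)
{
    return {a.value + b.value, a.deriv + b.deriv};
}

inline Dual operator*(Dual a, Dual b)
{
    // product rule: (ab)' = a'b + ab'
    return {a.value * b.value, a.deriv * b.value + a.value * b.deriv};
}

inline Dual exp(Dual a)
{
    // chain rule: (exp(a))' = exp(a) * a'
    const double e = std::exp(a.value);
    return {e, e * a.deriv};
}

// Hypothetical model code: written once, templated on the scalar type,
// so it runs on plain doubles and on Dual alike.
template <typename T>
T objective(T x)
{
    using std::exp; // plain doubles use std::exp, Dual is found via ADL
    return x * x * x + exp(x);
}

// Hypothetical helper: seeds dx/dx = 1 and returns value and derivative.
inline Dual differentiate_objective(double x)
{
    return objective(Dual{x, 1.0});
}
```

`differentiate_objective(2.0)` gives the value and the derivative at x = 2 in a single pass. Going from this to vector valued expressions is the open part.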

eigen has 2 devs, what if they are hit by a bus?
We had eigen to replace lapack, then we built linalg. The point was not to be dependent on a single lib, but have something that can be easily interchanged (at runtime even). There is a danger that tomorrow a new lib comes about and we have to refactor everything again.
- The question is whether this is feasible, as we would have to make assumptions on how a future lib will work, and anticipate designs ... very difficult
- An example of how difficult this anticipation is: our GPU stuff... which we never even made work. - not entirely true [Viktor]
- With unlimited manpower, we could build the thing we want. But we don't have that.
- compile time increases, but workload would be spread across each translation unit (as opposed to eigen linalg stuff now)
- plugins would at least allow to compile only algorithms that are wanted
- SG14 has an ongoing proposal for a linalg 'wrapper' in the C++ standard. The API is, as usual, quite clean and allows different allocators/executors. Instead of eigen, it would be more interesting to start implementing the draft on our own (or, if there is already an implementation going on, fork that).
Question: could there be Shogun users out there depending on linalg? Answer (Heiko) it was never part of a release afaik
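
For reference, the "interchangeable at runtime" idea from above boils down to something like the following sketch. All names (`LinalgBackend`, `NaiveCpuBackend`, `BackendRegistry`) are hypothetical and are not the actual linalg classes; only a dot product is shown, and a real version would need the full set of calls plus memory transfer between devices, which is where the anticipation problem starts.

```cpp
#include <map>
#include <memory>
#include <numeric>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interface: the operations every backend has to provide.
struct LinalgBackend
{
    virtual ~LinalgBackend() = default;
    virtual double dot(const std::vector<double>& a,
                       const std::vector<double>& b) const = 0;
};

// Hypothetical reference backend using only the standard library.
struct NaiveCpuBackend : LinalgBackend
{
    double dot(const std::vector<double>& a,
               const std::vector<double>& b) const override
    {
        if (a.size() != b.size())
            throw std::invalid_argument("dot: size mismatch");
        return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
    }
};

// Hypothetical registry: backends are registered by name and one of them
// is selected at runtime; algorithms only ever talk to active().
class BackendRegistry
{
public:
    void add(const std::string& name, std::unique_ptr<LinalgBackend> backend)
    {
        // refuse duplicates so that the active pointer can never dangle
        if (!backends_.emplace(name, std::move(backend)).second)
            throw std::invalid_argument("duplicate backend: " + name);
    }

    void select(const std::string& name)
    {
        auto it = backends_.find(name);
        if (it == backends_.end())
            throw std::out_of_range("unknown backend: " + name);
        active_ = it->second.get();
    }

    const LinalgBackend& active() const
    {
        if (!active_)
            throw std::logic_error("no backend selected");
        return *active_;
    }

private:
    std::map<std::string, std::unique_ptr<LinalgBackend>> backends_;
    const LinalgBackend* active_ = nullptr;
};
```

Every call goes through a virtual function, so nothing can be fused or optimized across calls at compile time; that is the price of the runtime switch.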

~~Can we make shogun-core (after plugins) independent of eigen, so that only plugins include it?~~ Yes, and there is no need to include it at all: just have a shared lib that contains the linalg functionality that the LinalgNamespace defines...

Both seem unlikely to happen

Expression trees built at runtime that are JIT'ed (tensorflow XLA jit style)
- allows for easy autodiff
- should be as fast as compiled (potentially faster)
- writing this ourselves is nuts, do frameworks for this exist? (something like this)
- need to refactor all algos
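
A toy version of the runtime approach, to show why autodiff comes almost for free once the expression is available as data. This is a sketch under heavy assumptions: a single scalar variable, and a plain tree interpreter standing in for the JIT step (which is the actually hard part). `Expr`, `constant`, `variable`, `add`, `mul`, `eval` and `derivative` are hypothetical names, not taken from any framework.

```cpp
#include <memory>

// Hypothetical runtime expression node over a single variable x.
struct Expr
{
    enum class Kind { Constant, Variable, Add, Mul };
    Kind kind;
    double value;                   // only used for Kind::Constant
    std::shared_ptr<Expr> lhs, rhs; // only used for Kind::Add / Kind::Mul
};
using ExprPtr = std::shared_ptr<Expr>;

inline ExprPtr constant(double c)
{
    return std::make_shared<Expr>(Expr{Expr::Kind::Constant, c, nullptr, nullptr});
}
inline ExprPtr variable()
{
    return std::make_shared<Expr>(Expr{Expr::Kind::Variable, 0.0, nullptr, nullptr});
}
inline ExprPtr add(ExprPtr a, ExprPtr b)
{
    return std::make_shared<Expr>(Expr{Expr::Kind::Add, 0.0, a, b});
}
inline ExprPtr mul(ExprPtr a, ExprPtr b)
{
    return std::make_shared<Expr>(Expr{Expr::Kind::Mul, 0.0, a, b});
}

// Interprets the tree for a given x (a JIT would compile it instead).
inline double eval(const ExprPtr& e, double x)
{
    switch (e->kind)
    {
    case Expr::Kind::Constant: return e->value;
    case Expr::Kind::Variable: return x;
    case Expr::Kind::Add: return eval(e->lhs, x) + eval(e->rhs, x);
    case Expr::Kind::Mul: return eval(e->lhs, x) * eval(e->rhs, x);
    }
    return 0.0; // not reached, all kinds are handled above
}

// Builds a new tree representing d/dx of the given tree.
inline ExprPtr derivative(const ExprPtr& e)
{
    switch (e->kind)
    {
    case Expr::Kind::Constant: return constant(0.0);
    case Expr::Kind::Variable: return constant(1.0);
    case Expr::Kind::Add: return add(derivative(e->lhs), derivative(e->rhs));
    case Expr::Kind::Mul: // product rule
        return add(mul(derivative(e->lhs), e->rhs), mul(e->lhs, derivative(e->rhs)));
    }
    return constant(0.0); // not reached
}
```

For example, with `auto x = variable();` and `auto f = add(mul(x, x), mul(constant(3.0), x));`, both `eval(f, 2.0)` and `eval(derivative(f), 2.0)` work on the same tree. The catch is the last bullet: every algorithm would have to build such trees instead of calling linalg directly.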
Expression trees are built at compile time and we rely on compiler to optimize/distribute
- should allow for autodiff (that's what eigen's autodiff does), although there are still open questions regarding re-using expressions in various places of the codebase
- still fast
- fits shogun more as our models are fixed at compile time -> less heavy refactoring if any
- easier to implement and integrates well with the Eigen lazy evaluation pattern
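
For comparison, a minimal sketch of the compile-time variant using expression templates, i.e. the same lazy evaluation pattern Eigen relies on. `VecExpr`, `Vec`, `Sum` and `Scaled` are hypothetical names for illustration and much simpler than what Eigen actually does.

```cpp
#include <cstddef>
#include <vector>

// CRTP base: marks a type as a vector expression known at compile time.
template <typename Derived>
struct VecExpr
{
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// Hypothetical concrete vector; assigning an expression evaluates it
// element by element in a single loop, without temporaries.
class Vec : public VecExpr<Vec>
{
public:
    explicit Vec(std::size_t n, double fill = 0.0) : data_(n, fill) {}

    template <typename E>
    Vec(const VecExpr<E>& expr) : data_(expr.self().size())
    {
        for (std::size_t i = 0; i < data_.size(); ++i)
            data_[i] = expr.self()[i];
    }

    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return data_.size(); }

private:
    std::vector<double> data_;
};

// Node for l + r; nothing is computed until operator[] is called.
template <typename L, typename R>
class Sum : public VecExpr<Sum<L, R>>
{
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) {}
    double operator[](std::size_t i) const { return l_[i] + r_[i]; }
    std::size_t size() const { return l_.size(); }

private:
    // References only: evaluate within the same statement, do not store
    // the expression in an auto variable (same pitfall as with Eigen).
    const L& l_;
    const R& r_;
};

// Node for s * e.
template <typename E>
class Scaled : public VecExpr<Scaled<E>>
{
public:
    Scaled(double s, const E& e) : s_(s), e_(e) {}
    double operator[](std::size_t i) const { return s_ * e_[i]; }
    std::size_t size() const { return e_.size(); }

private:
    double s_;
    const E& e_;
};

template <typename L, typename R>
Sum<L, R> operator+(const VecExpr<L>& l, const VecExpr<R>& r)
{
    return Sum<L, R>(l.self(), r.self());
}

template <typename E>
Scaled<E> operator*(double s, const VecExpr<E>& e)
{
    return Scaled<E>(s, e.self());
}
```

With `Vec a(3, 1.0), b(3, 2.0);`, the statement `Vec r = a + 2.0 * b;` has the type `Sum<Vec, Scaled<Vec>>` on the right-hand side, so the whole expression is visible to the compiler and is evaluated in one loop. The expression type is fixed at compile time, which matches the "models are fixed at compile time" argument but also explains why compile times go up.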