
SymEngine memory model

Shikhar Jaiswal edited this page Oct 30, 2016 · 3 revisions

A few important points that Ondřej mentioned here:

We'll have to revisit the whole SymEngine memory model, but it is better to do it once we have more real-life experience with SymEngine and good real-life benchmarks, since it's a lot of work and we want to optimize for how people actually use SymEngine, not for artificial benchmarks. Things to play with are:

  • removing internal members (e.g. we cache the hash, but that means more memory needs to be allocated for each instance)
  • removing reference counting -- we would always copy things by value and not have to use RCP at all (so you save on reference counting and can remove the refcount member, but copies are more expensive).
  • cheap copying (related to the previous point): if all our classes are trivially copyable, i.e. you can use memcpy to create a copy of an instance, then this might speed up lots of things. Add and Mul use std::map or std::unordered_map, neither of which is trivially copyable (both manage pointers internally), so we would have to figure out whether it is possible to design a hash table that is trivially copyable.
  • if the size of each instance (of classes like Add, Mul, Sin, Cos, ...) is predictable, then we should manage the memory ourselves using C++ allocators. For each operation, like expand or diff, you predict (at least roughly) how much memory you need, allocate one chunk with a single malloc, and then fill in all the little instances sequentially. The idea is simple, but it's quite difficult in practice.
  • another idea: since we already have our own type codes for each instance, and since we are taking stuff out of the core and re-implementing it outside using the approach in #738, we could switch from C++ classes to plain C structs (i.e. not storing a pointer to the virtual function table in each instance) and implement the virtual functions ourselves. We already have our own implementation of single dispatch in eval_double_single_dispatch, and it is slower than the double dispatch via virtual functions in the visitor pattern (in eval_double), so this might be difficult to do well. But the advantage would be a simpler memory model whose instances can be created, copied, and moved faster.

So there is a lot that can be done in the future.