- Like levelDB, we can write an insertion log to disk and then delete it when we flush. This allows for crash recovery of the stuff in the memtable.
- For comparing/sorting b-tree keys (which is the dimension portion of a row), we're just using simple byte-wise comparison for now. We could do real semantic comparison later if we had any need to do range requests.
- Switching to a column-oriented design will almost certainly provide a huge speedup for our common-case queries. In particular this allows for much better compression and skipping much more data while scanning.
- We can switch to using varints rather than our cornucopia of integer types. This would make our rows non-fixed-size, though. (This makes more sense if we're column-oriented.)
- See if removing all the
:=
(predeclaring everything) in the inner loops helps - Expand the use of slice grouping
- Use slice grouping for all dimension columsn with fewer than N values (say N = 1M)
- Preallocate all the slots
- See if axing the (Row,Metric)Bytes conversions helps (probably not)
- Presize the grouping maps
- Research how GROUP BY queries are implemented in other DBs
- Skip data in the
falseFilterFunc
case