Skip to content

besquared/quotient-cube

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

39 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Quotient Cube is an algorithm and strategy for storing and querying data cubes.

This is a pure ruby implementation of that algorithm.

Requires tokyocabinet, autobots-transform

It is currently powering a personal analytics project and I think it is stable
enough (under some assumptions) for test and personal usage. Can also be used
as a tool for learning how this particular algorithm works.

Right now there is no support for incremental updates so everything is done
in batch. Smaller tables say with <10 dimensions can be built relatively 
quickly (a few thousand rows/second) so this is appropriate for daily batches
consisting of several million rows. Once the cube is pre-computed querying
is very fast. Point queries on a small tree can be done at the rate of 4000-5000
per second. While range queries can be performed at rate of about 1000 per second.

Example:

require 'tokyocabinet'
require 'quotient-cube'

# 1) Base Data Table

@table = Table.new(
  :column_names => [
    'store', 'product', 'season', 'sales'
  ], :data => [
    ['S1', 'P1', 's', 6],
    ['S1', 'P2', 's', 12],
    ['S2', 'P1', 'f', 9]
  ]
)

@measures = ['sales[sum]', 'sales[avg]']
@dimensions = ['store', 'product', 'season']

# 2) Build quotient cube data structure

cube = QuotientCube::Base.build(@table, @dimensions, @measures) do |table, pointers|
  sum = 0
  pointers.each do |pointer|
    sum += table[pointer]['sales']
  end

  [sum, sum / pointers.length]
end

# 3) Write out Quotient Cube Tree (QC-Tree) to disk

database = BDB.new
database.open(data_path, BDB::OWRITER | BDB::OCREAT)

Tree::Builder.new(database, cube).build

database.close

# Querying the QC-Tree

database.open
tree = Tree::Base.new(database)

# Point Query

@tree.find('sales[avg]', :conditions => {'product' => 'P1', 'season' => 'f'})
#=> {'product' => 'P1', 'season' => 'f', 'sales[avg]' => 9.0}

# Range Query

@tree.find('sales[avg]', :conditions => {'product' => ['P1', 'P2', 'P3']})
#=> [{'product' => 'P1', 'sales[avg]' => 7.5}, 
     {'product' => 'P2', 'sales[avg]' => 12.0}]

# Range Query

@tree.find(:all, :conditions => {'product' => :all})
#=> [
  {'product' => 'P1', 'sales[avg]' => 7.5, 'sales[sum]' => 15.0}, 
  {'product' => 'P2', 'sales[avg]' => 12.0, 'sales[sum]' => 12.0}
]

About

An implementation of the quotient cube algorithm

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages