Dustin Sallings edited this page Sep 20, 2012 · 4 revisions

cbfs - a place for everything

cbfs is a relatively young project (four days old as of this writing), but it's got some pretty good promise.

"does anyone use couchbase for massive file serving/storage?"

-- someone in irc the next workday after starting this project

And that describes the goal fairly well: cbfs aims to be your own private S3, built on top of Couchbase.

The issues list has the canonical state of things, but here's an overview of the current state of the art:

Currently

  • Can store and retrieve arbitrarily large files (limited only by the space available on the storage nodes).
  • Supports range requests (get any arbitrary part of a file).
  • Supports conditional GETs (don't transfer the object if it hasn't changed).
  • Can cluster an arbitrary mix of storage nodes (Windows, Linux, Mac, whatever).
  • Objects stored on any node can be retrieved from any other node.
  • Objects are probabilistically lazily replicated on request (hot objects statistically fully replicate).
  • Adding a node to the system is an instant and trivial operation.
  • There's no limit to the number of nodes that can be in a cluster.
  • Dead nodes just kind of fall off and eventually are automatically purged from the system.
  • There is no master node.
  • Some tasks can only be performed by one node at a time and must not be performed too frequently; the cluster manages these cooperatively.
  • A node cannot be in an invalid state (even with TBs of cache).
  • Filesystem corruption is automatically detected and cleaned up.
  • Stats are maintained for both node and individual object activity.
  • Arbitrary user-defined file metadata may be stored as JSON objects (Thanks, Aaron).
  • It can serve anything from a little JSON object to an entire web site.
  • Supports eager synchronous replication (a node dying after returning a 201 ≠ data loss).
  • Store an arbitrary number of previous versions of your files for later retrieval.

Usage

Install Couchbase and set up a bucket to hold cbfs data (I call my bucket cbfs).

Get a cbfs binary:

go get github.com/couchbaselabs/cbfs

Run cbfs against your Couchbase bucket:

cbfs -bucket=cbfs -couchbase=http://couchbase-server:8091/

There's an optional -nodeID argument to name your node. cbfs will generate node names for you if you don't specify them. You'll probably find that confusing. Don't worry about it too much right now, though. Adding, renaming and removing nodes are all stress-free operations.

Read the protocol docs to learn more about how to use it.
