Commit 256c74b: Release 0.9.4

TanviBhavsar committed Mar 7, 2020
2 parents: c5d511d + 9d3f213

Showing 188 changed files with 7,097 additions and 3,951 deletions.
29 changes: 28 additions & 1 deletion README.md
@@ -457,6 +457,8 @@ FiloDB can be queried using the [Prometheus Query Language](https://prometheus.i

### FiloDB PromQL Extensions

FiloDB supports computing Z-score queries. A Z-score represents the distance between a raw score and the population mean, in units of the standard deviation. This can be used for anomaly detection.
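
A minimal sketch of such a query (the function name `z_score` and its instant-vector form here are assumptions, not confirmed by this commit; check the function reference for your FiloDB version):

z_score(http_req_latency{_ws_="demo", _ns_="foo"})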

Since FiloDB supports multiple schemas, with possibly more than one value column per schema, there needs to be a way to specify the target column to query. This is done using a `::columnName` suffix in the metric name, like this request which pulls out the "min" column:

http_req_timer::min{_ws_="demo", _ns_="foo"}
@@ -470,6 +472,7 @@ Some special functions exist to aid debugging and for other purposes:
| `_filodb_chunkmeta_all` | (CLI Only) Returns chunk metadata fields for all chunks matching the time range and filter criteria - ID, # rows, start and end time, as well as the number of bytes and type of encoding used for a particular column. |
| `histogram_bucket` | Extract a bucket from a HistogramColumn |
| `histogram_max_quantile` | More accurate `histogram_quantile` when the max is known |
| `hist_to_prom_vectors` | Convert a histogram to a set of equivalent Prometheus-style bucket time series |
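
For instance, `histogram_max_quantile` is described above as a more accurate `histogram_quantile` when the max is known; a sketch, assuming it shares the familiar two-argument form:

histogram_max_quantile(0.99, sum(rate(http_req_latency{app="foo"}[5m])))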

Example of debugging chunk metadata using the CLI:

@@ -483,8 +486,9 @@ One major difference FiloDB has from the Prometheus data model is that FiloDB su

* There is no need to append `_bucket` to the metric name.
* To compute quantiles: `histogram_quantile(0.7, sum(rate(http_req_latency{app="foo"}[5m])))`
* To extract a bucket: `histogram_bucket(100.0, http_req_latency{app="foo"})`
* To extract a bucket: `http_req_latency{app="foo",_bucket_="100.0"}` (`_bucket_` is a special filter that translates to the `histogram_bucket` function for bucket extraction)
* Sum over multiple Histogram time series: `sum(rate(http_req_latency{app="foo"}[5m]))` - you could then compute quantile over the sum.
* To convert `HistogramColumn` data back to a Prometheus-style time series per bucket, use the `hist_to_prom_vectors` function (see the sketch after this list)
- NOTE: Do NOT use `by (le)` when summing `HistogramColumns`. This is not appropriate as the "le" tag is not used. FiloDB knows how to sum multiple histograms together correctly without grouping tricks.
- FiloDB prevents many incorrect histogram aggregations that are possible in Prometheus when using `HistogramColumn`, such as mixing multiple histogram schemas across time series and across time.
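
A minimal sketch of `hist_to_prom_vectors` usage (the single-selector argument shape is an assumption):

hist_to_prom_vectors(http_req_latency{app="foo"})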

@@ -707,6 +711,29 @@ You may also configure CLI logging by copying `cli/src/main/resources/logback.xm

You can also change the logging directory by setting the FILO_LOG_DIR environment variable before calling the CLI.
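
For example, to write logs to an arbitrary directory (the path below is illustrative), prefix one of the commands from the next section:

> FILO_LOG_DIR=/var/log/filodb ./filo-cli --command promFilterToPartKeyBR --promql "myMetricName{_ws_='myWs',_ns_='myNs'}" --schema prom-counter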

### Debugging Binary Vectors and Binary Records

The following command can be used to formulate the BinaryRecord (BR) for a partKey from a Prometheus filter. Use its output to formulate the CQL for issuing a Cassandra query:

> ./filo-cli --command promFilterToPartKeyBR --promql "myMetricName{_ws_='myWs',_ns_='myNs'}" --schema prom-counter
0x2c0000000f1712000000200000004b8b36940c006d794d65747269634e616d650e00c104006d794e73c004006d795773

The following command can be used to decode a partKey BR into tag/value pairs:

> ./filo-cli --command partKeyBrAsString --hexPk 0x2C0000000F1712000000200000004B8B36940C006D794D65747269634E616D650E00C104006D794E73C004006D795773
b2[schema=prom-counter _metric_=myMetricName,tags={_ns_: myNs, _ws_: myWs}]

The following command can be used to decode a ChunkSetInfo read from Cassandra:

> ./filo-cli --command decodeChunkInfo --hexChunkInfo 0x12e8253a267ea2db060000005046fc896e0100005046fc896e010000
ChunkSetInfo id=-2620393330526787566 numRows=6 startTime=1574272801000 endTime=1574273042000

The following command can be used to decode a binary vector. Valid vector types are `d` for double, `i` for integer, `l` for long, `h` for histogram, and `s` for string:

> ./filo-cli --command decodeVector --vectorType d --hexVector 0x1b000000080800000300000000000000010000000700000006080400109836
DoubleLongWrapDataReader$
3.0,5.0,13.0,15.0,13.0,11.0,9.0

## Current Status

| Component | Status |
@@ -67,11 +67,12 @@ abstract class ClusterSeedDiscovery(val cluster: Cluster,
}
}

object ClusterSeedDiscovery {
object ClusterSeedDiscovery extends StrictLogging {
/** Seed node strategy. Some implementations discover, some simply read them. */
def apply(cluster: Cluster, settings: AkkaBootstrapperSettings): ClusterSeedDiscovery = {
import settings.{seedDiscoveryClass => fqcn}

logger.info(s"Using $fqcn strategy to discover cluster seeds")
cluster.system.dynamicAccess.createInstanceFor[ClusterSeedDiscovery](
fqcn, Seq((cluster.getClass, cluster), (settings.getClass, settings))) match {
case Failure(e) =>
32 changes: 28 additions & 4 deletions build.sbt
@@ -1,7 +1,31 @@
// NOTE: Most of the build stuff is centralized in project/FiloBuild.scala.
// This is done to enable more flexible sharing of settings amongst multiple build.sbts for different build pipelines
import sbt._
import sbt.Keys._

publishTo := Some(Resolver.file("Unused repo", file("target/unusedrepo")))
publishTo := Some(Resolver.file("Unused repo", file("target/unusedrepo")))

// Global setting across all subprojects
organization in ThisBuild := "org.filodb"
ThisBuild / organization := "org.filodb"
ThisBuild / organizationName := "FiloDB"
ThisBuild / scalaVersion := "2.11.12"
ThisBuild / publishMavenStyle := true
ThisBuild / Test / publishArtifact := false
ThisBuild / IntegrationTest / publishArtifact := false
ThisBuild / licenses += ("Apache-2.0", url("http://choosealicense.com/licenses/apache/"))
ThisBuild / pomIncludeRepository := { x => false }

lazy val memory = Submodules.memory
lazy val core = Submodules.core
lazy val query = Submodules.query
lazy val prometheus = Submodules.prometheus
lazy val coordinator = Submodules.coordinator
lazy val cassandra = Submodules.cassandra
lazy val kafka = Submodules.kafka
lazy val cli = Submodules.cli
lazy val http = Submodules.http
lazy val gateway = Submodules.gateway
lazy val standalone = Submodules.standalone
lazy val bootstrapper = Submodules.bootstrapper
lazy val sparkJobs = Submodules.sparkJobs
lazy val jmh = Submodules.jmh
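
// Illustrative usage (not part of this commit): with the submodules exposed as
// lazy vals above, per-project sbt tasks can be run directly from the shell, e.g.
//   sbt cli/compile standalone/test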


@@ -8,7 +8,7 @@ import filodb.cassandra.metastore.CassandraMetaStore
import filodb.coordinator.StoreFactory
import filodb.core.downsample.DownsampledTimeSeriesStore
import filodb.core.memstore.TimeSeriesMemStore
import filodb.core.store.NullColumnStore
import filodb.core.store.{NullColumnStore, NullMetaStore}

/**
* A StoreFactory for a TimeSeriesMemStore backed by a Cassandra ChunkSink for on-demand recovery/persistence
@@ -18,15 +18,20 @@ import filodb.core.store.NullColumnStore
* @param ioPool a Monix Scheduler, recommended to be the standard I/O pool, for scheduling asynchronous I/O
*/
class CassandraTSStoreFactory(config: Config, ioPool: Scheduler) extends StoreFactory {
val colStore = new CassandraColumnStore(config, ioPool)(ioPool)
val metaStore = new CassandraMetaStore(config.getConfig("cassandra"))(ioPool)
val cassandraConfig = config.getConfig("cassandra")
val session = FiloSessionProvider.openSession(cassandraConfig)
val colStore = new CassandraColumnStore(config, ioPool, session)(ioPool)
val metaStore = new CassandraMetaStore(cassandraConfig, session)(ioPool)
val memStore = new TimeSeriesMemStore(config, colStore, metaStore)(ioPool)
}

class DownsampledTSStoreFactory(config: Config, ioPool: Scheduler) extends StoreFactory {
val colStore = new CassandraColumnStore(config, ioPool, None, true)(ioPool)
val metaStore = new CassandraMetaStore(config.getConfig("cassandra"))(ioPool)
val memStore = new DownsampledTimeSeriesStore(colStore, metaStore, config)(ioPool)
val cassandraConfig = config.getConfig("cassandra")
val session = FiloSessionProvider.openSession(cassandraConfig)
val rawColStore = new CassandraColumnStore(config, ioPool, session, false)(ioPool)
val downsampleColStore = new CassandraColumnStore(config, ioPool, session, true)(ioPool)
val metaStore = NullMetaStore
val memStore = new DownsampledTimeSeriesStore(downsampleColStore, rawColStore, config)(ioPool)
}

/**
@@ -39,7 +44,9 @@ class DownsampledTSStoreFactory(config: Config, ioPool: Scheduler) extends Store
* @param ioPool a Monix Scheduler, recommended to be the standard I/O pool, for scheduling asynchronous I/O
*/
class NonPersistentTSStoreFactory(config: Config, ioPool: Scheduler) extends StoreFactory {
val cassandraConfig = config.getConfig("cassandra")
val session = FiloSessionProvider.openSession(cassandraConfig)
val colStore = new NullColumnStore()(ioPool)
val metaStore = new CassandraMetaStore(config.getConfig("cassandra"))(ioPool)
val metaStore = new CassandraMetaStore(cassandraConfig, session)(ioPool)
val memStore = new TimeSeriesMemStore(config, colStore, metaStore)(ioPool)
}
}
@@ -16,7 +16,7 @@ import monix.eval.Task
import filodb.core._

object FiloCassandraConnector {
val cassRetriesScheduledCount = Kamon.counter("cassandra-retries-scheduled")
val cassRetriesScheduledCount = Kamon.counter("cassandra-retries-scheduled").withoutTags
}

trait FiloCassandraConnector extends StrictLogging {
@@ -25,9 +25,7 @@ trait FiloCassandraConnector extends StrictLogging {

// Cassandra config with following keys: keyspace, hosts, port, username, password
def config: Config
def sessionProvider: FiloSessionProvider

lazy val session: Session = sessionProvider.session
def session: Session

lazy val baseRetryInterval = config.getDuration("retry-interval").toMillis.millis
lazy val retryIntervalMaxJitter = config.getDuration("retry-interval-max-jitter").toMillis.toInt
@@ -7,11 +7,38 @@ import com.datastax.driver.core._
import com.typesafe.config.Config
import net.ceedubs.ficus.Ficus._

import filodb.core.Instance

trait FiloSessionProvider {
// It is recommended this be implemented via lazy val. Don't want to recreate a session every time.
def session: Session
}

object FiloSessionProvider extends Instance {
/**
* Reads the "session-provider-fqcn" config key, which names a class that implements
* FiloSessionProvider. It must have a public constructor which accepts a Config
* instance. The same config instance passed to this method is passed to the constructor.
*
* Example:
*
* session-provider-fqcn = filodb.cassandra.DefaultFiloSessionProvider
*
*/
def openSession(config: Config): Session = {
val path = "session-provider-fqcn"

val clazz = if (config.hasPath(path)) {
createClass(config.getString(path)).get
} else {
classOf[DefaultFiloSessionProvider]
}

val args = Seq(classOf[Config] -> config)

createInstance[FiloSessionProvider](clazz, args).get.session
}
}
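
// Illustrative only (not part of this commit): a custom provider referenced via
// session-provider-fqcn must extend FiloSessionProvider and expose a public
// constructor accepting the same Config instance passed to openSession, e.g.
//
//   class MySessionProvider(config: Config) extends FiloSessionProvider {
//     // lazy, so the session is created once and only on first use;
//     // assumes a single host in the "hosts" key for brevity
//     lazy val session: Session = Cluster.builder()
//       .addContactPoint(config.getString("hosts"))
//       .withPort(config.getInt("port"))
//       .build().connect()
//   }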

trait BaseCassandraOptions {
def config: Config

