Releases: apache/incubator-horaedb

v2.0.0

23 Apr 02:31

Upgrade from 1.x.x to 2.0.0

The transition from CeresDB to Apache HoraeDB introduces several breaking changes. Upgrading from an older version to v2.0.0 requires the environment and configuration changes described below.

Upgrade Steps

  1. Set the required environment variable
export HORAEDB_DEFAULT_CATALOG=ceresdb
  2. Update config

The etcd root path must be configured in both horaedb and horaemeta.

For horaedb

[cluster_deployment.etcd_client]
server_addrs = ['127.0.0.1:2379']
root_path = "/rootPath"

For horaemeta

storage-root-path = "/rootPath"
  3. Upgrade horaemeta

horaedb will log errors like the following, which is expected

2024-01-23 14:37:57.726 ERRO [src/cluster/src/cluster_impl.rs:136] Send heartbeat to meta failed, err:Failed to send heartbeat, cluster:defaultCluster, err:status: Unimplemented, message: "unknown service meta_service.MetaRpcService", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }
  4. Upgrade horaedb

After all servers are upgraded, the cluster should be ready for reads and writes, and old data can be queried as before.
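
Before restarting the servers, it can help to confirm that both configs point at the same etcd root. The following is a minimal sketch, assuming the two values are meant to be identical (as in the example above). The helper names `extract_value` and `root_paths_match` are hypothetical and not part of HoraeDB, and the regex only handles simple quoted `key = "value"` lines rather than full TOML.

```python
import re
from typing import Optional


def extract_value(config_text: str, key: str) -> Optional[str]:
    """Return the quoted string assigned to `key`, or None if the key is absent.

    Hypothetical helper: it matches simple `key = "value"` lines only.
    """
    pattern = r'^\s*' + re.escape(key) + r'\s*=\s*["\']([^"\']*)["\']\s*$'
    match = re.search(pattern, config_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def root_paths_match(horaedb_config: str, horaemeta_config: str) -> bool:
    """Check that horaedb's root_path equals horaemeta's storage-root-path."""
    db_root = extract_value(horaedb_config, "root_path")
    meta_root = extract_value(horaemeta_config, "storage-root-path")
    return db_root is not None and db_root == meta_root


# Config snippets copied from the upgrade steps above.
HORAEDB_CONFIG = """
[cluster_deployment.etcd_client]
server_addrs = ['127.0.0.1:2379']
root_path = "/rootPath"
"""

HORAEMETA_CONFIG = """
storage-root-path = "/rootPath"
"""

print(root_paths_match(HORAEDB_CONFIG, HORAEMETA_CONFIG))
```

A mismatch (or a missing key on either side) makes the check return False, which is a cheap signal to fix the config before upgrading.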

v1.2.7

10 Oct 06:06
99ae6e0

Major Features

Partition Table

  • Support random partition rule #1193
  • Avoid memory allocation during partition write requests #1208
  • Fix wrong text of show create table for partition table #1214
  • Improved partitioned table tests powered by tsbs #1195

Performance

  • Teach ceresdb to run the whole dist query process #1204
  • Support aggr push down in distributed query #1232
  • Store real time range in sst #1225
  • Use real time range to filter memtable #1233
  • Rewrite not in expr to in #1236
  • Dedup requests in proxy #1125
  • Support dedup execute physical plan #1237
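
The time-range items above (#1225, #1233) rely on a general idea: if each data file or memtable records the time range it actually covers, a query can skip any that do not overlap the queried range. The sketch below illustrates that idea only. `TimeRange`, `DataFile`, and `prune_by_time_range` are hypothetical names, and this is not HoraeDB's actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TimeRange:
    start: int  # inclusive timestamp
    end: int    # exclusive timestamp

    def overlaps(self, other: "TimeRange") -> bool:
        # Two half-open ranges overlap when each starts before the other ends.
        return self.start < other.end and other.start < self.end


@dataclass(frozen=True)
class DataFile:
    name: str
    time_range: TimeRange  # the range of timestamps actually stored


def prune_by_time_range(files: List[DataFile], query: TimeRange) -> List[DataFile]:
    """Keep only the files whose stored time range overlaps the query range."""
    return [f for f in files if f.time_range.overlaps(query)]


files = [
    DataFile("a", TimeRange(0, 100)),
    DataFile("b", TimeRange(100, 200)),
    DataFile("c", TimeRange(150, 300)),
]
selected = prune_by_time_range(files, TimeRange(120, 160))
print([f.name for f in selected])  # prints ['b', 'c']
```

The tighter the recorded range is to the real data, the more files a query can skip, which is presumably why storing the real range rather than a coarser bucket matters.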

Bug Fix

  • Fix deadlock when dedup stream read #1199
  • Fix panic when read data out of range by disk cache #1206
  • Fix lock contention on acquiring the arena stats #1207
  • Fix panic if dedupped query fails #1229

Full Changelog: v1.2.6...v1.2.7

v1.2.6

05 Sep 07:15
040e568

Major Features

Query

  • Optimize datafusion plan, remove unnecessary node #1150
  • Optimize disk cache, avoid panic when cache file is corrupted #1130
  • Support PostgreSQL protocol #1138

Remote engine

  • Optimize remote server's protocol, reduce payload overhead when write batch is small #1146

WAL

  • Open WAL in parallel #1129
  • Introduce columnar encoding

Full Changelog: v1.2.5...v1.2.6

v1.2.5

08 Aug 11:03

Major Features

  • Support OceanBase as object store backend (stable now!)
  • Compaction
    • support compacting the same table concurrently #1101
    • support picking SSTs in sequence order, which is required to avoid data corruption for overwrite tables #1041
  • Improve the stability of CeresDB
    • ensure shard is opened once #1080
    • include shard status in heartbeats to meta #1082
  • Enhancement on query and write
    • avoid write queue full block #1065
    • avoid prefetching all sst streams at once #1069
    • improve performance of thetasketch distinct #1102
    • query requests dedup #1100
    • string type supports dictionary, which reduced memory consumption by 30% in our experiments #993, #1049, #1068
  • Improve the performance of recovery
    • open obkv wal with more parallelism #1129

Full Changelog: v1.2.4...v1.2.5

v1.2.4

12 Jul 07:58

Major Features

  • Support shard based recovery to improve performance #976

  • Improve the stability of Kafka based wal

    • Some enhancements on Kafka client #980 #1005
    • Refactor wal deletion algorithm #1064
  • Improve performance of query and write

    • Introduce parquet page filter to accelerate query #664
    • Optimize bloom filter building process to accelerate flush #967 #975
  • Support more object store backends

    • Support S3 as object store backend #969
    • Support OceanBase as object store backend (unstable) #887 #970 #971
  • Other new features

    • Support hex literal in sql #1030
    • Improve sst-metadata tool for better debugging #1019

Full Changelog: v1.2.2...v1.2.4

v1.2.2

31 May 09:14
32d3d36

Major Features

  • Enhancement on proxy module:
    • #875 Support influxdb api with proxy
    • #886 #896 Refactor sql query with proxy
    • #932 Support querying partitioned sub-table by http query
    • #878 Support forwarding of Prometheus remote write request
  • Improvement of write performance:
    • #879 Support merge small write requests
    • #918 Skip building column schema when columns already exist
  • Enhancement on debugging tools:
    • #909 Support tokio console for debugging
    • #927 Add sst-metadata tool to query sst metadata

Bug Fix

  • Cluster
    • #908 Fix the problem that shard cannot be closed normally
    • #941 Fix the problem that the shard cannot be opened normally due to deadlock
  • Compaction
    • #910 Limit input sst size when compact for old bucket
    • #915 Ensure pick at least 2 files for compaction
  • Proxy
    • #911 Fix auto create table without CeresMeta
  • PromQL
    • #901 Fix reserve column case when build plan

Full Changelog: v1.2.0...v1.2.2

v1.2.0

08 May 06:27
3fadba6

Upgrade Guide

NOTE: this guide only applies when upgrading CeresDB v1.1.0 to CeresDB v1.2.0; ignore it if you are deploying a brand new CeresDB cluster with v1.2.0.

v1.2.0 contains some incompatible changes, so upgrade carefully:

  1. First, stop all the instances of CeresDB and CeresMeta;
  2. Upgrade CeresMeta first by referring to the Upgrade Guide of CeresMeta;
  3. When upgrading CeresDB, update the config:
    • Change the config section [analytic.compaction_config] to [analytic.compaction] if you use it;
    • Add the config section [cluster_deployment.etcd_client] if your CeresDB cluster is in WithMeta mode:
[cluster_deployment.etcd_client]
server_addrs = ['127.0.0.1:2379']
root_path = "/rootPath"

NOTE: the root_path must be /rootPath if upgrading from v1.1.0.
  4. After updating the CeresDB config, start the CeresDB server.
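
The section rename in step 3 can be scripted when many config files need updating. The following is a minimal sketch, assuming the section header sits on its own line. `rename_section` is a hypothetical helper, and the key inside the sample config is made up purely for illustration.

```python
import re

OLD_SECTION = "analytic.compaction_config"
NEW_SECTION = "analytic.compaction"


def rename_section(config_text: str, old: str, new: str) -> str:
    """Rename a `[old]` section header to `[new]`, leaving everything else as-is.

    Only an exact header on its own line is rewritten; nested headers such as
    `[old.child]` are deliberately left untouched.
    """
    pattern = r'^([ \t]*)\[' + re.escape(old) + r'\]([ \t]*)$'
    return re.sub(
        pattern,
        lambda m: f"{m.group(1)}[{new}]{m.group(2)}",
        config_text,
        flags=re.MULTILINE,
    )


# `example_option` is a hypothetical key, used only to show that the
# section body is preserved.
old_config = """[analytic.compaction_config]
example_option = 1

[cluster_deployment.etcd_client]
root_path = "/rootPath"
"""

new_config = rename_section(old_config, OLD_SECTION, NEW_SECTION)
print(new_config)
```

Running the rename a second time is a no-op, so the script is safe to apply repeatedly across a fleet of config files.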

Major Features

  • Enhancement on InfluxQL support:
    • Support query with aggregators;
    • #854 optimize influxql planner to load all tables on demand instead of loading them when initializing the planner;
    • Replace influxdb_iox with CeresDB/influxql to remove unnecessary dependencies introduced by influxdb_iox;
  • Enhancement on proxy module:
    • Implement the proxy as a separate module;
    • Support forwarding table requests in proxy;
    • Support read and write on partition table in proxy;
    • Recover the metadata of partition table from CeresMeta instead of CeresDB in proxy;
  • Improvement of write performance:
    • #822 solves the problem that compaction schedule triggered by flush procedure may block the write procedure;
    • #814 is a big change set, and replaces the write queue with the lock on table level for less write contentions;
    • #843 adjusts the flush strategy to avoid frequent write stall;
    • #861 brings the level 1 to SSTs, and currently the SST of the level 0, which is generated by flushing, won't contain complex indexes, e.g. xor-filter, leading to faster flushing and less write stall;
  • Enhancement on observability:
    • #774 introduces the hotspot recorder that can be used to find out the top tables with the highest write/read throughput in a specific time window;
    • #827 #831 provides more metrics for all the stages of writing procedure, which can be used to troubleshoot write performance problems, and the grafana dashboard config has been already updated.
    • #817 introduces the CPU profiler, and the flamegraph of CPU can be generated easily just by an HTTP request to CeresDB server;
  • Support the new mechanism of failover and load balancing; for more details, refer to the Release Note v1.2.0 of CeresMeta:
    • #706 #853 implements the distributed locks for shard based on ETCD, and opening and closing of shards is only allowed with the shard lock held, and after that, data corruption caused by multiple shard leaders will be avoided completely;
    • Support automatic failover of CeresDB nodes, that is to say, the service recovery can be handled automatically without any manual intervention;
    • Support automatic load balance based on consistent hashing, which can ensure that shards are evenly distributed on each node of the cluster when the number of the cluster nodes increases or decreases;
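
To illustrate why consistent hashing suits this kind of load balancing, here is a generic hash-ring sketch with virtual nodes. It is not CeresMeta's actual algorithm: the `HashRing` class, the `assign` helper, the node names, and the choice of SHA-256 are all assumptions made for the example. The property it shows is that adding a node only moves shards onto the new node and never shuffles shards between existing nodes.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """A generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes: Iterable[str], vnodes: int = 64) -> None:
        # Each node is placed on the ring `vnodes` times to smooth the spread.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        if not self._ring:
            raise ValueError("at least one node is required")
        self._points = [point for point, _ in self._ring]

    def node_for(self, shard_id: int) -> str:
        # Walk clockwise to the first ring point after the shard's position.
        position = _hash(f"shard-{shard_id}")
        index = bisect.bisect_right(self._points, position) % len(self._ring)
        return self._ring[index][1]


def assign(shards: Iterable[int], nodes: List[str]) -> Dict[int, str]:
    """Assign every shard to a node using a fresh ring."""
    ring = HashRing(nodes)
    return {shard: ring.node_for(shard) for shard in shards}


shards = range(32)
before = assign(shards, ["node-a", "node-b"])
after = assign(shards, ["node-a", "node-b", "node-c"])

# Shards that changed owner after node-c joined the cluster.
moved = [s for s in shards if before[s] != after[s]]
print(f"{len(moved)} of {len(before)} shards moved, all to node-c:",
      all(after[s] == "node-c" for s in moved))
```

With more virtual nodes per physical node the distribution tends to become more even, at the cost of a larger ring to search.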

Thanks

Heartfelt thanks to @zouxiang1993 for helping troubleshoot write performance issues.

v1.1.0

30 Mar 09:30
50fe782

Major features

  • Initial support for InfluxQL, usage can be found here.
  • Introduce proxy module:
    • Support automatically updating the schema when a new column appears in the write request;
    • Support forwarding stream SQL queries, usage can be found here.
  • Optimize SST write/read process with less memory consumption.
  • The explain analyze [SQL] statement can now show the details of the scan procedure.

What's Changed

  • chore: push nightly image to ghcr.io by @jiacai2050 in #684
  • fix: nightly docker image name by @jiacai2050 in #687
  • refactor: move grpc create table to sql crates by @jiacai2050 in #689
  • chore: image name must be lowercase by @jiacai2050 in #691
  • feat: add go sdk tests by @jiacai2050 in #686
  • refactor: replace tokio lock with std lock in some sync scenarios by @ShiKaiWi in #694
  • docs: change the description of qr-code by @archerny in #697
  • fix: Parse string without specify time zone to local time stamp by @MachaelLee in #692
  • ci: upgrade the Rust version used in CI by @ShiKaiWi in #700
  • refactor: remove table flush policy by @ShiKaiWi in #704
  • fix: avoid file purge when they are used in queries by @jiacai2050 in #699
  • feat: support rewrite basic raw query in influxql by @Rachelint in #683
  • refactor: remove cluster version by @ShiKaiWi in #669
  • chore: update pull_request_template.md by @hehex9 in #708
  • chore: modify code coverage trigger conditions by @chunshao90 in #709
  • chore: add concrete tag to log when write failed by @jiacai2050 in #707
  • feat: support the simplest influxql raw query by @Rachelint in #710
  • feat: replace native-tls with rustls by @dust1 in #701
  • feat: support auto create table config by @MachaelLee in #713
  • feat: support integration tests for influxql by @ShiKaiWi in #719
  • refactor: remove custom oss impl by @jiacai2050 in #720
  • docs: replace qr-code by @archerny in #725
  • feat: add influxdb write by @jiacai2050 in #723
  • fix: Panicked when OceanBase table client is initialing by @MachaelLee in #728
  • chore: update issue template by @jiacai2050 in #731
  • feat: new crate trace_metric for collecting metrics in read procedure by @ShiKaiWi in #714
  • fix: div zero when compaction by @ShiKaiWi in #734
  • fix: failure to open a single table does not interrupt the shard's opening process by @MachaelLee in #722
  • fix: remove compaction retry when memory limit by @MachaelLee in #739
  • feat: http debug api for config by @ShiKaiWi in #733
  • feat: reuse logical planner in influxdb_iox by @Rachelint in #730
  • fix: start http server after table recovery finished by @jiacai2050 in #741
  • feat: return error while encountering unsupport from in influxql by @Rachelint in #745
  • feat: don't allow create table which failed when open by @jiacai2050 in #743
  • feat: remove replace table level metrics with aggregate metrics by @ShiKaiWi in #740
  • feat: introduce proxy module by @chunshao90 in #732
  • feat: implement route cache by @MachaelLee in #748
  • feat: block all query requests by @MachaelLee in #751
  • chore: modify workflows by @chunshao90 in #744
  • feat: build sst in stream way by @ShiKaiWi in #747
  • feat: auto add column by @chunshao90 in #749
  • fix: move table engine proxy to table_engine crate by @Rachelint in #755
  • chore: upgrade influxql-logical-planner version and modify CI setting by @chunshao90 in #753
  • feat: support split write request to batches for small wal logs by @ShiKaiWi in #754
  • chore: use influxdb line protocol in crates by @jiacai2050 in #757
  • feat: configurable record batches in flight by @ShiKaiWi in #759
  • fix: rename as_bytes to as_byte in ReadableSize (#428) by @zouxiang1993 in #767
  • feat: convert the influxql result using influxdb format by @Rachelint in #758
  • chore: fix insert-license pre-commit by @jiacai2050 in #756
  • chore: replace unfold with async-stream by @chunshao90 in #763
  • feat: make influxql query interface compatible with influxdb1.8 by @Rachelint in #773

Full Changelog: v1.0.0...v1.1.0

v1.0.0

01 Mar 07:52
9e7b3de

Major features

  • Support prometheus remote storage protocol
  • Query performance improvement and resource control
    • Replace bloom filter with XOR8 filter
    • Add timeout for query
    • Add route cache in remote engine client in table partition
    • Support locating the partition for SQL IN expressions in table partition
  • Internal refactor
    • Refactor sst module for better extensibility
    • Refactor manifest module
  • Refactor grpc storage service
  • Add integration test for cluster mode
  • Bug fix
    • Fix sql identifier case sensitivity
    • Correct the order of sync meta snapshot and clean logs in wal on kafka
    • Update flushed_sequence_num after compaction
    • Fix varbinary type error

v1.0.0-alpha02

30 Dec 13:46
1711d67

Major features

  • Table Partition
    • Support key partition (MySQL-like syntax).
    • Implement PartitionTable to support query and write for partitioned tables.
  • Improve query performance
    • Add disk based object store cache. (unstable)
    • Implement parallel get_byte_ranges for ObjectStoreReader.
    • Scan row groups in one sst file in parallel.
    • Support bloom filter in hybrid format.
    • Support MergeIterator to pull data concurrently.
  • Support auto query forwarding for grpc.
  • Normalize the case of SQL and make clear that SQL is case-sensitive.
  • Chore
    • Migrate current harness tests to sqlness. by @dust1
    • Support memory usage limit on background compaction.
    • Make the bloom filter optional in sst meta.
  • Bug fix
    • Fix wrong primary key when defining tsid and timestamp key as primary key.
    • Fix wrong path in the result from StoreWithPrefix.
    • Fix lru-weighted-cache memory leak.
    • Fix some bugs in background compaction.
    • Fix wrong profile output os path for heap profiling.

Full Changelog: v1.0.0-alpha01...v1.0.0-alpha02