v4.3.0_CE_BETA

@devil-ming devil-ming released this 28 Mar 09:39
· 1070 commits to develop since this release

Version information

Information | Description
Release date | March 28, 2024
Version | V4.3.0_CE_BETA
Commit number | 0193a34
Release note | This Beta version resolves most known issues and is increasingly stable. However, minor issues or errors may remain to be addressed in the final stable release, so we recommend that you use this version in a testing environment.

Overview

OceanBase Database V4.3.0 is released to accommodate typical analytical processing (AP) scenarios in addition to transaction processing (TP) and lightweight AP scenarios. It provides a columnar engine based on the log-structured merge-tree (LSM-tree) architecture to implement integrated row- and column-based data storage. Powered by a new vectorized engine that is based on column data format descriptions and a cost model that is based on columnar storage, this version supports efficient processing of wide tables. This greatly improves the query performance in AP scenarios while ensuring the performance in TP business scenarios. Moreover, the new version includes a materialized view feature, which pre-evaluates and stores view query results to enhance real-time query performance. You can use materialized views for quick report generation and data analytics. The kernel of this version supports online DDL and tenant cloning features, optimizes the parallel DML (PDML) and node restart performance, and improves the bypass import efficiency for data of large object (LOB) types. It also supports AWS Simple Storage Service (S3) as the backup and restore media, optimizes system resource usage, and provides features such as index usage monitoring and local import to improve the ease of use of the system. We recommend that you use this version in hybrid load scenarios such as complex analytics, real-time reports, real-time data warehousing, and online transactions.

Key features

Key AP features

  • Columnar engine

    Columnar storage is crucial for AP databases in scenarios involving complex analytics or ad-hoc queries on large amounts of data. Columnar storage differs from row-based storage in that it physically arranges table data by column. When data is stored in columnar format, the engine can scan only the columns required for query evaluation instead of entire rows. This reduces I/O and memory usage and speeds up evaluation. In addition, columnar storage naturally lends itself to higher compression ratios, which reduces the storage space and network bandwidth required.
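
    The difference between the two layouts can be sketched in a few lines of Python. This is a toy illustration only, not the OceanBase storage format; the table, column names, and helper functions are hypothetical.

```python
# Hypothetical three-column table; names and values are illustrative only.
rows = [
    (1, "alice", 120.0),
    (2, "bob", 80.5),
    (3, "carol", 99.5),
]

# Row-based layout: the fields of each record are stored together.
row_store = list(rows)

# Columnar layout: the values of each column are stored together.
column_store = {
    "id": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}

def sum_amount_row_store(store):
    """Scan whole records even though only one column is needed."""
    values_touched = 0
    total = 0.0
    for record in store:
        values_touched += len(record)  # every field of the record is read
        total += record[2]
    return total, values_touched

def sum_amount_column_store(store):
    """Read only the 'amount' column."""
    column = store["amount"]
    return sum(column), len(column)

row_total, row_touched = sum_amount_row_store(row_store)
col_total, col_touched = sum_amount_column_store(column_store)
```

    Both functions return the same total, but the columnar version reads a third of the values in this example, which is the effect described above.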

    However, common columnar engines are generally implemented on the assumption that column-organized data is static and not subject to massive random updates. Under massive random updates, performance issues are unavoidable. The LSM-tree architecture of OceanBase Database resolves this problem by processing baseline data and incremental data separately. OceanBase Database V4.3.0 therefore supports a columnar engine built on the existing architecture. It implements columnar storage and row-based storage on the same OBServer node with a single code base and architecture, ensuring both TP and AP query performance.

    The columnar engine is optimized in terms of optimizer, executor, DDL processing, and transaction processing modules to facilitate AP business migration and improve ease of use in the new version. Specifically, a columnar storage-based new cost model and a vectorized engine are introduced, the query pushdown feature is extended and enhanced, and new features such as the Skip Index attribute, new column-based encoding algorithm, and adaptive compactions are provided.

    You can flexibly set a table in your business system as a row-based store table, columnar storage table, or row/column redundant table based on the load type.

  • New vectorized engine

    Earlier versions of OceanBase Database implemented a vectorized engine based on uniform data descriptions. It markedly improves performance compared with non-vectorized engines but falls short in deep AP scenarios. OceanBase Database V4.3.0 implements vectorized engine 2.0, which is based on column data format descriptions. This avoids the memory, serialization, and read/write overheads caused by ObDatum maintenance. Based on the column data format descriptions, OceanBase Database V4.3.0 also reimplements more than 10 common operators such as HashJoin, AGGR, HashGroupBy, and Exchange (DTL Shuffle), and about 20 MySQL expressions including relational operation, logical operation, and arithmetic operation expressions. Based on the new vectorized engine, OceanBase Database will implement more operators and expressions in later V4.3.x versions to achieve higher performance in AP scenarios.

  • Materialized views

    The materialized view feature is introduced in OceanBase Database V4.3.0 and is a key feature for AP business scenarios. It pre-evaluates and stores view query results, which improves query performance and simplifies query logic by reducing real-time evaluation. This feature applies to quick report generation and AP scenarios.

    A materialized view stores query result sets to improve the query performance and depends on the data in the base table. When data in the base table changes, the data in the materialized view must be updated accordingly to ensure synchronization. Therefore, this version introduces a materialized view refresh mechanism, which supports two strategies: complete refresh and fast refresh. In a complete refresh of a materialized view, the system re-executes the query statements corresponding to the materialized view and overwrites the original query result sets with new ones. Complete refreshes apply to scenarios with small amounts of data. In a fast refresh, the system needs to process only the data changed since the last refresh. To implement accurate fast refreshes, OceanBase Database provides the materialized view log feature, which is similar to Oracle Materialized View Log (MLOG). The incremental data updates in the base table are logged to ensure that materialized views can be fast refreshed. Fast refreshes apply to business scenarios with large amounts of data and frequent data changes.
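
    The two refresh strategies can be illustrated with a small Python model. This is a conceptual sketch, not how OceanBase implements refreshes; the base table, the `mlog` list, and all function names are hypothetical. The view is assumed to be defined as a per-region sum and row count.

```python
# Hypothetical base table: order_id -> (region, amount).
base_table = {
    1: ("east", 100),
    2: ("west", 50),
    3: ("east", 25),
}

# Hypothetical materialized view log: one (operation, region, amount)
# entry per base-table change since the last refresh.
mlog = []

def complete_refresh(table):
    """Re-run the defining query from scratch: SUM(amount), COUNT(*) per region."""
    view = {}
    for region, amount in table.values():
        total, count = view.get(region, (0, 0))
        view[region] = (total + amount, count + 1)
    return view

def fast_refresh(view, log):
    """Apply only the logged changes to the stored result, then clear the log."""
    for op, region, amount in log:
        sign = 1 if op == "INSERT" else -1  # anything else is treated as DELETE
        total, count = view.get(region, (0, 0))
        total, count = total + sign * amount, count + sign
        if count == 0:
            view.pop(region, None)  # the group no longer has any rows
        else:
            view[region] = (total, count)
    log.clear()
    return view

def insert_row(order_id, region, amount):
    base_table[order_id] = (region, amount)
    mlog.append(("INSERT", region, amount))

def delete_row(order_id):
    region, amount = base_table.pop(order_id)
    mlog.append(("DELETE", region, amount))

mv = complete_refresh(base_table)  # initial build of the view
insert_row(4, "west", 70)
delete_row(3)
mv = fast_refresh(mv, mlog)        # touches two log entries, not the whole table
```

    After the fast refresh, `mv` matches what a complete refresh of the changed base table would produce, while the work done is proportional to the number of logged changes rather than to the table size.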

Kernel enhancements

  • Enhancement of the row-based cost estimation system

    As the OceanBase Database version evolves, more cost estimation methods are supported for optimizers. For row-based cost estimation by each operator, a variety of algorithms, such as storage-layer cost estimation, statistics cost estimation, dynamic sampling, and default statistics, are supported. However, no clear cost estimation strategies or control measures are available. OceanBase Database V4.3.0 restructures the row-based cost estimation system. Specifically, it prioritizes cost estimation strategies based on scenarios and provides methods such as hints and system variables for manually intervening in the selection of a cost estimation strategy. This version also enhances the predicate selectivity and number of distinct values (NDV) calculation framework to improve the accuracy of cost estimation by optimizers.

  • Enhancement of the statistics feature

    OceanBase Database V4.3.0 improves the statistics feature in terms of functionality, statistics collection performance, compatibility, and ease of use. Specifically, this version restructures the offline statistics collection process to improve the statistics collection efficiency, and optimizes the statistics collection strategies. By default, OceanBase Database of this version automatically collects information about index histograms and uses derived statistics. This version ensures transaction-level consistency of statistics collected online. It is compatible with the DBMS_STATS.COPY_TABLE_STATS procedure of Oracle to copy statistics and compatible with the ANALYZE TABLE statement of MySQL to support more syntaxes. Moreover, this version provides a command to cancel statistics collection, supports statistics collection progress monitoring, and enhances the ease of maintenance. It also supports parallel deletion of statistics.

  • Adaptive cost model

    In earlier versions of OceanBase Database, the cost model uses constant parameters evaluated by internal servers as hardware system statistics. It uses a series of formulas and constant parameters to describe the execution overhead of each operator. In actual business scenarios, different hardware environments can provide different CPU clock frequencies, sequential/random read speeds, and NIC bandwidths. The differences may contribute to cost estimation deviations. Due to the deviations, the optimizer cannot always generate the optimal execution plan in different business environments. This version optimizes the implementation of the cost model. The cost model can use the DBMS_STATS package to collect or set system statistics parameters to adapt to the hardware environment. The DBA_OB_AUX_STATISTICS view is provided to display the system statistics parameters of the current tenant.

  • Fixing of session variables for function indexes

    When a function index is created on a table, a hidden virtual generated column is added to the table and defined as the index key of the function index. The values of the virtual generated column are stored in the index table. The results of some built-in system functions are affected by session variables. The evaluation result of a function varies based on the values of session variables, even if the input arguments are the same. When a function index or generated column is created in this version, the dependent session variables are fixed in the schema of the index column or generated column to improve stability. When values of the index column or generated column are calculated, fixed values are used and are not affected by variable values in the current session. In OceanBase Database V4.3.0, system variables that can be fixed include timezone_info, nls_format, nls_collation, and sql_mode.
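
    The idea of fixing a session variable in the schema can be sketched as follows. This is an illustration under stated assumptions: `tz_offset_hours` is a hypothetical stand-in for a time-zone-related session variable, and the `FunctionIndexColumn` class is invented for this example rather than taken from OceanBase.

```python
from datetime import datetime, timedelta, timezone

def format_local_hour(utc_ts, tz_offset_hours):
    """A function whose result depends on a time zone setting."""
    local = utc_ts.astimezone(timezone(timedelta(hours=tz_offset_hours)))
    return local.strftime("%Y-%m-%d %H")

class FunctionIndexColumn:
    """Hypothetical schema entry for the hidden generated column."""

    def __init__(self, session_vars):
        # Snapshot the dependent variable when the index is created.
        self.fixed_tz_offset_hours = session_vars["tz_offset_hours"]

    def index_key(self, utc_ts, current_session_vars):
        # current_session_vars is deliberately ignored: the fixed value is used.
        return format_local_hour(utc_ts, self.fixed_tz_offset_hours)

creator_session = {"tz_offset_hours": 8}
other_session = {"tz_offset_hours": -5}
ts = datetime(2024, 3, 28, 20, 0, tzinfo=timezone.utc)

col = FunctionIndexColumn(creator_session)
key_from_creator = col.index_key(ts, creator_session)
key_from_other = col.index_key(ts, other_session)
unpinned_result = format_local_hour(ts, other_session["tz_offset_hours"])
```

    Both sessions compute the same index key, whereas evaluating the function directly with the second session's setting gives a different result. That divergence is what fixing the variable is meant to prevent.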

  • Online DDL extension in MySQL mode

    OceanBase Database of this version supports online DDL operations for column type changes in more scenarios, including:

    • Conversion of integer types: Online DDL operations, instead of offline DDL operations, are performed to change the data type of a primary key column, index column, generated column, column on which a generated column depends, or column with a UNIQUE or CHECK constraint to an integer type with a larger value range.
    • Conversion of the DECIMAL data type: For columns that support the DECIMAL data type, online DDL operations are performed to increase the precision within any of the [1,9], [10,18], [19,38], and [39,76] ranges without changing the scale.
    • Conversion of the BIT or CHAR data type: For columns that support the BIT or CHAR data type, online DDL operations are performed to increase the width.
    • Conversion of the VARCHAR or VARBINARY data type: For columns that support the VARCHAR or VARBINARY data type, online DDL operations are performed to increase the width.
    • Conversion of the LOB data type: To change the data type of a column that supports LOB data types to a LOB data type with a larger value range, offline DDL operations are performed for columns of the TINYTEXT or TINYBLOB data type, and online DDL operations are performed for columns of other data types.
    • Conversion between the TINYTEXT and VARCHAR data types: For columns that support the TINYTEXT data type, online DDL operations are performed to change the VARCHAR(x) data type to the TINYTEXT data type if x <= 255, and offline DDL operations are performed if otherwise. For columns that support the VARCHAR data type, online DDL operations are performed to change the TINYTEXT data type to the VARCHAR(x) data type if x >= 255, and offline DDL operations are performed if otherwise.
    • Conversion between the TINYBLOB and VARBINARY data types: For columns that support the TINYBLOB data type, online DDL operations are performed to change the VARBINARY(x) data type to the TINYBLOB data type if x <= 255, and offline DDL operations are performed if otherwise. For columns that support the VARBINARY data type, online DDL operations are performed to change the TINYBLOB data type to the VARBINARY(x) data type if x >= 255, and offline DDL operations are performed if otherwise.
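
    Several of the rules above reduce to simple threshold checks. The following Python sketch encodes only the DECIMAL and TINYTEXT/VARCHAR (TINYBLOB/VARBINARY) rules as stated in this list; the function names are hypothetical, and a `False` result for DECIMAL means only that the change is not covered by the online rule quoted above.

```python
# Precision ranges taken from the DECIMAL rule above.
DECIMAL_PRECISION_RANGES = [(1, 9), (10, 18), (19, 38), (39, 76)]

def decimal_change_is_online(old_precision, new_precision, old_scale, new_scale):
    """True when precision grows inside one listed range and scale is unchanged."""
    if new_scale != old_scale or new_precision <= old_precision:
        return False
    return any(
        low <= old_precision and new_precision <= high
        for low, high in DECIMAL_PRECISION_RANGES
    )

def varchar_to_tinytext_mode(x):
    """VARCHAR(x) -> TINYTEXT; the same rule is stated for VARBINARY(x) -> TINYBLOB."""
    return "online" if x <= 255 else "offline"

def tinytext_to_varchar_mode(x):
    """TINYTEXT -> VARCHAR(x); the same rule is stated for TINYBLOB -> VARBINARY(x)."""
    return "online" if x >= 255 else "offline"
```

    For example, `decimal_change_is_online(5, 9, 2, 2)` is true because both precisions fall in [1,9], while `decimal_change_is_online(9, 10, 2, 2)` is false because the change crosses a range boundary.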

  • Globally unique client session IDs

    If OceanBase Database is earlier than V4.3.0 or OceanBase Database Proxy (ODP) is earlier than V4.2.3, executing the SHOW PROCESSLIST statement in ODP returns the client session ID of ODP, whereas querying the session ID by using an expression such as connection_id or from a system view returns the server session ID. One client session ID corresponds to multiple server session IDs, so no single ID identifies a session across the entire link. This makes session information confusing to query and complicates user session management. This version restructures the client session ID generation and maintenance process. If OceanBase Database is of V4.3.0 or later and ODP is of V4.2.3 or later, the client session ID is returned when you query a session ID by executing the SHOW PROCESSLIST statement, from the information_schema.PROCESSLIST or GV$OB_PROCESSLIST view, or by using the connection_id, userenv('sid')/userenv('sessionid'), or sys_context('userenv','sid')/sys_context('userenv','sessionid') expression. You can manage client sessions by using the KILL statement in SQL or PL. If OceanBase Database or ODP does not meet the version requirement, the handling method in earlier versions is used.

  • Renovation of the log stream state machine

    In this version, the status of a log stream is subject to the memory status and persistence status. The persistence status indicates the lifecycle of the log stream. After the log stream is restarted upon a server breakdown, the presence status and memory status of the log stream are determined based on the persistence status. The memory status is the running status of the log stream. It indicates the overall status of the log stream and the status of key submodules. Based on the explicit status and status sequence of the log stream, underlying modules can determine which operations of the log stream are safe and whether the log stream has changed from one state to another and then changed back to the original state. The working status and performance of a log stream after it is restarted upon a server breakdown are optimized for backup and restore processes and migration processes. This improves the stability of log stream features and enhances the concurrency control over log streams.

  • Tenant cloning

    OceanBase Database V4.3.0 introduces the tenant cloning feature. You can execute the CREATE TENANT new_tenant_name FROM source_tenant_name WITH RESOURCE_POOL [=] resource_pool_name, UNIT [=] unit_config statement in the sys tenant to clone the specified tenant. The cloned tenant is a standby tenant. You can execute the ALTER SYSTEM ACTIVATE STANDBY TENANT new_tenant_name statement to switch the cloned tenant to the PRIMARY role to provide services. The cloned tenant and original tenant share the physical macroblocks. However, new data changes and resource usage are isolated by tenant. If you want to perform temporary data analysis or other risky operations with high resource consumption on an online tenant, you can clone the tenant and perform analysis or verification on the cloned tenant to avoid affecting the online tenant. You can also clone a tenant for disaster recovery. When an unrecoverable misoperation is performed on the original tenant, you can use the cloned tenant for data rollback.

Compatibility with MySQL

  • utf8mb4_unicode_ci and utf16_unicode_ci collations

    This version supports the utf8mb4_unicode_ci and utf16_unicode_ci collations.

Performance improvements

  • OceanBase zlib_lite_1.0 compression algorithm

    OceanBase Database V4.3.0 introduces the zlib_lite_1.0 data compression algorithm. This compression algorithm is based on Intel In-Memory Analytics Accelerator (Intel® IAA) and makes full use of hardware features to improve the data compression and decompression performance while ensuring compatibility with the conventional zlib library. On a server equipped with Intel IAA, the zlib_lite_1.0 compression algorithm can remarkably improve the compression and decompression performance. When hardware acceleration conditions are not met, the software acceleration method is automatically used to ensure compatibility across platforms.

  • PDML transaction optimization

    This version supports parallel commit and log replay at the transaction layer, and provides partition-level rollback inside transaction participants, which helps significantly improve the DML execution performance in high concurrency scenarios in contrast to earlier V4.x versions.

  • Optimization of I/O usage in loading tablet metadata

    OceanBase Database V4.x supports millions of partitions on a single server. Because the metadata of millions of tablets cannot all be held in memory, metadata is loaded on demand. On-demand loading is supported at the partition and subcategory levels. Metadata in a partition is divided into different subcategories for layered storage. When a background task requires deep-level metadata, reading the data incurs a high I/O overhead. This overhead is acceptable for a local solid-state disk (SSD) but may compromise system performance when a hard disk drive (HDD) or a cloud disk is used. This version aggregates frequently accessed metadata for storage, reducing the number of I/O operations required to access the metadata to one. This greatly decreases the I/O overhead when there is no load and prevents the I/O overhead of background tasks from affecting foreground query performance. The process of loading metadata upon an OBServer node restart is also optimized. Specifically, tablet metadata is batch loaded based on macroblocks. This significantly reduces discrete read I/O operations and makes restarts several times or even dozens of times faster.

High availability enhancements

  • Proactive broadcasting/refreshing of tablet locations

    OceanBase Database provides the periodic location cache refreshing mechanism to ensure that the location information of log streams is updated in real time and is consistent. However, tablet location information can only be passively refreshed. Changes in the mappings between tablets and log streams can trigger SQL retries and read/write errors with a certain probability. OceanBase Database V4.3.0 supports proactive broadcasting of tablet locations to reduce SQL retries and read/write errors caused by changes in mappings after transfer. This version also supports proactive refreshing to avoid unrecoverable read/write errors.

  • AWS S3 supported for backup and restore

    OceanBase Database supports Network File System (NFS), Alibaba Cloud Object Storage Service (OSS), and Tencent Cloud Object Storage (COS) as the storage media for the backup and restore feature in earlier versions. OceanBase Database V4.3.0 further supports AWS S3 as the storage media for backup and restore. You can use AWS S3 as the destination for log archiving and data backup, and use the backup data on AWS S3 for physical restore.

  • Memory scaling limitations

    This version improves the stability in memory scaling and avoids out-of-memory (OOM) errors caused by an improper memory_limit value. Two conditions must be met for the memory_limit setting to take effect on an OBServer node: the reserved memory of the sys500 tenant is not less than the occupied memory, and the value of memory_limit is greater than the sum of the value of system_memory and the memory allocated to resource units. If either condition is not met, when you set the memory_limit parameter, no error is returned but the parameter setting does not take effect.
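
    The two conditions can be written as a small pre-check. This is a minimal sketch: the function and argument names are hypothetical, all values are in bytes, and the first condition is read here as "the memory reserved for the sys500 tenant is at least the memory it currently occupies".

```python
GIB = 1024 ** 3

def memory_limit_takes_effect(memory_limit, system_memory, unit_memory_total,
                              sys500_reserved, sys500_used):
    """Return True only when both conditions described above hold."""
    sys500_ok = sys500_reserved >= sys500_used
    limit_ok = memory_limit > system_memory + unit_memory_total
    return sys500_ok and limit_ok

# Illustrative numbers only, not recommended settings.
accepted = memory_limit_takes_effect(
    memory_limit=64 * GIB, system_memory=8 * GIB, unit_memory_total=40 * GIB,
    sys500_reserved=4 * GIB, sys500_used=2 * GIB,
)
ignored = memory_limit_takes_effect(
    memory_limit=40 * GIB, system_memory=8 * GIB, unit_memory_total=40 * GIB,
    sys500_reserved=4 * GIB, sys500_used=2 * GIB,
)
```

    Because the server returns no error when a condition fails, a check like this can help explain why a new memory_limit value did not take effect.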

  • Active transaction transfer

    In the log stream design in OceanBase Database V4.x, data is managed in the unit of tablet and logs are managed in the unit of log stream. Tablets are aggregated in a log stream to avoid two-phase commit of transactions in the log stream. To achieve a balance of data and traffic among different log streams, OceanBase Database allows you to flexibly transfer tablets among log streams. However, during the transfer, an active transaction may still be operating on data, which may compromise the atomicity, consistency, isolation, and durability (ACID) guarantees of the transaction. For example, if the data of an active transaction at the source is not fully transferred to the destination, the atomicity of the transaction cannot be ensured. In versions earlier than V4.3.0, OceanBase Database terminates active transactions during the transfer, which disrupts normal transaction execution. To resolve this issue, OceanBase Database V4.3.0 supports the transfer of active transactions. This allows active transactions to keep executing in parallel with the transfer and avoids transaction rollback or inconsistency caused by the transfer.

Resource usage optimization

  • MINIMAL mode of transaction logs

    This version restructures the MINIMAL mode of transaction logs to optimize the implementation of the MINIMAL feature in earlier versions and improve the stability of the feature. Enabling the MINIMAL mode significantly decreases the volume of clogs generated for UPDATE and DELETE statements, thereby reducing the resource overheads in log storage, archiving, and transmission. This mode applies to private clouds with limited cross-city network bandwidth resources and public clouds with limited bandwidth resources for writing data to the cloud disk. In OceanBase Database V4.3.0 Beta, this feature is disabled by default because OceanBase Migration Service (OMS) has not been specifically adapted. In a scenario where no tool is required for incremental data synchronization, you can set the system variable binlog_row_image to MINIMAL to enable this feature.

  • Memory throttling mechanism

    In OceanBase Database of a version earlier than V4.x, only a few modules require freezes and minor compactions to release the memory, and most of the modules are MemTables. Therefore, a memory limit is set for MemTables and throttling logic is used to ensure that the memory usage smoothly approaches the upper limit, avoiding write stop in the system caused by sudden OOM errors. In OceanBase Database V4.x, more modules, such as the TxData module, require freezes and minor compactions to release the memory. More refined means are provided to control the memory usage of modules. The memory for the TxData and Multi Data Source (MDS) modules is limited. The two modules share the memory with MemTables. When the memory usage reaches the value of Tenant memory × _tx_share_memory_limit_percentage% × writing_throttling_trigger_percentage%, overall throttling is triggered. This version also supports triggering freezes and minor compactions for transaction data tables by time. By default, a freeze is triggered for transaction data tables every 1,800 seconds to reduce the memory usage of the TxData module.
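
    The trigger point in the formula above is a plain product of the tenant memory and two percentages. The Python sketch below evaluates it; the function name and the sample percentages are illustrative assumptions, not the defaults of these parameters.

```python
GIB = 1024 ** 3

def tx_share_throttle_trigger_bytes(tenant_memory_bytes,
                                    tx_share_memory_limit_percentage,
                                    writing_throttling_trigger_percentage):
    """Tenant memory x _tx_share_memory_limit_percentage% x writing_throttling_trigger_percentage%."""
    shared_limit = (tenant_memory_bytes * tx_share_memory_limit_percentage) // 100
    return (shared_limit * writing_throttling_trigger_percentage) // 100

# Illustrative values: a 10 GiB tenant with 50% and 60% as the two percentages.
trigger = tx_share_throttle_trigger_bytes(10 * GIB, 50, 60)
```

    With these sample values, overall throttling would start once MemTables, TxData, and MDS together use 3 GiB.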

  • Optimization of the space for temporary results of DDL operations

    Many DDL operations store temporary results in materialized structures. Here are two typical scenarios:

    1. In an index creation scenario where data is scanned from the data table and inserted into the index table, the data scanned from the data table needs to be sorted. If the memory is insufficient during the sorting, the current data in the memory will be temporarily stored in materialized structures to release the memory space for subsequent scanning. Then, the data in the materialized structures will be merged and sorted. This practice is particularly effective in the case of limited memory but requires extra disk space.
    2. In a columnar storage bypass import scenario, the system temporarily stores the data to be inserted into column groups in materialized structures, and then reads the data from the materialized structures when inserting data into each column group. These materialized structures can be used in the SORT operator to store intermediate data required for external sorting. When the system inserts data into column groups, it can cache the data to avoid extra overheads caused by repeated table scanning. This practice can prevent repeated scanning from compromising the performance, but increases the disk space occupied by temporary files.

    To resolve these issues, this version optimizes the data flow of DDL operations. Specifically, it eliminates unnecessary redundant structures to simplify the data flow. It also encodes and compresses the temporary results before storing them in the disk. This way, the disk space occupied by temporary results during DDL operations is significantly reduced, facilitating efficient use of storage resources.
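
    The sort-and-spill pattern from the first scenario, combined with compressing temporary results, can be sketched as an external merge sort. This is a conceptual illustration only: a list of byte strings stands in for temporary files, and Python's `pickle` and `zlib` modules are convenient stand-ins for whatever encoding and compression the database applies.

```python
import heapq
import pickle
import zlib

def spill_sorted_runs(rows, max_rows_in_memory):
    """Sort fixed-size chunks in memory and spill each one as a compressed run."""
    runs = []
    for start in range(0, len(rows), max_rows_in_memory):
        chunk = sorted(rows[start:start + max_rows_in_memory])
        # Encode and compress the run before it is "written to disk".
        runs.append(zlib.compress(pickle.dumps(chunk)))
    return runs

def merge_runs(runs):
    """Decompress the runs and merge them into one sorted sequence."""
    # For brevity each run is decoded in full; a real implementation
    # would stream the runs block by block.
    decoded = (pickle.loads(zlib.decompress(run)) for run in runs)
    return list(heapq.merge(*decoded))

# Hypothetical index keys scanned from a data table.
index_keys = [42, 7, 19, 3, 88, 61, 5, 27, 14]
runs = spill_sorted_runs(index_keys, max_rows_in_memory=4)
sorted_keys = merge_runs(runs)
```

    Only `max_rows_in_memory` keys are sorted at a time, and the spilled runs are stored compressed, which is the trade-off between memory and temporary disk space discussed above.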

Improvement in ease of use

  • Index monitoring

    Indexes are usually created to improve query performance. The number of indexes on a table tends to grow as business scenarios and the number of operators increase over time. Unused indexes waste storage space and increase the overhead of DML operations, so useless indexes must be continually identified and deleted to reduce the system load. However, it is difficult to identify useless indexes manually. Therefore, OceanBase Database V4.3.0 introduces the index monitoring feature. You can enable this feature for a user tenant and set sampling rules. The index usage information that meets the specified rules is recorded in memory and flushed to the internal table every 15 minutes. You can query the DBA_INDEX_USAGE view to verify whether indexes in a table are referenced and delete useless indexes to release the space.

  • RPC security certificate management

    After remote procedure call (RPC) authentication is enabled for a cluster, when a client, such as an arbitration service client, primary or standby database, or OceanBase Change Data Capture (CDC) client, initiates an access request to the cluster, you must first place the root CA certificate of the client in the deployment directory of each OBServer node in the cluster and then complete related settings. This process is complex. OceanBase Database V4.3.0 supports the internal certificate management feature. You can call the DBMS_TRUSTED_CERTIFICATE_MANAGER system package in the sys tenant to add, delete, and modify root CA certificates trusted by a cluster. You can query the DBA_OB_TRUSTED_ROOT_CERTIFICATE view in the sys tenant for the list of root CA certificates added to the cluster, as well as information about the certificates, such as the expiration time.

  • Parameter resetting

    In earlier versions, to reset a modified parameter to its default value, you must first query the default value of the parameter and then manually set the parameter to that value, which is cumbersome. In OceanBase Database V4.3.0, the ALTER SYSTEM [RESET] parameter_name [SCOPE = {MEMORY | SPFILE | BOTH}] {TENANT [=] 'tenant_name'} syntax is provided for resetting a parameter to its default value. The default value is obtained from the node that executes the statement. You can reset cluster-level parameters and the parameters of a specified user tenant from the sys tenant. You can reset the parameters of only the current tenant from a user tenant. The implementation of the SCOPE option is consistent across different versions of OceanBase Database. For parameters whose modifications take effect statically, the system only stores their default values in the disk but does not update their values in the memory. For parameters whose modifications take effect dynamically, the system updates their values in the memory and stores their default values in the disk.

  • Detailed display of parameter data types

    In OceanBase Database V4.3.0, data types of parameters are displayed in the data_type column of parameter-related views such as [G]V$OB_PARAMETERS and in the return result of the SHOW PARAMETERS statement. For example, the data type of the log_disk_size parameter is CAPACITY, that of rpc_port is INT, and that of devname is STRING.

  • INROW storage threshold for LOBs

    By default, a large object (LOB) less than or equal to 4 KB in size is stored in INROW (in-row storage) mode, and a LOB greater than 4 KB is stored in the LOB auxiliary table. Because INROW data is stored with the row, it provides higher performance than auxiliary table-based storage in some scenarios. Therefore, OceanBase Database V4.3.0 supports dynamic configuration of the LOB storage mode. You can dynamically adjust the INROW storage size as needed provided that the size does not exceed the maximum row size allowed.
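
    The placement rule is a single threshold comparison, sketched below in Python. The function name and return labels are hypothetical, and the check against the maximum row size is omitted because that limit is not stated here.

```python
DEFAULT_INROW_THRESHOLD_BYTES = 4 * 1024  # the 4 KB threshold described above

def lob_storage_mode(lob_size_bytes,
                     inrow_threshold_bytes=DEFAULT_INROW_THRESHOLD_BYTES):
    """Return where a LOB of the given size would be stored."""
    if lob_size_bytes <= inrow_threshold_bytes:
        return "INROW"
    return "LOB_AUX_TABLE"
```

    Raising the threshold, for example to 8 KB, keeps a 6,000-byte LOB in the row: `lob_storage_mode(6000, 8 * 1024)` returns `"INROW"`, whereas the default threshold sends it to the auxiliary table.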

  • Local import from the client

    OceanBase Database V4.3.0 provides the local import feature (LOAD DATA LOCAL INFILE statement) for loading data from local files on the client in streaming mode. This way, developers can directly use local files for testing without the need to upload the files to the server or object storage service, improving the working efficiency in scenarios where a small amount of data needs to be imported.

    Notice

    To use this feature, make sure that the following conditions are met:

    1. The version of OceanBase Client (OBClient) is V2.2.4 or later.
    2. The version of ODP is V3.2.4 or later, if ODP is used for connection to OceanBase Database. If you directly connect to an OBServer node, ignore this requirement.
    3. The version of OceanBase Connector/J is V2.4.8 or later, if Java and OceanBase Connector/J are used.

    Note

    • You can directly use a MySQL client or a native MariaDB client of any version.
    • The SECURE_FILE_PRIV variable specifies the privileges for accessing paths on the server. It does not affect the local import feature and therefore does not need to be specified.

Compatibility changes

Product behavioral changes

Change | Description
Client session IDs are unique in ODP in earlier versions and globally unique in the cluster since V4.3.0. | If OceanBase Database is earlier than V4.3.0 or ODP is earlier than V4.2.3, executing the SHOW PROCESSLIST statement in ODP returns the client session ID of ODP, whereas querying the session ID by using an expression such as connection_id or from a system view returns the server session ID. One client session ID corresponds to multiple server session IDs, so no single ID identifies a session across the entire link. This makes session information confusing to query and complicates user session management. This version restructures the client session ID generation and maintenance process. If OceanBase Database is of V4.3.0 or later and ODP is of V4.2.3 or later, the client session ID is returned when you query a session ID by executing the SHOW PROCESSLIST statement, from the information_schema.PROCESSLIST or GV$OB_PROCESSLIST view, or by using the connection_id, userenv('sid')/userenv('sessionid'), or sys_context('userenv','sid')/sys_context('userenv','sessionid') expression. You can manage client sessions by using the KILL statement in SQL or PL. If OceanBase Database or ODP does not meet the version requirement, the handling method in earlier versions is used.
Limitations are imposed on the memory_limit setting. | This version improves the stability in memory scaling and avoids OOM errors caused by an improper memory_limit value. Two conditions must be met for the memory_limit setting to take effect on an OBServer node: the reserved memory of the sys500 tenant is not less than the occupied memory, and the value of memory_limit is greater than the sum of the value of system_memory and the memory allocated to resource units. If either condition is not met, when you set the memory_limit parameter, no error is returned but the parameter setting does not take effect.
The zlib compression algorithm is no longer used for storage. | In OceanBase Database V4.2.0, the zlib compression algorithm is no longer supported for new tables but is still supported for existing tables. In OceanBase Database V4.3.0, the storage layer no longer supports the zlib compression algorithm. If the zlib compression algorithm is used before you upgrade your database to V4.3.0, you must change the compression algorithm for data tables or choose not to compress data tables. The zlib compression algorithm is also prohibited for the transmission of clogs and TableAPIs.
Limitations on using archive_lag_target are refined. | OceanBase Database V4.3.0 refines the limitations on using the archive_lag_target parameter.
  1. If no archive media is specified, when you modify this parameter, a message is displayed prompting that you are not allowed to change the default value of this parameter because no archive media is specified.
  2. If AWS S3 is specified as the archive media, the minimum value of this parameter is 60 seconds. An error is returned when you attempt to specify a smaller value.
  3. If OSS, NFS, or COS is specified as the archive media, you can set this parameter to any value within the value range.
  4. If OSS, NFS, or COS is specified as the archive media and the current value of the parameter is smaller than 60 seconds, an error is returned when you attempt to change the archive media to AWS S3 by using a statement.
The max_syslog_file_count parameter specifies the total number of system logs of all types. To reduce the risk that the log disk is used up after the end-to-end diagnostic feature is enabled, the max_syslog_file_count parameter now limits the total number of system log files of all types, instead of the number of log files of each type. When the limit is reached, OceanBase Database V4.3.0 evicts log files based on the first in, first out (FIFO) strategy.
Data types of parameters are displayed in the data_type column in the return result of the SHOW PARAMETERS statement. The data_type column now shows the specific data type of each parameter. The default value of the data_type column is changed from NULL to UNKNOWN.
The default values of MAX_IOPS and MIN_IOPS of resource units are changed. In OceanBase Database of a version earlier than V4.3.0, if neither MIN_IOPS nor MAX_IOPS is specified, their values are automatically calculated based on the value of MIN_CPU. To be specific, one CPU core corresponds to 10,000 IOPS, namely, MAX_IOPS = MIN_IOPS = MIN_CPU × 10000.
In OceanBase Database V4.3.0, if neither MIN_IOPS nor MAX_IOPS is specified, the default IOPS is changed to INT64_MAX, which means that IOPS resources are not limited.
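The change in default IOPS values can be illustrated with a small model. This is only a sketch of the rule stated above, not OceanBase code: the function name default_iops and the tuple-based version argument are hypothetical, introduced for illustration.

```python
# Illustrative model of the default MIN_IOPS/MAX_IOPS rule described above.
# `default_iops` and its arguments are hypothetical names, not an OceanBase API.

INT64_MAX = 2**63 - 1          # value used by V4.3.0 to mean "IOPS not limited"
IOPS_PER_CPU_CORE = 10_000     # earlier versions: one CPU core corresponds to 10,000 IOPS


def default_iops(min_cpu, version):
    """Return (MIN_IOPS, MAX_IOPS) when neither value is specified.

    `version` is a (major, minor, patch) tuple of the database version.
    """
    if version < (4, 3, 0):
        iops = int(min_cpu * IOPS_PER_CPU_CORE)
        return iops, iops
    return INT64_MAX, INT64_MAX


before = default_iops(4, (4, 2, 1))   # derived from MIN_CPU
after = default_iops(4, (4, 3, 0))    # unlimited
```

If an existing deployment relied on the implicit per-core limit, this model suggests that MIN_IOPS and MAX_IOPS should be set explicitly on V4.3.0 to keep the same behavior.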

View changes

View Change type Description
DBA_OB_TRUSTED_ROOT_CERTIFICATE New Displays the list of trusted root CA certificates of the cluster, as well as information about the certificates, such as the expiration time. You can query this view in the sys tenant.
CDB/DBA_MVIEW_LOGS New Displays information about materialized view logs. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVIEWS New Displays information about materialized views. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_STATS_SYS_DEFAULTS New Displays the system-level default values of refresh history statistical attributes for materialized views. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_STATS_PARAMS New Displays the refresh statistical attributes of each materialized view. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_RUN_STATS New Displays information about each refresh of materialized views. Each refresh is identified by a refresh ID. The information includes the timing statistics and refresh parameters of each refresh. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_STATS New Displays the basic timing statistics on each refresh of each materialized view. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_CHANGE_STATS New Displays the data changes in the base table involved in each refresh of all materialized views. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_STMT_STATS New Displays information about refresh statements. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_INDEX_USAGE New Displays the usage information of indexes. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
DBA_OB_CLONE_PROGRESS New Displays information about ongoing tenant cloning jobs. You can query this view in the sys tenant.
DBA_OB_CLONE_HISTORY New Displays information about completed tenant cloning jobs. You can query this view in the sys tenant.
CDB/DBA_OB_AUX_STATISTICS New Displays the auxiliary statistics of each tenant. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
[G]V$OB_TABLET_COMPACTION_HISTORY Modified The KEPT_SNAPSHOT column is added to show the multi-version retention timestamps. The MERGE_LEVEL column is added to show information about the reuse of macroblocks and microblocks. The width of the COMMENTS column is adjusted.
[G]V$OB_PARAMETERS Modified Data types of parameters are displayed in the DATA_TYPE column. The default value of the DATA_TYPE column is changed from NULL to UNKNOWN.
[G]V$OB_PROCESSLIST Modified The USER_CLIENT_PORT column is added to show the port number of the client.
CDB/DBA_OB_RECOVER_TABLE_JOBS New Displays information about table-level restore jobs. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_OB_RECOVER_TABLE_JOB_HISTORY New Displays the history of table-level restore jobs. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_OB_IMPORT_TABLE_JOBS New Displays information about cross-tenant import jobs. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_OB_IMPORT_TABLE_JOB_HISTORY New Displays the history of cross-tenant import jobs. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_OB_IMPORT_TABLE_TASKS New Displays information about table-level cross-tenant import tasks. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_OB_IMPORT_TABLE_TASK_HISTORY New Displays the history of table-level cross-tenant import tasks. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
[G]V$OB_SQL_AUDIT Modified The PLSQL_EXEC_TIME column is added to show the PL execution time in μs, excluding the SQL execution time.
[G]V$OB_LS_SNAPSHOTS New Displays information about physical log stream snapshots in resource units.

Parameter changes

Parameter/System variable Change type Description
enable_rpc_authentication_bypass New Specifies whether to allow OMS to connect to a cluster without undergoing RPC security authentication when RPC security authentication is enabled for the OBServer node. It is a cluster-level parameter.
default_compress_func Modified The zlib_lite_1.0 value is added, which specifies to use a zlib compression algorithm with higher performance in an environment that supports hardware acceleration. The zlib_1.0 value is deleted. The zlib_1.0 compression algorithm is prohibited for new tables.
large_query_threshold Modified The value range is changed from [1ms, +∞) to [0ms, +∞). The value 0 specifies to disable the large query identification feature.
default_table_store_format New Specifies the default format for a primary table created in a user tenant. It is a tenant-level parameter. The default value is row. If with column group is not specified during table creation, a row-based storage table is created by default. You can change the value to column (indicating a columnar storage table) or compound (indicating a redundant table with both row-based storage and columnar storage) as needed.
server_cpu_quota_min Modified The effective mode is changed from effective upon a restart to effective immediately.
server_cpu_quota_max Modified The effective mode is changed from effective upon a restart to effective immediately.

Function/PL package changes

Function/PL package Change type Description
ob_transaction_id New Queries the transaction ID of the current session. If the current session is not in an active transaction, 0 is returned. It is a built-in function.
DBMS_TRUSTED_CERTIFICATE_MANAGER New Contains three subprograms ADD_TRUSTED_CERTIFICATE, DELETE_TRUSTED_CERTIFICATE, and UPDATE_TRUSTED_CERTIFICATE, which are respectively used for adding, deleting, and modifying trusted root CA certificates of the cluster. You can call this package in the sys tenant. This package is supported when RPC authentication is enabled.

Syntax changes

  • The DDL syntax for pre-aggregating column indexes in SSTables is added.
  • The syntax for cloning tenants is added.
  • The syntax for resetting parameters is added.
  • The columnar storage and columnar storage indexing syntaxes are added.
  • Syntaxes related to materialized views are added.
  • The syntax for performing partition-level major compactions is added.

Recommended versions of tools

The following table lists the recommended versions of tools for OceanBase Database V4.3.0_CE.

Tool Version Remarks
ODP V4.2.3 -
OceanBase Cloud Platform (OCP) V4.2.2 -
OceanBase Deployer (OBD) V2.7.0 -
OceanBase Developer Center (ODC) V4.2.3 BP1 -
OceanBase CDC V4.3.0 -
OMS V4.2.3 -
OBClient V2.2.3 -
OceanBase Connector/C V2.2.3 -

Upgrade notes

  • To use OceanBase Database V4.3.0, you need to create a cluster. Smooth upgrade from an earlier version, such as V1.x, V2.x, V3.x, V4.1, or V4.2.x, to V4.3.0 Beta is not supported. If you want to upgrade to V4.3.0, deploy an OceanBase cluster of V4.3.0 and then migrate existing data to the cluster by using OBDUMPER & OBLOADER or OMS.
  • Upgrade from V4.3.0 Beta to V4.3.x will be supported later.
  • Upgrade from V4.2.x to V4.3.x will be supported in later versions.

Considerations

  • We recommend that you set the maximum concurrency for bypass import by using the following formula: Maximum concurrency = Tenant memory × 0.001 / 2 MB. If the concurrency exceeds this value, the memory for temporary files may be insufficient. When you use bypass import, we recommend that you split large files into multiple smaller ones to improve the import efficiency.
  • OMS does not support the MINIMAL mode for now. When you use OMS for incremental data synchronization, you are not allowed to set binlog_row_image to MINIMAL. The default value of this parameter is FULL.
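The recommended maximum concurrency for bypass import can be worked out as follows. This is a sketch under stated assumptions: "2 MB" is read as 2 MiB, the result is rounded down, and a floor of 1 is applied. None of these details is specified in this release note, and the function name is hypothetical.

```python
# Worked example of: maximum concurrency = tenant memory × 0.001 / 2 MB.
# Assumptions (not from the release note): 2 MB means 2 MiB, round down, at least 1.

MIB = 1024 * 1024
GIB = 1024 * MIB


def max_bypass_import_concurrency(tenant_memory_bytes):
    """Recommended upper bound on bypass import concurrency for a tenant."""
    temp_file_budget = tenant_memory_bytes * 0.001   # 0.1% of tenant memory
    return max(1, int(temp_file_budget // (2 * MIB)))


concurrency_64g = max_bypass_import_concurrency(64 * GIB)
concurrency_16g = max_bypass_import_concurrency(16 * GIB)
```

For example, under these assumptions a tenant with 64 GiB of memory gets a recommended maximum concurrency of 32, and a tenant with 16 GiB gets 8.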

Acknowledgments

In the release of this version, we want to extend our acknowledgments to:

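Version information

Item Description
Release date 2024-03-28
Version V4.3.0_CE_BETA
Commit number 0193a34
RPM version oceanbase-ce-4.3.0.1-100000242024032211
Release note The Beta version has resolved most defects and is becoming stable. We recommend that you use it in a testing environment.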

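Overview

OceanBase Database V4.3.0 targets typical AP scenarios and is no longer positioned only for TP plus lightweight AP. It introduces a columnar engine based on the LSM-tree architecture to integrate row-based and columnar data storage. Together with a new vectorized engine that is based on column data format descriptions and a cost model that is based on columnar storage, it processes wide tables efficiently and significantly improves query performance in AP scenarios while preserving performance in TP scenarios. The new materialized view feature pre-computes and stores view query results to improve real-time query performance, which supports quick report generation and data analytics. The kernel of this version also extends online DDL, supports tenant cloning, optimizes PDML and node restart performance, improves the bypass import efficiency for LOB data, supports S3 as a backup and restore medium, optimizes system resource usage, and adds features such as index usage monitoring and client-side local import to improve ease of use. We recommend this version for hybrid load scenarios such as complex analytics, real-time reports, real-time data warehousing, and online transactions.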

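Key features

Key AP features

  • Columnar engine

    Columnar storage is a key capability of AP databases in scenarios that involve complex analytics on large-scale data or ad hoc queries on massive data. Columnar storage is a way of organizing data files. Unlike row-based storage, it physically arranges the data of a table by column. With columnar storage, an analytical query scans only the columns needed for the computation instead of entire rows, which reduces I/O and memory usage and speeds up computation. Columnar storage also naturally offers better conditions for data compression and more easily achieves a high compression ratio, which reduces storage space and network bandwidth usage.

    However, common columnar engines usually assume that there are few random updates and try to keep columnar data static. When large amounts of data are actually updated at random, performance issues are unavoidable. The LSM-tree architecture of OceanBase Database processes baseline data and incremental data separately, which solves this problem. Therefore, V4.3.0 extends the existing architecture with a columnar engine and integrates columnar and row-based storage in one code base, one architecture, and one OBServer node, which ensures the performance of both TP and AP queries.

    To help users migrate AP business and to let existing customers adopt the new version smoothly, multiple modules, from the optimizer to the executor and from DDL to transaction processing, have been adapted and optimized around the columnar engine. The changes include a new cost model and vectorized engine based on columnar storage, extended and enhanced query pushdown, skip indexes, a new columnar encoding algorithm, and adaptive compactions.

    You can flexibly set a table as a row-based table, a columnar table, or a table with both row-based and columnar redundancy based on the load type.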

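  • New vectorized engine

    Earlier versions of OceanBase Database implemented a vectorized engine based on the Uniform data format description, which clearly outperforms the non-vectorized engine but still has performance gaps in deep AP scenarios. V4.3.0 implements vectorized engine 2.0, which switches to the Column data format description and avoids the memory usage, serialization, and read/write access overhead caused by maintaining ObDatum. Based on the restructured data format description, this version also reimplements a batch of commonly used operators and expressions: more than 10 operators such as HashJoin, AGGR, HashGroupBy, and Exchange (DTL Shuffle), and more than 20 MySQL expressions that cover relational, logical, and arithmetic operations. Later V4.3.x versions will continue to add new-engine implementations of other operators and expressions to achieve better performance in AP scenarios.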

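  • Materialized views

    V4.3.0 adds the materialized view feature. Materialized views are a key feature for AP business. A materialized view pre-computes and stores the query results of a view, which reduces real-time computation, improves query performance, and simplifies complex query logic. Materialized views are commonly used for quick report generation and data analytics.

    A materialized view stores a query result set to optimize query performance, and its data depends on its base tables. Whenever the data in a base table changes, the data in the materialized view must be updated accordingly to stay in sync. Therefore, this version also introduces a materialized view refresh mechanism with two strategies: complete refresh and incremental refresh. Complete refresh is the more direct approach. Each time a refresh is performed, the system re-executes the query statement of the materialized view, recomputes the full result, and overwrites the existing result data. This approach suits scenarios with relatively small data volumes. Incremental refresh, by contrast, processes only the data that has changed since the last refresh. To make incremental refresh accurate, OceanBase Database implements a materialized view log feature similar to the materialized view log (MLOG) in Oracle. The log tracks the incremental updates of the base table in detail so that the materialized view can be refreshed incrementally and quickly. Incremental refresh is especially suitable for business scenarios with large data volumes and frequent changes.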

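Kernel enhancements

  • Enhanced row estimation system

    As OceanBase Database evolves, the optimizer supports more and more cost estimation methods. For the row estimation of each operator, the optimizer already supports storage-layer estimation, statistics-based estimation, dynamic sampling, and default statistics, but it lacked a clear strategy and complete control over how these methods are used. V4.3.0 restructures the row estimation system. It assigns a priority order of estimation strategies to each scenario and lets you manually intervene in strategy selection by using hints or system variables. This version also enhances the predicate selectivity and number of distinct values (NDV) calculation frameworks to improve the accuracy of cost estimation by the optimizer.

  • Enhanced statistics

    V4.3.0 further improves the statistics feature, the statistics collection performance, and the compatibility and ease of use of statistics. Specifically, this version restructures the offline statistics collection process to improve collection efficiency; optimizes the collection strategy so that index histograms are automatically collected by default and statistics are collected through derivation; ensures transaction consistency for online statistics collection; supports the DBMS_STATS.COPY_TABLE_STATS feature of Oracle for copying statistics; supports the ANALYZE TABLE feature of MySQL with richer syntax; adds a command for canceling statistics collection and richer monitoring of collection progress to improve O&M; and supports parallel deletion of statistics.

  • Adaptive cost model

    In earlier versions, the cost model uses constant parameters measured on internal machines to represent hardware system statistics, and describes the execution overhead of each operator with a series of formulas and these constants. In real business scenarios, hardware environments differ in CPU clock frequency, sequential and random read speeds, and NIC bandwidth, which can introduce cost estimation deviations. Because of these deviations, the optimizer cannot always generate the optimal plan in every business environment. This version optimizes the cost model implementation and lets you collect or set system statistics coefficients by using the DBMS_STATS package so that the cost model adapts to the hardware. It also provides the DBA_OB_AUX_STATISTICS view to display the system statistics coefficients of the current tenant.

  • Fixed session variables for function-based indexes

    When you create a function-based index, a hidden virtual generated column is added to the primary table and defined as the index key of the function-based index. The values of the virtual generated column are then stored in the index table. The results of some built-in system functions depend on session variables: with different session variable values, the same function input produces different results. To improve the stability of generated columns and function-based indexes, this version fixes the dependent session variables in the column schema of the index column or generated column when you create a function-based index or a generated column. Later calculations of the index column or generated column use the fixed values and are not affected by the variable values of the current session. In V4.3.0, the system variables that can be fixed include timezone_info, nls_format, nls_collation, and sql_mode.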

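  • More online DDL scenarios in MySQL mode

    V4.3.0 extends online DDL to more column type change scenarios:

    • Integer type conversion: For a primary key column, an index column, a generated column, a column on which a generated column depends, or a column with a UNIQUE or CHECK constraint, changing the column type to an integer type with a larger value range is changed from offline DDL to online DDL.
    • DECIMAL type conversion: For columns that support the DECIMAL data type, increasing the precision within any one of the intervals [1,9], [10,18], [19,38], and [39,76] without changing the scale is online DDL.
    • BIT and CHAR type conversion: For columns that support the BIT or CHAR data type, increasing the width is online DDL.
    • VARCHAR and VARBINARY type conversion: For columns that support the VARCHAR or VARBINARY data type, increasing the width is online DDL.
    • LOB type conversion: For columns that support LOB data types, changing the type to a LOB type with a larger value range is online DDL, except for upward changes from TINYTEXT or TINYBLOB, which are offline DDL.
    • Conversion between TINYTEXT and VARCHAR: For columns that support the TINYTEXT data type, converting VARCHAR(x) to TINYTEXT is online DDL if x <= 255 and offline DDL otherwise. For columns that support the VARCHAR data type, converting TINYTEXT to VARCHAR(x) is online DDL if x >= 255 and offline DDL otherwise.
    • Conversion between TINYBLOB and VARBINARY: For columns that support the TINYBLOB data type, converting VARBINARY(x) to TINYBLOB is online DDL if x <= 255 and offline DDL otherwise. For columns that support the VARBINARY data type, converting TINYBLOB to VARBINARY(x) is online DDL if x >= 255 and offline DDL otherwise.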
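  • Globally unique client session IDs

    In versions earlier than OBServer V4.3.0 and OceanBase Database Proxy (ODP) V4.2.3, when a client executes the SHOW PROCESSLIST statement through ODP, the client session ID on that ODP node is returned, whereas an expression such as connection_id or a system view returns the server session ID. One client session ID corresponds to multiple server session IDs, so session IDs are hard to unify across the entire link. This makes session information easy to confuse and user sessions hard to manage. This version restructures how client session IDs are generated and maintained. If OBServer is of V4.3.0 or later and ODP is of V4.2.3 or later, the client session ID is returned through every channel: the SHOW PROCESSLIST statement, views such as information_schema.PROCESSLIST and GV$OB_PROCESSLIST, and expressions such as connection_id, userenv('sid')/userenv('sessionid'), and sys_context('userenv','sid')/sys_context('userenv','sessionid'). You can manage client sessions by using the KILL statement in SQL or PL. If OBServer or ODP does not meet the version requirement, the handling method in earlier versions is used.

  • Log stream state machine restructuring

    This version splits the state of a log stream into an in-memory state and a persistent state. The persistent state represents the lifecycle of the log stream. After a node restarts from a crash, the persistent state determines whether the log stream should exist and what its in-memory state should be. The in-memory state is the runtime state of the log stream and represents the overall state of the log stream and the states of its key submodules. Based on the explicit state and state sequence of a log stream, underlying modules can determine which operations on the log stream are safe and whether an ABA-type state change has occurred. The working state behavior of the backup and restore process and the migration process after a log stream restarts from a crash is also optimized. This feature improves the stability of log stream features and strengthens concurrency control over log streams.

  • Tenant cloning

    V4.3.0 adds the tenant cloning feature. In the sys tenant, you can execute the CREATE TENANT new_tenant_name FROM source_tenant_name WITH RESOURCE_POOL [=] resource_pool_name, UNIT [=] unit_config statement to quickly clone a specified tenant. After the cloning job is completed, the new tenant is a standby tenant. You can execute the ALTER SYSTEM ACTIVATE STANDBY TENANT new_tenant_name statement to convert it into a primary tenant that provides services. In the initial state, the new tenant and the source tenant share physical macroblocks, but subsequent data changes and resource usage are isolated by tenant. When you need to run resource-intensive temporary analytics or other high-risk operations on an online tenant, you can perform the analysis or verification on a cloned tenant to avoid affecting the online tenant. You can also use a cloned tenant for disaster recovery: if a misoperation that is hard to recover from occurs in the source tenant, you can use the cloned tenant to roll back data.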
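The DECIMAL and TINYTEXT/VARCHAR rules listed earlier in this section for online DDL in MySQL mode can be expressed as simple predicates. The following is a sketch of the rules exactly as stated in this release note, not the logic of the OceanBase kernel. All function names are hypothetical. For DECIMAL, the release note states only which changes are online, so the predicate returns False for anything it does not cover.

```python
# Illustrative predicates for two of the column type change rules above.
# All names are hypothetical; the thresholds come from the release note.

DECIMAL_PRECISION_INTERVALS = [(1, 9), (10, 18), (19, 38), (39, 76)]


def decimal_change_matches_online_rule(old_p, old_s, new_p, new_s):
    """True if precision grows within a single interval and scale is unchanged."""
    if new_s != old_s or new_p <= old_p:
        return False
    return any(lo <= old_p and new_p <= hi
               for lo, hi in DECIMAL_PRECISION_INTERVALS)


def varchar_to_tinytext_ddl(x):
    """VARCHAR(x) -> TINYTEXT: online if x <= 255, otherwise offline."""
    return "online" if x <= 255 else "offline"


def tinytext_to_varchar_ddl(x):
    """TINYTEXT -> VARCHAR(x): online if x >= 255, otherwise offline."""
    return "online" if x >= 255 else "offline"


within_interval = decimal_change_matches_online_rule(5, 2, 9, 2)    # stays in [1,9]
crosses_interval = decimal_change_matches_online_rule(9, 2, 10, 2)  # [1,9] -> [10,18]
```

The same thresholds apply to the VARBINARY(x) and TINYBLOB pair according to the list above, so the two string helpers could be reused for it.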

MySQL compatibility

  • utf8mb4/utf16_unicode_ci collations

    The Community Edition adds support for the utf8mb4_unicode_ci and utf16_unicode_ci collations.

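Performance improvements

  • OceanBase zlib_lite_1.0 compression algorithm

    V4.3.0 introduces a new data compression algorithm, zlib_lite_1.0. It is based on the Intel In-Memory Analytics Accelerator (Intel® IAA) and makes full use of hardware capabilities to improve database compression and decompression performance while remaining compatible with the traditional zlib library. On machines that support Intel IAA hardware acceleration, zlib_lite_1.0 significantly improves compression and decompression performance. On machines without hardware acceleration, it automatically falls back to software acceleration to ensure cross-platform compatibility.

  • PDML transaction optimization

    At the transaction layer, this version supports parallel commit and log replay and adds partition-level rollback within transaction participants. Compared with earlier V4.x versions, this significantly improves DML execution performance in high-concurrency scenarios.

  • I/O optimization for loading tablet metadata

    OceanBase Database V4.x supports millions of partitions on a single server. Because the metadata of millions of tablets may not fit into memory, V4.x also supports on-demand metadata loading. On-demand loading works at two levels: partitions and subcategories within a partition. Within a partition, metadata is split into several subcategories and stored hierarchically. When a background task needs metadata at a deep level, reading it consumes many I/O operations. This I/O overhead is not a problem on local SSDs but may affect system performance on HDDs and cloud disks. This version aggregates frequently accessed metadata in storage so that accessing it takes only one I/O operation. This greatly reduces the I/O overhead when the system is idle and prevents the I/O overhead of background tasks from affecting foreground query performance. This version also optimizes the metadata loading process during OBServer node restart. Tablet metadata is now loaded in batches at the macroblock granularity, which greatly reduces discrete read I/O and speeds up restart by several times to dozens of times.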

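High availability enhancements

  • Proactive broadcast and refresh of tablet locations

    Earlier versions provide a periodic refresh mechanism for the location cache to ensure that log stream location information is updated in real time and is eventually consistent. However, tablet location information could only be refreshed passively, so SQL retries and read/write errors could occur when the mapping between tablets and log streams changed. V4.3.0 adds proactive broadcast of tablet locations to reduce the SQL retries and read/write errors caused by mapping changes after a transfer. It also provides proactive refresh as a fallback to avoid unrecoverable read/write errors.

  • S3 support for backup and restore

    The backup and restore feature already supports two types of storage media: file storage (NFS) and object storage (OSS and COS). This version adds support for AWS Simple Storage Service (S3). You can use S3 as the destination for log archiving and data backup, and you can perform physical restore from backup data stored in S3.

  • Limitations on memory specification scaling

    This version improves the stability of memory scaling and avoids OOM errors caused by an improper memory_limit value. Two conditions must both be met for the memory_limit setting to take effect on an OBServer node: the reserved memory of the sys500 tenant is not less than its occupied memory, and the value of memory_limit is greater than the sum of the value of system_memory and the memory allocated to resource units. If either condition is not met, setting memory_limit returns no error but does not take effect.

  • Transfer of active transactions

    In the design of standalone log streams, data is managed in units of tablets and logs are managed in units of log streams. Multiple tablets are aggregated into one log stream so that transactions within a single log stream avoid the high cost of two-phase commit. To balance data and traffic among log streams, tablets must be able to migrate flexibly between log streams. This process is called transfer. However, active transactions may still operate on the data during a transfer, and a naive approach would break the ACID properties of transactions. For example, if the data of active transactions on the transfer source cannot be completely migrated to the destination during concurrent execution, the atomicity of the transactions cannot be guaranteed. In earlier versions, active transactions were killed during a transfer to avoid such problems, which affected the normal execution of transactions to some extent. This version supports the transfer of active transactions, which allows active transactions to run concurrently with a transfer and ensures that they are not abnormally rolled back and do not encounter consistency issues because of the transfer.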
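The two conditions on memory_limit described earlier in this section can be checked together as shown below. This is a minimal sketch: the function and parameter names are hypothetical, all values are assumed to be in the same unit (for example, GiB), and how OceanBase derives the reserved memory of the sys500 tenant is not covered by this release note, so it is taken as an input.

```python
# Illustrative check of the two memory_limit conditions; hypothetical names.

def memory_limit_takes_effect(memory_limit, system_memory, unit_memory_allocated,
                              sys500_reserved, sys500_in_use):
    """Return True if a new memory_limit value would take effect.

    Condition 1: reserved memory of the sys500 tenant >= its occupied memory.
    Condition 2: memory_limit > system_memory + memory allocated to resource units.
    If either condition fails, the release note says that no error is returned
    and the setting silently does not take effect.
    """
    reserve_ok = sys500_reserved >= sys500_in_use
    capacity_ok = memory_limit > system_memory + unit_memory_allocated
    return reserve_ok and capacity_ok


ok = memory_limit_takes_effect(64, 8, 48, sys500_reserved=6, sys500_in_use=4)
too_small = memory_limit_takes_effect(56, 8, 48, sys500_reserved=6, sys500_in_use=4)
```

Because a rejected setting produces no error, a check of this kind before changing memory_limit, followed by verifying the effective value afterwards, may help avoid surprises.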

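Resource usage optimization

  • MINIMAL mode for transaction logs

    This version restructures the MINIMAL mode of transaction logs, which optimizes and completes the MINIMAL implementation of earlier versions and improves its stability. After MINIMAL mode is enabled, the volume of clogs generated by statements such as UPDATE and DELETE drops significantly, which considerably reduces the resource overhead of log storage, archiving, and transmission. This mode is especially suitable when cross-city network bandwidth in a private cloud or cloud disk write bandwidth in a public cloud is limited. In V4.3.0 Beta, MINIMAL mode is disabled by default because OMS has not been adapted to it yet. If you do not need to synchronize incremental data by using tools, you can enable it by setting the system variable binlog_row_image to MINIMAL.

  • Memory throttling mechanism

    Before V4.x, few modules relied on freezes and minor compactions to release memory, and MemTables accounted for the largest share. Therefore, earlier versions set an upper limit on MemTable memory usage and used throttling logic to keep the system running as smoothly as possible as memory usage approached the limit, which prevented writes from stopping because memory was suddenly exhausted. Since V4.x, more modules that rely on freezes and minor compactions to release memory, such as the transaction data module, have been introduced. This version provides finer-grained control over the memory usage of multiple modules. It adds memory upper limits for the TxData and MDS modules, which share memory space with MemTables. When the accumulated memory usage reaches Tenant memory × _tx_share_memory_limit_percentage% × writing_throttling_trigger_percentage%, overall throttling is triggered. This version also supports time-based triggering of freezes and minor compactions for transaction data tables. By default, a transaction data table is frozen once every 1,800 seconds to reduce the memory usage of the transaction data module.

  • Space optimization for temporary DDL results

    Many DDL processes temporarily store intermediate results in materialized structures. Two typical examples are as follows:

    1. When an index is created, data scanned from the data table must be sorted before it is inserted into the index table. If memory is insufficient during sorting, the data currently in memory is temporarily stored in a materialized structure to free memory for subsequent scans, and the data in these materialized structures is finally merge-sorted. This approach is especially effective when memory is limited but requires additional disk space.
    2. In bypass import into columnar tables, the system first stores the data to be inserted into each column group in materialized structures, and then reads the data from these structures when it inserts data into each column group. In the sort operator, these materialized structures store the intermediate data needed for external sorting. During column group insertion, they cache data to avoid the extra overhead of repeated table scans. Although this approach reduces the performance loss of repeated scans, the temporary files also increase disk space usage.

    To address these issues, this version optimizes the data flow of DDL processing. First, it removes unnecessary redundant structures to simplify the data flow. Second, it encodes and compresses temporary results before they are written to disk. These improvements benefit both scenarios: they significantly reduce the disk space occupied by temporary results during DDL operations and use storage resources more efficiently.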
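The overall throttling trigger described earlier in this section for the memory throttling mechanism is a product of two percentages applied to tenant memory. The sketch below works through the arithmetic. The function name is hypothetical, and the percentage values in the example are arbitrary inputs chosen for illustration, not defaults taken from this release note.

```python
# Worked example of:
#   tenant memory × _tx_share_memory_limit_percentage% × writing_throttling_trigger_percentage%
# The function name and the sample percentages are assumptions for illustration.

GIB = 1024 ** 3


def throttling_trigger_bytes(tenant_memory_bytes, tx_share_limit_pct, trigger_pct):
    """Accumulated memory usage (bytes) at which overall throttling starts."""
    shared_limit = tenant_memory_bytes * tx_share_limit_pct // 100
    return shared_limit * trigger_pct // 100


# Hypothetical 10 GiB tenant with a 50% shared limit and a 60% trigger percentage.
trigger = throttling_trigger_bytes(10 * GIB, 50, 60)
trigger_gib = trigger / GIB
```

With these sample inputs, the MemTable, TxData, and MDS modules together can use 3 GiB before overall throttling is triggered.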

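Ease-of-use improvements

  • Index usage monitoring

    Indexes are often created to optimize query performance. However, as a table stays in use and business scenarios and operators multiply, more and more indexes tend to be created. Unused indexes waste storage space and increase the overhead of DML operations. You need to keep observing index usage and drop useless indexes to reduce the system load, but useless indexes are hard to identify manually. Therefore, V4.3.0 adds the index usage monitoring feature. You can enable this feature and set a sampling method. In a user tenant, usage information of indexes that meet the rules is recorded in memory and flushed to an internal table every 15 minutes. You can query the DBA_INDEX_USAGE view to check whether the indexes on a table are referenced and then drop useless indexes to free up space.

  • Certificate management for RPC security authentication

    When RPC authentication is enabled for a cluster, if a client such as the arbitration service, a primary or standby database, or OBCDC needs to access the cluster, you previously had to place the root CA certificate of the client in the deployment directory of each node in the cluster and then complete related configurations, which is complicated. V4.3.0 adds internal certificate management. The DBMS_TRUSTED_CERTIFICATE_MANAGER system package is provided in the sys tenant for adding, deleting, and modifying root CA certificates trusted by the OBServer cluster. The DBA_OB_TRUSTED_ROOT_CERTIFICATE view is also added in the sys tenant to display the list of client root CA certificates added to the OBServer cluster and information such as their expiration time.

  • Parameter resetting

    To reset a modified parameter to its default value, you previously had to look up the default value and then set it manually, which is inconvenient. This version adds the ALTER SYSTEM [RESET] parameter_name [SCOPE = {MEMORY | SPFILE | BOTH}] {TENANT [=] 'tenant_name'} syntax for resetting a parameter to its default value. The default value is taken from the node that executes the statement. In the sys tenant, you can reset cluster-level parameters or the parameters of a specified user tenant. In a user tenant, you can reset only the parameters of that tenant. In OBServer, the SCOPE option makes no difference in implementation: for a parameter that takes effect statically, only the default value is written to disk and the value in memory is not updated; for a parameter that takes effect dynamically, the value in memory is updated and the default value is written to disk.

  • Refined parameter data types

    In parameter-related views such as [G]V$OB_PARAMETERS and in the return result of the SHOW PARAMETERS statement, this version shows refined parameter data types in the data_type column. For example, the type of log_disk_size is CAPACITY, the type of rpc_port is INT, and the type of devname is STRING. Peripheral tools can display and restrict parameters based on their types.

  • LOB INROW threshold configuration

    Currently, LOB data of 4 KB or smaller is stored in INROW mode, that is, in the row, and LOB data larger than 4 KB is stored in the LOB auxiliary table. In some scenarios, INROW storage, in which data is written and read as a whole, performs better than auxiliary table storage. Therefore, this version supports dynamic configuration of the LOB storage mode. You can dynamically adjust the INROW size based on your business needs, as long as the row-level storage limit is not exceeded.

  • Client-side local import

    V4.3.0 adds the client-side local import feature (LOAD DATA LOCAL INFILE), which imports local files through streaming file processing and enriches the existing data file import methods. With this feature, developers can import local files for testing without uploading them to the server or an object storage service, which improves efficiency when importing small amounts of data.

    Notice

    To use the client-side local import feature, make sure that the following requirements are met:

    1. The version of OBClient is V2.2.4 or later.
    2. The version of ODP is V3.2.4 or later, if ODP is used for connection to OceanBase Database. If you directly connect to an OBServer node, ignore this requirement.
    3. The version of OceanBase Connector/J is V2.4.8 or later, if Java and OceanBase Connector/J are used.

    Note

    • You can directly use a MySQL client or a native MariaDB client of any version.
    • The SECURE_FILE_PRIV variable controls the privileges of OBServer for accessing paths on the server. It does not affect the client-side local import feature and therefore does not need to be specified.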

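Compatibility changes

Product behavioral changes

Change Description
Client session IDs change from unique within an ODP node to globally unique in the OceanBase cluster. In versions earlier than OBServer V4.3.0 and ODP V4.2.3, when a client executes the SHOW PROCESSLIST statement through ODP, the client session ID on that ODP node is returned, whereas an expression such as connection_id or a system view returns the server session ID. One client session ID corresponds to multiple server session IDs, so session IDs are hard to unify across the entire link. This makes session information easy to confuse and user sessions hard to manage. This version restructures how client session IDs are generated and maintained. If OBServer is of V4.3.0 or later and ODP is of V4.2.3 or later, the client session ID is returned through every channel: the SHOW PROCESSLIST statement, views such as information_schema.PROCESSLIST and GV$OB_PROCESSLIST, and expressions such as connection_id, userenv('sid')/userenv('sessionid'), and sys_context('userenv','sid')/sys_context('userenv','sessionid'). You can manage client sessions by using the KILL statement in SQL or PL.
If OBServer or ODP does not meet the version requirement, the handling method in earlier versions is used.
Limitations are imposed on memory_limit scaling. This version improves the stability of memory scaling and avoids OOM errors caused by an improper memory_limit value. Two conditions must both be met for the memory_limit setting to take effect on an OBServer node: the reserved memory of the sys500 tenant is not less than its occupied memory, and the value of memory_limit is greater than the sum of the value of system_memory and the memory allocated to resource units. If either condition is not met, setting memory_limit returns no error but does not take effect.
The zlib compression algorithm is no longer used for storage. In V4.2.0, the zlib compression algorithm is prohibited for new tables but is still supported for existing tables. In V4.3.0, the storage layer no longer supports the zlib compression algorithm. If the zlib compression algorithm is used before the upgrade, you must change the compression algorithm of the data tables or choose not to compress them. The zlib compression algorithm is also prohibited for the transmission of clogs and TableAPI data.
Limitations on using archive_lag_target are refined. V4.3.0 refines the limitations on using the archive_lag_target parameter:
  1. If no archive media is specified, when you modify this parameter, a message is displayed indicating that the default value of this parameter cannot be changed because no archive media is specified.
  2. If S3 is specified as the archive media, the minimum value of this parameter is 60 seconds. An error is returned if you specify a smaller value.
  3. If OSS, NFS, or COS is specified as the archive media, you can set this parameter to any value within the value range.
  4. If OSS, NFS, or COS is specified as the archive media and the current value of this parameter is smaller than 60 seconds, an error is returned by the statement that changes the archive media to S3.
The max_syslog_file_count parameter limits the number of system logs of all types as a whole. To reduce the risk that the log disk is used up after the end-to-end diagnostic feature is enabled, max_syslog_file_count now limits the total number of system logs of all types, instead of the number of system logs of each type. When the limit is reached, this version evicts log files based on the first in, first out (FIFO) strategy.
The data_type column returned by the SHOW PARAMETERS statement shows refined parameter data types. The data_type column in the return result of the SHOW PARAMETERS statement shows the specific data type of each parameter. Its default value is changed from NULL to UNKNOWN.
The default values of MAX_IOPS and MIN_IOPS of resource units are changed. In earlier versions, if neither MIN_IOPS nor MAX_IOPS is specified, their values are automatically calculated based on the value of MIN_CPU. One CPU core corresponds to 10,000 IOPS, namely, MAX_IOPS = MIN_IOPS = MIN_CPU * 10000.
In this version, if neither MIN_IOPS nor MAX_IOPS is specified, the default IOPS is changed to INT64_MAX, which means that IOPS resources are not limited.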
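The four archive_lag_target rules in the table above can be read as two validation checks: one when the parameter is modified and one when the archive media is switched. The following is a sketch of those rules only. The function names, media strings, and error texts are hypothetical, and the check against the general value range of the parameter is omitted because this release note does not state the range.

```python
# Illustrative validation of the archive_lag_target rules; hypothetical names.
from typing import Optional

S3_MIN_LAG_SECONDS = 60  # minimum value stated in the release note for S3


def check_set_lag_target(archive_media: Optional[str], new_value_s: int) -> Optional[str]:
    """Return an error message, or None if the new value is allowed."""
    if archive_media is None:
        # Rule 1: no archive media specified.
        return "no archive media is specified; the default value cannot be changed"
    if archive_media == "S3" and new_value_s < S3_MIN_LAG_SECONDS:
        # Rule 2: S3 requires at least 60 seconds.
        return "archive_lag_target must be at least 60 seconds for S3"
    # Rule 3: OSS, NFS, and COS accept any value within the value range.
    return None


def check_switch_media(new_media: str, current_value_s: int) -> Optional[str]:
    """Rule 4: switching to S3 fails if the current value is below 60 seconds."""
    if new_media == "S3" and current_value_s < S3_MIN_LAG_SECONDS:
        return "current archive_lag_target is below 60 seconds; cannot switch to S3"
    return None


rejected = check_set_lag_target("S3", 30)
accepted = check_set_lag_target("OSS", 30)
```

Under this reading, raising archive_lag_target to at least 60 seconds before switching the archive media from OSS, NFS, or COS to S3 avoids the error in rule 4.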

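View changes

View Change type Description
DBA_OB_TRUSTED_ROOT_CERTIFICATE New Displays the list of client root CA certificates trusted by the OBServer cluster, as well as information about the certificates, such as the expiration time. You can query this view in the sys tenant.
CDB/DBA_MVIEW_LOGS New Displays information about materialized view logs. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVIEWS New Displays information about materialized views. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_STATS_SYS_DEFAULTS New Displays the system-level default values of refresh history statistical attributes for materialized views. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_STATS_PARAMS New Displays the refresh statistical attributes associated with each materialized view. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_RUN_STATS New Displays information about each refresh run of materialized views. Each run is identified by REFRESH_ID. The information includes the timing statistics of the run and the parameters specified for the run. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_STATS New Displays the basic timing statistics of materialized view refreshes. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_CHANGE_STATS New Displays the data changes in the base tables involved in the refresh runs of all materialized views. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_MVREF_STMT_STATS New Displays information about refresh statements. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_INDEX_USAGE New Displays index access information. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
DBA_OB_CLONE_PROGRESS New Records information about ongoing tenant cloning jobs. You can query this view in the sys tenant.
DBA_OB_CLONE_HISTORY New Records information about completed tenant cloning jobs. You can query this view in the sys tenant.
CDB/DBA_OB_AUX_STATISTICS New Displays the auxiliary statistics of each tenant. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
[G]V$OB_TABLET_COMPACTION_HISTORY Modified The KEPT_SNAPSHOT column is added to show the multi-version retention timestamps. The MERGE_LEVEL column is added to show information about the reuse of macroblocks and microblocks. The width of the COMMENTS column is adjusted.
[G]V$OB_PARAMETERS Modified The DATA_TYPE column shows the specific data type of each parameter. The default value of the DATA_TYPE column is changed from NULL to UNKNOWN.
[G]V$OB_PROCESSLIST Modified The USER_CLIENT_PORT column is added to show the port number of the client.
CDB/DBA_OB_RECOVER_TABLE_JOBS New Displays information about table-level restore jobs. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_OB_RECOVER_TABLE_JOB_HISTORY New Displays the history of table-level restore jobs. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_OB_IMPORT_TABLE_JOBS New Displays information about cross-tenant import jobs. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_OB_IMPORT_TABLE_JOB_HISTORY New Displays the history of cross-tenant import jobs. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_OB_IMPORT_TABLE_TASKS New Displays information about table-level cross-tenant import tasks. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
CDB/DBA_OB_IMPORT_TABLE_TASK_HISTORY New Displays the history of table-level cross-tenant import tasks. You can query the CDB view only in the sys tenant, and the DBA view in all tenants.
[G]V$OB_SQL_AUDIT Modified The PLSQL_EXEC_TIME column is added to show the PL execution time in μs, excluding the SQL execution time.
[G]V$OB_LS_SNAPSHOTS New Displays information about the log stream snapshots that physically exist in resource units.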

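Parameter changes

Parameter/System variable Change type Description
enable_rpc_authentication_bypass New A cluster-level parameter that specifies whether to allow OMS to connect to a cluster without undergoing RPC security authentication when RPC security authentication is enabled for the OBServer node.
default_compress_func Modified The zlib_lite_1.0 value is added, which specifies to use a zlib compression algorithm with higher performance in an environment that supports hardware acceleration. The zlib_1.0 value is deleted. The zlib_1.0 compression algorithm is prohibited for new tables.
large_query_threshold Modified The value range is changed from [1ms, +∞) to [0ms, +∞). The value 0 specifies to disable the large query identification feature.
default_table_store_format New A tenant-level parameter that specifies the default format for a primary table created in a user tenant. The default value is row, which means that a row-based table is created if with column group is not specified during table creation. You can change the value to column (a pure columnar table by default) or compound (a redundant table with both row-based and columnar storage by default) as needed.
server_cpu_quota_min Modified The effective mode is changed from effective upon a restart to effective immediately.
server_cpu_quota_max Modified The effective mode is changed from effective upon a restart to effective immediately.

Function/PL package changes

Function/PL package Change type Description
ob_transaction_id New The ob_transaction_id() built-in function is added to query the transaction ID of the current session. If the session is not in an active transaction, 0 is returned.
DBMS_TRUSTED_CERTIFICATE_MANAGER New A PL system package added in the sys tenant. It contains three subprograms, ADD_TRUSTED_CERTIFICATE, DELETE_TRUSTED_CERTIFICATE, and UPDATE_TRUSTED_CERTIFICATE, which are used for adding, deleting, and modifying client root CA certificates trusted by the OBServer cluster. This package is used when RPC authentication is enabled.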

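Syntax changes

  • The DDL syntax for pre-aggregating column indexes in SSTables is added.
  • The statements for cloning tenants are added.
  • The syntax for resetting parameters is added.
  • The columnar storage and columnar storage indexing syntaxes are added.
  • Syntaxes related to materialized views are added.
  • The syntax for performing partition-level major compactions is added.

Recommended versions of tools

The following table lists the recommended versions of platform tools for OceanBase Database V4.3.0_CE.

Component Version Remarks
ODP V4.2.3 -
OCP V4.2.2 -
OBD V2.7.0 -
ODC V4.2.3 BP1 -
OBCDC V4.3.0 -
OMS V4.2.3 -
OBClient V2.2.3 -
LibOBClient V2.2.3 -

Upgrade notes

  • To use V4.3.0 for the first time, you need to create a cluster. Smooth upgrade from an earlier version, such as V1.x, V2.x, V3.x, V4.1, or V4.2.x, to V4.3.0 Beta is not supported for now. If you need to upgrade, use logical import and export.
  • Upgrade from V4.3.0 Beta to later V4.3.x versions will be supported later.
  • As the versions evolve, an upgrade path from V4.2.x to V4.3.x will be added.

Considerations

  • We recommend that you set the maximum concurrency for bypass import by using the following formula: Maximum concurrency = Tenant memory * 0.001 / 2 MB. If the concurrency exceeds this value, the memory for temporary files may be insufficient. When you use bypass import, we recommend that you split large files into multiple smaller ones. Importing multiple files improves the import efficiency.
  • OMS has not been adapted to the MINIMAL mode yet. When you use tools for incremental data synchronization, you are not allowed to set binlog_row_image to MINIMAL for now. The default value is FULL.

Acknowledgments

In this release, special thanks go to our community partners for their contributions: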