Releases: 4paradigm/OpenMLDB
v0.9.0
Breaking Changes
- Upgrade SQLAlchemy to 2.0.27; SQLAlchemy 1.x is no longer supported (#3805 @yht520100)
- Correct the semantics of first_value to be compatible with ANSI SQL (#3861 @aceforeverd)
- Set the default execute_mode to online; it was offline before 0.9.0 (#3862 @aceforeverd)
- Client-side authentication is deprecated; authentication can now be enabled on the server side (#3835 #3885 @oh2024)
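For orientation on the first_value change: ANSI SQL defines first_value as the value taken from the first row of the window frame under the window's ordering. The Python sketch below illustrates only that semantic. The helper name and the row layout are hypothetical, and it does not reproduce the OpenMLDB engine or its behaviour before 0.9.0.

```python
from typing import Any, Dict, List


def first_value(frame: List[Dict[str, Any]], order_key: str, value_key: str) -> Any:
    """Return value_key from the first row of the frame, ordered ascending by order_key.

    Hypothetical helper for illustration; not an OpenMLDB API.
    """
    if not frame:
        return None  # an empty frame yields NULL in SQL
    ordered = sorted(frame, key=lambda row: row[order_key])
    return ordered[0][value_key]


# Rows of one window frame, deliberately out of order.
rows = [
    {"ts": 3, "price": 30},
    {"ts": 1, "price": 10},
    {"ts": 2, "price": 20},
]

print(first_value(rows, "ts", "price"))  # 10, the value at the smallest ts
```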
Features
- Support offline building docker image(#3773 #3787 @QiChenX, #3778 @aceforeverd)
- Support MySQL protocol with new module OpenM(ysq)LDB(#3800 @tobegit3hub #3816 #3820 #3823 #3824 #3831 @yangwucheng)
- Support map datatype for SQL engine(#3841 #3847 @aceforeverd)
- Support TiDB backend for online and offline storage(#3815 #3839 @yht520100)
- Kafka connector supports string timestamps and partial inserts(#3834 @vagetablechicken)
- Support arbitrary spark distribution and remove dependency of OpenMLDB Spark(#3849 @tobegit3hub)
- Support insert function in offline mode(#3854 @Matagits)
- Support feature signature with SQL functions(#3877 @wyl4pd)
- Support request mode in raw SQL(#3874 @aceforeverd)
Bug Fixes
- Fix delete with same ts(#3780 @dl239)
- Fix '\x00' Issue in Python SDK Precompiled SQL Inserts(#3788 @yht520100)
- Fix import spark config when creating like hive(#3792 @vagetablechicken)
- Fix select constant returning an empty value in offline mode (#3825 @Matagits)
- Fix show and load sql to print result set for notebook magic function(#3856 @tobegit3hub)
- Fix TTL merge for different types(#3859 @vagetablechicken)
- Fix ddl parser when getting duplicated column keys(#3873 @vagetablechicken)
- Fix initializing nameserver bug when calling zk RegisterName(#3869 @oh2024)
Testing
- Set NPROC in integration test(#3782 @dl239)
- Support map data type in yaml testing framework(#3765 @aceforeverd)
- Add automatic table cleanup after go sdk tests(#3799 @oh2024)
- Fix sql_cmd_test and no impl for MakeMergeNode(#3829 @aceforeverd)
- Add query performance benchmark(#3855 @gaoboal)
v0.8.5
Features
- Support Iceberg as an offline storage (#3737 @vagetablechicken)
- Support UNION ALL statement (#3590 #3653 @aceforeverd)
- Support SELECT ... INTO OUTFILE to OpenMLDB online tables (#3616 @tobegit3hub)
- Support LAST JOIN and WINDOW without ORDER BY in offline mode (#3619 @aceforeverd)
- Support CREATE/ALTER/DROP USER statement (#3678 #3745 #3747 @dl239, #3744 @tobegit3hub)
- Support specifying Spark configurations in the SDK (#3613 @tobegit3hub)
- INSERT returns failure if the server-side memory usage exceeds the specified limit (#3631 @dl239)
- Add a new interface for SQLs to DAG (#3630 @aceforeverd)
- If the deployed SQL contains LEFT JOIN, the indexes will be created automatically. (#3667 @aceforeverd)
- Support automatic deletion of logs (#3704 #3736 #3706 @dl239)
- Support absandlat/absorlat TTL type for disktable (#3716 @dl239)
- Optimize error messages for insert failures (#3725 @vagetablechicken)
- Improve the documents (#3617 #3519 #3690 #3699 @vagetablechicken, #3612 @dl239, #3609 #3672 #3687 @aceforeverd, #3649 #3570 #3569 @TanZiYen @Elliezza, #3665 @DrDub, #3585 #3584 #3579 #3578 #3574 #3573 #3552 #3539 #3488 #3477 #3475 #3586 #3470 #3474 #3568 #3583 #3564 #3764 @TanZiYen, #3688 #3697 #3753 #3721 #3731 #3739 #3754 #3720 #3756 #3762 #3752 #3757 #3719 @Elliezza, #3075 @Elliezza @tobegit3hub, #3710 @tobegit3hub)
- Other minor features (#3623 #3636 @aceforeverd, #3651 @tobegit3hub, #3641 #3692 @vagetablechicken, #3582 #3702 @dl239, #3674 @lqy222)
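As a reminder of what UNION ALL means in standard SQL: it concatenates the two result sets and keeps duplicate rows, whereas plain UNION removes them. The sketch below uses Python's built-in sqlite3 module purely as a stand-in engine. The table names are hypothetical, and OpenMLDB-specific syntax or restrictions are not shown.

```python
import sqlite3

# In-memory SQLite database used only to illustrate standard SQL semantics.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER)")
conn.execute("CREATE TABLE t2 (id INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO t2 VALUES (?)", [(2,), (3,)])

# UNION ALL keeps the duplicate row id=2; UNION removes it.
union_all = conn.execute("SELECT id FROM t1 UNION ALL SELECT id FROM t2").fetchall()
union = conn.execute("SELECT id FROM t1 UNION SELECT id FROM t2").fetchall()

print(len(union_all), len(union))  # 4 3
conn.close()
```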
Bug Fixes
- Executing offline tasks will consume an excessive number of ZooKeeper connections (#3642 @dl239)
- There is no automatic reconnection after ZooKeeper disconnection in SDK. (#3656 #3668 @vagetablechicken)
- The FlexibleRowBuilder will throw a NullPointerException if set with a null value (#3649 @dl239)
- If the length of strings in the imported data exceeds 255, it may throw a BufferOverflowException. (#3729 @ljwh)
- If a table contains a large amount of data, querying the data may still be possible after executing TRUNCATE (#3677 @dl239)
- After deleting data, it can still be retrieved from other indexes (#3693 @dl239)
- Delete dirty data when insertion fails (#3681 @dl239)
- When there are no tables, getting databases using GetAllDbs fails. (#3742 @vagetablechicken)
- Adding a deleted index fails if the index name is different from the previous name (#3635 @dl239)
- Other minor bug fixes (#3638 #3654 #3717 #3726 #3743 @vagetablechicken, #3607 #3775 @dl239, #3640 @tobegit3hub, #3686 #3735 #3738 #3740 @aceforeverd, #3759 @yangwucheng)
Code Refactoring
Experimental built artifacts for Linux AArch64
49b691a8a2dc7175823e9fb808e731a8999896cc3ab819cfd32f1ab10c299cde openmldb-0.8.5-linux-gnu-aarch64.tar.gz
v0.8.4
Features
- Support new SQL statements SHOW CREATE TABLE, TRUNCATE and [Alpha] LEFT JOIN (#3500 #3542 @dl239, #3576 @aceforeverd)
- Support specifying the compression option during table creation (#3572 @dl239)
- Optimize the insertion performance of Java SDK (#3525 @dl239)
- Support defining a window without ORDER BY clause (#3554 @aceforeverd)
- Support the authentication for Zookeeper connection (#3581 @dl239)
- [Alpha] Support LAST JOIN on a window clause (#3533 #3565 @aceforeverd)
- Enhance the monitoring module (#3588 @vagetablechicken)
- Support the date before 1900 in datediff (#3499 @aceforeverd)
- Enhance the diagnostic tool (#3559 @vagetablechicken)
- Check the status of table on CLI startup (#3506 @vagetablechicken)
- Upgrade the version of brpc to 1.6.0 (#3415 #3557 @aceforeverd)
- Improve the documents (#3517 @dl239, #3520 #3523 @vagetablechicken, #3467 #3468 #3535 #3485 #3478 #3472 #3486 #3487 #3537 #3536 @TanZiYen)
- Other minor features (#3587 @vagetablechicken, #3512 @dl239)
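To make the pre-1900 datediff item concrete, here is a small Python sketch of a day difference that crosses the 1900 boundary. The function name mirrors the SQL built-in, but the argument order (end date first) and the date-typed signature are assumptions for illustration, not a statement of OpenMLDB's exact interface.

```python
from datetime import date


def datediff(end: date, start: date) -> int:
    """Number of days from start to end (negative if end is earlier).

    Illustrative stand-in; argument order is an assumption.
    """
    return (end - start).days


# Both a pre-1900 and a post-1900 date take part in the calculation.
print(datediff(date(1900, 1, 1), date(1899, 12, 25)))  # 7
```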
Bug Fixes
- The SQL compiling fails if there is LAST JOIN in WINDOW UNION statement in the request mode. (#3493 @aceforeverd)
- Tablet may crash after deleting an index in certain cases (#3561 @dl239)
- There are some syntax errors in maintenance tools (#3545 @vagetablechicken)
- Updating TTL fails if the deployment SQL contains multiple databases (#3503 @dl239)
- Other minor bug fixes (#3518 #3567 #3604 @dl239, #3543 @aceforeverd, #3521 #3580 @vagetablechicken, #3594 #3597 @tobegit3hub)
Code Refactoring
v0.8.3
Features
- Optimize the performance of Java SDK (#3445 @dl239)
- Optimize the writing performance and significantly reduce the memory consumption of the Spark connector (#3443 @vagetablechicken)
- Support loading data from HIVE with customized SQLs (#3380 @tobegit3hub)
- Improve the output message for SDK and CLI (#3384 @vagetablechicken, #3434 #3494 @dl239)
- Support new built-in functions json_array_length and get_json_object (#3414 #3429 @aceforeverd)
- Add new options RANGE_BIAS and ROWS_BIAS for the DEPLOYMENT statement (#3456 @vagetablechicken)
- Support const project in online mode (#3376 @aceforeverd)
- Support SHOW DEPLOYMENT and DROP DEPLOYMENT with a database name (#3353 @emo-coder)
- Support inheriting environment variables for Spark (#3450 @vagetablechicken)
- Support deleting HDFS files when dropping tables (#3369 @tobegit3hub)
- Enhance the diagnostic tool (#3330 @zhangziheng01233)
- Enhance the operation tool (#3455 @dl239)
- Use the timeout value set by a user only if it is greater than the default value (#3484 @vagetablechicken)
- Remove the sync tool from the demo docker image (#3390 @dl239)
- Improve the documents (#3383 #3392 #3410 @vagetablechicken, #3175 #3447 #3463 @TanZiYen, #3436 @aceforeverd, #3451 @wangerry, #3453 #3462 #3498 @dl239)
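For readers unfamiliar with the two JSON built-ins named above, the following Python sketch approximates what functions with these names conventionally do: count the elements of the outermost JSON array, and extract a value by a path such as $.a.b. Everything here is an assumption for illustration. The simplified path grammar (dotted keys only), the None results for invalid input, and the Python return types are not taken from OpenMLDB's implementation.

```python
import json
from typing import Any, Optional


def json_array_length(text: str) -> Optional[int]:
    """Length of the outermost JSON array, or None if text is not a JSON array."""
    try:
        value = json.loads(text)
    except (TypeError, ValueError):
        return None
    return len(value) if isinstance(value, list) else None


def get_json_object(text: str, path: str) -> Any:
    """Look up a value by a simplified path of the form "$.key1.key2".

    Only dotted object keys are handled here (an assumption, not the full grammar).
    """
    if not path.startswith("$"):
        return None
    try:
        value = json.loads(text)
    except (TypeError, ValueError):
        return None
    for key in filter(None, path[1:].split(".")):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


print(json_array_length("[1, 2, [3, 4]]"))             # 3 (nested array counts once)
print(get_json_object('{"a": {"b": "x"}}', "$.a.b"))   # x
```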
Bug Fixes
- CREATE TABLE LIKE HIVE returns success even if a database is not found (#3379 @emo-coder)
- If an error occurred when executing DROP FUNCTION, the function cannot be deleted again. (#3362 @vagetablechicken, #3441 @dl239)
- The results of SHOW JOBS are not sorted by id (#3371 @emo-coder)
- NameServer will crash if creating system tables fails. (#3432 @dl239)
- CREATE INDEX may fail if the previous CREATE INDEX command on the same table has not finished. (#3393 @dl239)
- The result of SELECT on the deleted index column is empty (#3426 @dl239)
- Other minor bug fixes (#3391 #3408 @vagetablechicken, #3386 #3427 #3459 @dl239, #3367 #3495 @aceforeverd)
Code Refactoring
#3397 @emo-coder, #3411 @vagetablechicken, #3435 @aceforeverd, #3473 @lqy222
Breaking Changes
v0.8.2
Features
- Enhance the delete statement (#3301 #3374 @dl239)
- Enhance the C++ SDK (#3334 @vagetablechicken)
- Support the new option IF EXISTS in the DROP TABLE/DATABASE statement (#3348 @emo-coder)
- Improve the documents (#3344 #3152 #3355 #3360 @vagetablechicken, #3341 @aceforeverd, #3343 #3372 @dl239, #2968 @selenachenjingxin)
- Upgrade the version of Kafka connector to 10.5.0-SNAPSHOT-0.8.1 (#3365 @vagetablechicken)
Bug Fixes
- Loading external UDF libraries fails in certain environments when running an offline task (#3350 #3359 @vagetablechicken)
- Loading data fails with Hive soft links (#3349 @vagetablechicken)
- Insertion succeeds but with an invalid timestamp (#3313 @aceforeverd)
- The bool type is not properly packed in APIServer. (#3366 @vagetablechicken)
- The table can be created successfully when there are duplicated indexes. (#3306 @dl239)
Breaking Changes
- The field Offline_deep_copy will be replaced by Offline_symbolic_paths in the result of SHOW TABLE STATUS (#3349).
v0.8.1
Features
- Support a new SQL statement ALTER TABLE ... ADD/DROP OFFLINE_PATH ... (#3286 @aceforeverd, #3323 @tobegit3hub)
- Support deploying SQLs in which the involved tables have data but without pre-aggregation defined (#3288 @dl239)
- Support new built-in functions top_n_value_ratio_cate, top_n_key_ratio_cate, list_except_by_key and list_except_by_value (#3329 @aceforeverd)
- Add a new SDK API to merge multiple SQLs for deployment (#3297 @vagetablechicken)
- Support mapping topic tables in the Kafka connector (#3282 @vagetablechicken)
- Support deploying the Kafka connector in Docker and Kubernetes (#3276 @tobegit3hub)
- Support fetching jobs from NameServer (#3293 @dl239)
- Enhance the diagnostic tool (#3224 #3208 #3285 #3258 #3303 @zhangziheng01233)
- Enhance the SELECT INTO ... statement (#2529 @vagetablechicken)
- Improve the documents (#3308 @aceforeverd, #3333 @TanZiYen)
- Other minor features (#3312 #3314 @vagetablechicken, #3298 @aceforeverd)
Bug Fixes
- SQL deployment fails in some cases (#3328 @vagetablechicken)
- Creating UDFs/UDAFs may fail as the udf directory does not exist by default. (#3326 @vagetablechicken)
- Other minor bug fixes (#3281 #3284 @vagetablechicken)
Code Refactoring
v0.8.0
Features
- Add a new synchronization tool to automatically synchronize data from online storage to offline storage (#3256 @vagetablechicken)
- Support the new built-in functions var_samp, var_pop, entropy, earth_distance, nth_value_where and add_months (#3046 #3193 @aceforeverd)
- Support batch read for openmldb-spark-connector (#3070 @tobegit3hub)
- [Alpha] Support Kubernetes as a TaskManager backend for the offline engine (#3147 #3157 #3185 @tobegit3hub)
- Support LAST JOIN over WHERE clause (#3134 @aceforeverd)
- Support LAST JOIN in WINDOW UNION clause (#3205 @aceforeverd)
- Support specifying the number of decimal places as the second parameter of the function round (#3221 @aceforeverd)
- Support Amazon S3 as an offline data source (#3229 #3261 @tobegit3hub)
- Add the new option SKIP_INDEX_CHECK to skip index checking when deploying SQLs (#3241 @dl239)
- Support symbolic paths for offline table (#3235 @tobegit3hub)
- Improve the documents (#3104 #2993 @selenachenjingxin, #3113 #3118 #3239 @tobegit3hub, #3150 #3184 #3237 #3255 @aceforeverd, #3160 #3195 #3197 #3223 @lumianph, #3192 #3215 @haseeb-xd, #3201 #3220 #3232 #3236 #3254 @vagetablechicken, #3213 @alexab612, #3189 #3199 @TanZiYen)
- Other minor features (#3115 #3143 #3182 @tobegit3hub, #2818 #3123 @aceforeverd, #3128 #3127 @dl239)
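The names var_samp and var_pop conventionally denote sample variance (divide by n - 1) and population variance (divide by n). The sketch below shows that distinction with Python's standard statistics module. It illustrates the usual definitions only and assumes, without confirmation from this changelog, that the OpenMLDB built-ins follow them.

```python
import statistics

values = [1.0, 2.0, 3.0, 4.0]

# Sample variance: sum of squared deviations divided by (n - 1).
var_samp = statistics.variance(values)
# Population variance: sum of squared deviations divided by n.
var_pop = statistics.pvariance(values)

print(round(var_samp, 4), var_pop)  # 1.6667 1.25
```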
Bug Fixes
- There is curator conflict problem when executing offline SQLs under certain circumstances. (#3090 @tobegit3hub)
- CREATE TABLE ... LIKE HIVE ... statement execution fails if there is no database in a table name. (#3063 @tobegit3hub)
- The CLI shows success even though CREATE TABLE ... LIKE ... execution fails. (#3080 @tobegit3hub)
- SELECT ... INTO ... statement execution fails in the offline mode if there is no source table. (#3116 @tobegit3hub)
- Compilation fails when SELECT * over two LAST JOIN. (#3117 @aceforeverd)
- The syncing job thread will enter an infinite loop if querying JOB_INFO fails. (#3169 @vagetablechicken)
- SQL deployment fails if there is more than one condition in a JOIN statement. (#3196 @vagetablechicken)
- Registered tables cannot be fetched when enabling SparkSQL for the offline engine (#3057 @tobegit3hub)
- Other minor bug fixes (#3097 #3095 @dl239, #3109 #3141 #3162 #3234 @aceforeverd, #3096 #3112 @tobegit3hub, #3231 #3251 @vagetablechicken)
Code Refactoring
v0.7.3
Features
- Support C/C++ based User-Defined Aggregated Functions (UDAFs) (#2825 @dl239)
- Improve the diagnostic tool to support a few new sub-commands (#3106 @vagetablechicken)
- Add a new script to modify the node environment configuration (#3142 @dl239)
- Change the default value of max_traverse_cnt to unlimited to avoid truncated results when performing queries in CLI (#2999 @dl239)
- Improve the documents (#3111 #3093 #3119 @selenachenjingxin, #3105 #3125 #3120 @vagetablechicken, #3114 #3126 @dl239, #3128 @lumianph)
Bug Fixes
- The user-provided SPARK_HOME does not work in the deployment scripts. (#3085 @zhanghaohit)
- The result of SELECT timestamp(string_val) is incorrect in the offline mode. (#3088 @tobegit3hub)
Code Refactoring
Note:
If the configuration of a tablet has not been updated when upgrading to this new version, the query result may still be truncated as in the old version (#2999).
v0.7.2
Features
- [Alpha] Support the new SQL clause WITH (#2846 @aceforeverd)
- Support deploying multiple TaskManagers (#3004 @zhanghaohit)
- Support the new built-in functions std, stddev, stddev_samp, stddev_pop, ew_avg and drawdown (#3025 #3032 #3029 @zhanghaohit)
- Add the new configurations to specify the maximum size of RocksDB's log files (#2991 @dl239)
- The CREATE TABLE ... LIKE PARQUET ... statement supports a parquet file as the input in the offline mode. (#2996 @tobegit3hub)
- Support showing query results of synchronous jobs in TaskManager (#3034 @vagetablechicken)
- Change the default timeout of synchronous jobs to 30 minutes, and add a corresponding CLI parameter for configuration (#3061 @vagetablechicken)
- Improve the documents (#2938 #2984 #3016 @vagetablechicken, #2958 #2973 #2980 #2987 #2988 #3035 @lumianph, #2990 @lukeAyin, #2997 #3065 @tobegit3hub, #3011 #3027 @dl239, #3020 #3066 #3071 #3074 @aceforeverd, #3033 #3036 @selenachenjingxin)
Bug Fixes
- Disk table does not clean the expired data. (#2963 @dl239)
- Incorrect index will be added if there is LAST JOIN statement in a deployed SQL. (#2979 @dl239)
- The result is incorrect if a window frame is specified by EXCLUDE CURRENT_ROW (#2930 @aceforeverd)
- SQL compiling fails if there is a UDF in a UDAF expression. (#3018 @aceforeverd)
- Although the return information indicates success, index creation may still fail in some cases. (#3042 @vagetablechicken)
- The recoverdata command fails if there are a large number of records in a memory table. (#3060 @dl239)
- The deploy-all tool deploys the Spark package to local nodes only. (#3022 @zhanghaohit)
- Other minor bug fixes (#2970 #3028 #3026 #3003 #3064 @dl239)
Code Refactoring
Note:
While we have resolved the overflow issue in the current version of the monitor component #3003, it may still persist when upgrading from an older version.
v0.7.1
Features
- Support data import from Hive using a symbolic link (#2948 @vagetablechicken)
- Support the new SQL statement CREATE TABLE LIKE (#2949 @aceforeverd, #2962 @tobegit3hub)
- Improve the non-interactive CLI (#2898 @vagetablechicken)
- Improve the documents (#2904 #2921 #2932 #2942 @selenachenjingxin, #2925 #2928 #2934 #2954 @vagetablechicken, #2924 @dl239, #2945 #2952 @lumianph, #2946 @aceforeverd)
Bug Fixes
- The result of _*_cate is incorrect. (#2939 @zhanghaohit)
- The deployment of SQL fails if the column name of a major table is a keyword. (#2894 @dl239)
- Tablet may core dump when executing SQLs with disk tables. (#2926 @dl239)
- There is memory leak when writing data into disk tables. (#2943 @dl239)
- The result of show components is incorrect in certain cases. (#2940 @dl239)
- Offline jobs execution fails in certain cases because the Curator component causes an incompatibility issue. (#2936 @tobegit3hub)
- Disabling the monitor log (#2953 @dl239)