Releases: apache/incubator-gluten
v1.3.0-preview
What's Changed
- VL Make velox writer queue size configurable @yikf #6341
- VL Remove useless ctx variable @gaoyangxiaozhu #6348
- [1632]CH Daily (20240706) @kyligence-git #6359
- VL fix build bundle package @zhouyuan #6364
- VL Fix process_setup_alinux3 arrow CMakeLists.txt path @liujiayi771 #6363
- VL Daily (2024_07_08) @GlutenPerfBot #6366
- [6262]CH Json input format ignore key case @KevinyhZou #6263
- [6285]VL Add debian10 vcpkg depends @wenwj0 #6286
- [CELEBORN] CelebornShuffleManager#stop should stop non-null _vanillaCelebornShuffleManager @SteNicholas #6371
- VL Update ubuntu docker to use cmake 3.28 @boneanxs #6373
- [6304]CH Support array_join @KevinyhZou #6305
- VL Daily (2024_07_09) @GlutenPerfBot #6376
- [6378]CH Support delta count optimizer for the MergeTree format @zzcclp #6379
- [6345]CH Deprecate SCALAR_FUNCTIONS in SerializedPlanParser @lgbo-ustc #6347
- [TEST] Use project version rather than Gluten version in Gluten it @ulysses-you #6385
- [6377]CH Support window function percent_rank @lgbo-ustc #6386
- VL Minor refactor for ValueStream node construction and usage @Yohahaha #6382
- VL Enable levenshtein function @zhli1142015 #6389
- VL Daily (2024_07_10) @GlutenPerfBot #6384
- [1632]CH Daily (20240710) @kyligence-git #6383
- Test input_file_name, input_file_block_start & input_file_block_length when scan falls back @gaoyangxiaozhu #6318
- [6394]VL Fix the vcpkg package script @weixiuli #6395
- [6288]CH Support BroadcastNestedLoopJoinExe[Part one] @loneylee #6290
- [CELEBORN] Rename CelebornHashBasedColumnarShuffleWriter to CelebornColumnarShuffleWriter @kerwin-zk #6391
- VL Fix E function fallback issue in some condition @gaoyangxiaozhu #6397
- [CI] Fix centos7 failure @marin-ma #6404
- [1632]CH Daily (20240711) @kyligence-git #6399
- [CELEBORN] Add compression for row-based shuffle @kerwin-zk #6380
- VL Daily (2024_07_11) @GlutenPerfBot #6400
- CORE Remove local sort for TopNRowNumber @ulysses-you #6381
- VL Spark assert_true function support @gaoyangxiaozhu #6329
- VL Add schema validation for all operators @zhli1142015 #6406
- CORE Minor code cleanups against fallback tagging @zhztheplayer #6320
- VL Try to find arrow libs from velox bundled path firstly @PHILO-HE #6413
- VL disable tpch benchmarks on comment/merge @zhouyuan #6402
- [UT] Add a tool to validate any unary expression with all its accepted types @PHILO-HE #6392
- CH Fix a source file name typo @zhztheplayer #6412
- VL Fix Pi function fallback issue in some condition @gaoyangxiaozhu #6408
- [CELEBORN] VeloxCelebornColumnarBatchSerializer uses the key and default value of SHUFFLE_COMPRESS to check whether to compress shuffle output @SteNicholas #6414
- VL Quick fix for commit conflicts @zhztheplayer #6418
- [Doc] Update new supported spark functions @gaoyangxiaozhu #6423
- VL Add a test to validate substring_index @boneanxs #6393
- VL Fix shuffle spill triggered by evicting buffers during stop @marin-ma #6422
- VL Enable repeat function @zhli1142015 #6419
- VL Accelerate Arrow compile @jinchengchenghh #6426
- [CI]VL Update docker image for CI @zhouyuan #6401
- VL Daily (2024_07_12) @GlutenPerfBot #6417
- VL Daily (2024_07_13) @GlutenPerfBot #6436
- VL Daily (2024_07_14) @GlutenPerfBot #6441
- VL Set Arrow_SOURCE to AUTO to allow using system arrow libs @PHILO-HE #6325
- [CELEBORN] CHCelebornColumnarShuffleWriter supports celeborn.client.spark.shuffle.writer to use memory sort shuffle in ClickHouse backend @SteNicholas #6432
- VL Make sure the same thrift lib bundled in arrow build is used for building Velox @zhztheplayer #6431
- CORE Make SparkSession transient in HiveTableScanExecTransformer @yikf #6410
- [6176]CH Add tpcds suite from decimal table schema @loneylee #6369
- VL Move dependencies setup ahead @PHILO-HE #6444
- CH[CELEBORN] CHCelebornColumnarShuffleWriter supports celeborn.client.spark.shuffle.writer to use memory sort shuffle in ClickHouse backend @SteNicholas #6454
- VL Enable right and anti join in smj @JkSelf #6449
- CH[CELEBORN] CHCelebornColumnarBatchSerializer uses AtomicBoolean to identify whether to call close() to avoid calling close() twice situation @SteNicholas #6455
- [CI]VL Re-enable a build job running on clean dockers weekly @PHILO-HE #6424
- CORE Update LICENSE, NOTICE, LICENSE-binary, NOTICE-binary @weiting-chen #6443
- CORE Change DISCLAIMER to DISCLAIMER-WIP @weiting-chen #6442
- VL RAS: Minor code cleanup for offloading project @zhztheplayer #6452
- VL Add a way to create static build with docker container and gluten-te @zhztheplayer #6457
- [6467]CH Minor Fix Build @baibaichen #6468
- VL Minor improvements and fixes for gluten-it and gluten-te @zhztheplayer #6471
- CORE Fix fallback for spark sequence function with literal array data as input @gaoyangxiaozhu #6433
- VL Fix offload input_file_name assert error @zml1206 #6390
- VL update docker image for cache-native-lib job @yma11 #6466
- [BUILD] Fix unbound variable @zml1206 #6474
- VL Daily (2024_07_16) @GlutenPerfBot #6460
- [6437][BUILD] Fix vcpkg setup-build-dependens.sh for centos @wecharyu #6438
- [6470]CH Fix Task not serializable error when inserting mergetree data @zzcclp #6473
- [6425]CH Support day time interval @lgbo-ustc #6456
- VL remove redundant code in parquet datasource to avoid memory leakage PR6430 @liujp #6462
- CORE Spark version function support @gaoyangxiaozhu #6469
- VL Daily (2024_07_17) @GlutenPerfBot #6479
- VL Minor improvements on gluten-it / gluten-te toolchains @zhztheplayer #6476
- CH Support merge MergeTree files @liuneng1994 #6472
- [6463]CH refactor the code of parsing join parameters @lgbo-ustc #6485
- [1632]CH Daily (20240718) @kyligence-git #6491
- VL Daily (2024_07_18) @GlutenPerfBot #6492
- [6495]VL Fix build issue: --build_arrow=ON wipes --build_type= setting silently @PHILO-HE #6498
- VL RAS: Make default rough cost model exhaustively offload computations @zhztheplayer #6493
- VL Print exception early when raised from ManagedReservationListener#unreserve @zhztheplayer https://github.com/apache/inc...
v1.2.1
Highlight
- 3 shuffle- and spill-related bug fixes
- 5 RSS (Celeborn, Uniffle) related bug fixes
- 4 compile- and package-related bug fixes
- 10 CI/CD-related bug fixes
- Move to OAP's Velox v1.2.2
- 4 major issues fixed in OAP's Velox
- More minor bug fixes; see the full list below
What's Changed
- [VL][1.2] Upgrade GHA artifacts version to 3 by @weiting-chen in #7293
- [VL] Port CI changes to branch-1.2 and pick simdjson related fix by @PHILO-HE in #7314
- [CORE][1.2] Bump branch-1.2 version to 1.2.1-SNAPSHOT by @weiting-chen in #7290
- [VL] Follow-up fix for #7314 on branch-1.2: skip data gen in oom test by @PHILO-HE in #7329
- [VL][1.2] Install devtoolset-9 for centos7 build native lib GHA by @wForget in #7678
- [GLUTEN-7037][VL][1.2] Add dwarf dependency to folly when building with vcpkg by @wForget in #7699
- [CK][1.2] Support trigger CK Backend CI/CD in branch1.2 by @weiting-chen in #7936
- [VL][1.2] Port 6563 6679 for build options and collectQueryExecutionFallbackSummary fix by @weiting-chen in #7919
- [VL][1.2] Port 6432 6657 for Celeborn bug fix in branch 1.2 by @weiting-chen in #7922
- [VL][1.2] Port #6573 #7025 #7132 by @weiting-chen in #7973
- [VL][1.2] Port #6560 #6569 #6730 #7117 for vcpkg issue fix by @weiting-chen in #7974
- [VL][1.2] Port #7121 #7448 by @weiting-chen in #7988
- [VL] Branch 1.2: Backport fixes for #7243 by @zhztheplayer in #7943
- [GLUTEN-7126][CORE][1.2] Fix issue that unsupported join type in BNLJ is not fallback by @ccat3z in #7569
- [GLUTEN-7126][VL][1.2] Port Fix shuffle spill triggered by evicting buffers during stop (#6422) by @kecookier in #7991
- [GLUTEN-7126][VL][1.2] Port #6698 #7525 #7560 for Uniffle bug fix by @weiting-chen in #7994
- [VL] Branch 1.2: Port #8047 to fix libelf link by @zhztheplayer in #8059
- [VL][Branch-1.2] Port #8034 & #8027 for fixing march flag setting and #8042 for fixing GHA failure on centos-7 by @PHILO-HE in #8075
- [CORE][BRANCH-1.2] Port #7861 to fix OOM in shuffle writer by @ccat3z in #8078
- [1.2] Preparing for Gluten v1.2.1-rc0 by @weiting-chen in #8110
Full Changelog: v1.2.0...v1.2.1
v1.2.1-rc0
What's Changed
- [VL][1.2] Upgrade GHA artifacts version to 3 by @weiting-chen in #7293
- [VL] Port CI changes to branch-1.2 and pick simdjson related fix by @PHILO-HE in #7314
- [CORE][1.2] Bump branch-1.2 version to 1.2.1-SNAPSHOT by @weiting-chen in #7290
- [VL] Follow-up fix for #7314 on branch-1.2: skip data gen in oom test by @PHILO-HE in #7329
- [VL][1.2] Install devtoolset-9 for centos7 build native lib GHA by @wForget in #7678
- [GLUTEN-7037][VL][1.2] Add dwarf dependency to folly when building with vcpkg by @wForget in #7699
- [CK][1.2] Support trigger CK Backend CI/CD in branch1.2 by @weiting-chen in #7936
- [VL][1.2] Port 6563 6679 for build options and collectQueryExecutionFallbackSummary fix by @weiting-chen in #7919
- [VL][1.2] Port 6432 6657 for Celeborn bug fix in branch 1.2 by @weiting-chen in #7922
- [VL][1.2] Port #6573 #7025 #7132 by @weiting-chen in #7973
- [VL][1.2] Port #6560 #6569 #6730 #7117 for vcpkg issue fix by @weiting-chen in #7974
- [VL][1.2] Port #7121 #7448 by @weiting-chen in #7988
- [VL] Branch 1.2: Backport fixes for #7243 by @zhztheplayer in #7943
- [GLUTEN-7126][CORE][1.2] Fix issue that unsupported join type in BNLJ is not fallback by @ccat3z in #7569
- [GLUTEN-7126][VL][1.2] Port Fix shuffle spill triggered by evicting buffers during stop (#6422) by @kecookier in #7991
- [GLUTEN-7126][VL][1.2] Port #6698 #7525 #7560 for Uniffle bug fix by @weiting-chen in #7994
- [VL] Branch 1.2: Port #8047 to fix libelf link by @zhztheplayer in #8059
- [VL][Branch-1.2] Port #8034 & #8027 for fixing march flag setting and #8042 for fixing GHA failure on centos-7 by @PHILO-HE in #8075
- [CORE][BRANCH-1.2] Port #7861 to fix OOM in shuffle writer by @ccat3z in #8078
- [1.2] Preparing for Gluten v1.2.1-rc0 by @weiting-chen in #8110
Full Changelog: v1.2.0...v1.2.1-rc0
v1.2.1-preview
What's Changed
- [VL][1.2] Upgrade GHA artifacts version to 3 by @weiting-chen in #7293
- [VL] Port CI changes to branch-1.2 and pick simdjson related fix by @PHILO-HE in #7314
- [CORE][1.2] Bump branch-1.2 version to 1.2.1-SNAPSHOT by @weiting-chen in #7290
- [VL] Follow-up fix for #7314 on branch-1.2: skip data gen in oom test by @PHILO-HE in #7329
- [VL][1.2] Install devtoolset-9 for centos7 build native lib GHA by @wForget in #7678
- [GLUTEN-7037][VL][1.2] Add dwarf dependency to folly when building with vcpkg by @wForget in #7699
- [CK][1.2] Support trigger CK Backend CI/CD in branch1.2 by @weiting-chen in #7936
- [VL][1.2] Port 6563 6679 for build options and collectQueryExecutionFallbackSummary fix by @weiting-chen in #7919
- [VL][1.2] Port 6432 6657 for Celeborn bug fix in branch 1.2 by @weiting-chen in #7922
- [VL][1.2] Port #6573 #7025 #7132 by @weiting-chen in #7973
- [VL][1.2] Port #6560 #6569 #6730 #7117 for vcpkg issue fix by @weiting-chen in #7974
- [VL][1.2] Port #7121 #7448 by @weiting-chen in #7988
- [VL] Branch 1.2: Backport fixes for #7243 by @zhztheplayer in #7943
- [GLUTEN-7126][CORE][1.2] Fix issue that unsupported join type in BNLJ is not fallback by @ccat3z in #7569
- [GLUTEN-7126][VL][1.2] Port Fix shuffle spill triggered by evicting buffers during stop (#6422) by @kecookier in #7991
- [GLUTEN-7126][VL][1.2] Port #6698 #7525 #7560 for Uniffle bug fix by @weiting-chen in #7994
Full Changelog: v1.2.0...v1.2.1-preview
v1.2.0
Release Notes - Gluten version 1.2.0
We are pleased to announce that Gluten v1.2.0 has been published as the first official Apache release.
Highlights (Velox backend only)
- Support Spark 3.2.2, 3.3.1, 3.4.2, and 3.5.1 with all UTs passing (where the data type is supported)
- Support 31 common Spark operators (based on Spark 3.2)
- Support 266 common Spark functions (based on Spark 3.2)
- Velox codebase updated to 2024/07/05
- New RSS support: add Apache Uniffle integration
- New data lake support: Iceberg, Delta Lake
- New file format support: CSV
- Enhanced CI workflow
- Refreshed documentation on the Gluten website (https://gluten.apache.org/)
- Improved stability for spill, OOM, and other cases
- More bug fixes
What's Changed
- [CORE] Move all columnar rules to post-columnar transitions by @zhztheplayer in #4790
- [GLUTEN-4398][FOLLOW] Mask PullOutPostProject and PullOutPreProject id by @zwangsheng in #4815
- [GLUTEN-2956][VL] Support Spark NullType by @PHILO-HE in #2996
- [CORE] Add logical link to rewritten spark plan by @ulysses-you in #4817
- [GLUTEN-4803][UT] Add Golden Files for TPC-H Spark33 + Gluten Execution Plan by @zwangsheng in #4804
- [VL] Allow replacing installed minio package by @PHILO-HE in #4825
- [VL] Daily Update Velox Version (2024_03_01) by @GlutenPerfBot in #4821
- [VL] Enable more tests of GlutenParquetIOSuite for Spark32/33/34 by @Yohahaha in #4823
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240302) by @lwz9103 in #4837
- [GLUTEN-4039][VL] support map_keys and map_values by @konjac in #4826
- [GLUTEN-4424][CORE] Upgrade spark version to 3.5.1 in Gluten by @JkSelf in #4822
- [VL] Daily Update Velox Version (2024_03_04) by @GlutenPerfBot in #4841
- [GLUTEN-4813] Replace resize/reserve to resize_extact/reserve_exact to reduce memory by @taiyang-li in #4824
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240305) by @lwz9103 in #4849
- [VL] Fix boost installation issue and remove useless QueryCtx by @PHILO-HE in #4850
- [VL] Enable "parquet v2 pages - delta encoding" test for Spark33/Spark34 by @Yohahaha in #4816
- [CORE] Support FileSourceScanExec driver metrics for spark3.4/3.5 by @zhli1142015 in #4848
- [GLUTEN-4772][VL] Support empty map/array literal by @WangGuangxin in #4771
- [GLUTEN-4860][CELEBORN] Replace celeborn link by @kerwin-zk in #4861
- [VL][CI] Fix CI failure related to Celeborn by @PHILO-HE in #4862
- [CORE] Support In list option contains non-foldable expression by @ulysses-you in #4843
- [VL] Daily Update Velox Version (2024_03_05) by @GlutenPerfBot in #4852
- [VL] Enable more tests in GlutenParquetQuerySuite for Spark32/33/34 by @Yohahaha in #4854
- [CORE] ColumnarShuffleExchangeExec should respect advisoryPartitionSize for Spark3.5 by @ulysses-you in #4865
- [GLUTEN-4853][CORE] Only trim Alias when its child is semantically equal to resAttr by @liujiayi771 in #4857
- [VL] minor change for delta ut by @zhli1142015 in #4869
- [VL] Add libsodium.so to thirdparty lib for CentOS8 by @kerwin-zk in #4870
- [VL] Updated documentation, refactoring and added more testcases for BNLJ by @Surbhi-Vijay in #4782
- [VL] Daily Update Velox Version (2024_03_06) by @GlutenPerfBot in #4868
- [MINOR] Remove ExtendedAnalysisException by @PHILO-HE in #4864
- [GLUTEN-4831][VL] Support StructType in HashAggregate by @WangGuangxin in #4832
- [VL] Support inline function by @marin-ma in #4847
- [VL] Add flushable decimal sum test case by @liujiayi771 in #4871
- [CORE] Add synchronized for ExplainUtils processPlan by @ulysses-you in #4876
- [VL] Rewrite collect_set and collect_list aggregate function by @ulysses-you in #4805
- [VL] Fix and use flattenVector by @marin-ma in #4783
- [VL] Enable tests of ParquetPartitionDisconverySuite for Spark33/34 by @Yohahaha in #4881
- [CORE] Minor adjustment to columnar rule list, and move all columnar sub-rules to one source folder by @zhztheplayer in #4863
- [VL] Merge Partial and PartialMerge logic in generateMergeCompanionNode by @liujiayi771 in #4883
- [CORE] Fix Spark-3.5 CI by @ulysses-you in #4886
- [GLUTEN-4424][CORE] Follow up upgrading spark version to 3.5.1 by @JkSelf in #4845
- Add .asf.yml by @yaooqinn in #4892
- Update Vulnerability Handling Process by @yaooqinn in #4894
- [VL] Daily Update Velox Version (2024_03_07) by @GlutenPerfBot in #4877
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240308) by @lwz9103 in #4890
- [CORE] ColumnarBroadcastExchangeExec should set/cancel with job tag for Spark3.5 by @ulysses-you in #4882
- [VL] Daily Update Velox Version (2024_03_08) by @GlutenPerfBot in #4895
- [VL] Pass partition id to velox functions by @zhli1142015 in #4344
- Add Incubation Standard Disclaimer by @yaooqinn in #4911
- [GLUTEN-4835][CORE] Match metric names with Spark by @clee704 in #4834
- [Gluten-4732][CH] delta-mergetree support update/delete/upsert/insert in a more native delta way by @binmahone in #4733
- [GLUTEN-4898][CH]Bug fix to date diff by @KevinyhZou in #4900
- [VL] Daily Update Velox Version(2024_03_11) by @GlutenPerfBot in #4908
- [DOC] Update release & configuration doc by @PHILO-HE in #4910
- [VL] Support lead window function by @ulysses-you in #4902
- [VL] Fix protobuf configure arguments in get_velox.sh by @liujiayi771 in #4920
- [Gluten-4918][CH]support CTAS for clickhouse table by @binmahone in #4919
- [GLUTEN-4926][CELEBORN] CelebornShuffleManager should remove shuffleId from columnarShuffleIds after unregistering shuffle by @SteNicholas in #4927
- [Gluten-4912][CH]Support Specifying columns in clickhouse tables to b… by @binmahone in #4925
- [Gluten-4706] [CH][CORE] Add a mode to execute count distinct directly instead o… by @binmahone in #4708
- [VL] Daily Update Velox Version (2024_03_12) by @GlutenPerfBot in #4923
- [GLUTEN-4914][CH] Fix exceptions in ASTParser by @taiyang-li in #4916
- [DOC] Minor fix for wrong gluten folder used in doc by @leoluan2009 in #4938
- [VL] Refine log plan/split json into one line by @Yohahaha in #4934
- [VL] Support posexplode function and code refactoring on GenerateExecTransformer by @marin-ma in #4901
- [CORE] Prior to #4893, add vanilla Spark's original scan source code to keep git history by @zhztheplayer in #4931
- [VL] Fix wrong plan equality due to case class inheritance by @zhztheplayer in #4893
- [GLUTEN-3559][VL] enable more sql query tests for Spark34 by @zhouyuan in #4880
- [VL] Daily Update Velox Version (2024_03_13) by @GlutenPerfBot in #4944
- [VL]Bucket join support for Iceberg tables by @SinghAsDev in #4859
- [GLUTEN-4827][UT] Add Golden Files for TPC-H Spark34 + Gluten Execution Plan by @zwangsheng in https://github.com/apache/i...
v1.2.0-rc3
What's Changed
- [CORE] Move all columnar rules to post-columnar transitions by @zhztheplayer in #4790
- [GLUTEN-4398][FOLLOW] Mask PullOutPostProject and PullOutPreProject id by @zwangsheng in #4815
- [GLUTEN-2956][VL] Support Spark NullType by @PHILO-HE in #2996
- [CORE] Add logical link to rewritten spark plan by @ulysses-you in #4817
- [GLUTEN-4803][UT] Add Golden Files for TPC-H Spark33 + Gluten Execution Plan by @zwangsheng in #4804
- [VL] Allow replacing installed minio package by @PHILO-HE in #4825
- [VL] Daily Update Velox Version (2024_03_01) by @GlutenPerfBot in #4821
- [VL] Enable more tests of GlutenParquetIOSuite for Spark32/33/34 by @Yohahaha in #4823
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240302) by @lwz9103 in #4837
- [GLUTEN-4039][VL] support map_keys and map_values by @konjac in #4826
- [GLUTEN-4424][CORE] Upgrade spark version to 3.5.1 in Gluten by @JkSelf in #4822
- [VL] Daily Update Velox Version (2024_03_04) by @GlutenPerfBot in #4841
- [GLUTEN-4813] Replace resize/reserve to resize_extact/reserve_exact to reduce memory by @taiyang-li in #4824
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240305) by @lwz9103 in #4849
- [VL] Fix boost installation issue and remove useless QueryCtx by @PHILO-HE in #4850
- [VL] Enable "parquet v2 pages - delta encoding" test for Spark33/Spark34 by @Yohahaha in #4816
- [CORE] Support FileSourceScanExec driver metrics for spark3.4/3.5 by @zhli1142015 in #4848
- [GLUTEN-4772][VL] Support empty map/array literal by @WangGuangxin in #4771
- [GLUTEN-4860][CELEBORN] Replace celeborn link by @kerwin-zk in #4861
- [VL][CI] Fix CI failure related to Celeborn by @PHILO-HE in #4862
- [CORE] Support In list option contains non-foldable expression by @ulysses-you in #4843
- [VL] Daily Update Velox Version (2024_03_05) by @GlutenPerfBot in #4852
- [VL] Enable more tests in GlutenParquetQuerySuite for Spark32/33/34 by @Yohahaha in #4854
- [CORE] ColumnarShuffleExchangeExec should respect advisoryPartitionSize for Spark3.5 by @ulysses-you in #4865
- [GLUTEN-4853][CORE] Only trim Alias when its child is semantically equal to resAttr by @liujiayi771 in #4857
- [VL] minor change for delta ut by @zhli1142015 in #4869
- [VL] Add libsodium.so to thirdparty lib for CentOS8 by @kerwin-zk in #4870
- [VL] Updated documentation, refactoring and added more testcases for BNLJ by @Surbhi-Vijay in #4782
- [VL] Daily Update Velox Version (2024_03_06) by @GlutenPerfBot in #4868
- [MINOR] Remove ExtendedAnalysisException by @PHILO-HE in #4864
- [GLUTEN-4831][VL] Support StructType in HashAggregate by @WangGuangxin in #4832
- [VL] Support inline function by @marin-ma in #4847
- [VL] Add flushable decimal sum test case by @liujiayi771 in #4871
- [CORE] Add synchronized for ExplainUtils processPlan by @ulysses-you in #4876
- [VL] Rewrite collect_set and collect_list aggregate function by @ulysses-you in #4805
- [VL] Fix and use flattenVector by @marin-ma in #4783
- [VL] Enable tests of ParquetPartitionDisconverySuite for Spark33/34 by @Yohahaha in #4881
- [CORE] Minor adjustment to columnar rule list, and move all columnar sub-rules to one source folder by @zhztheplayer in #4863
- [VL] Merge Partial and PartialMerge logic in generateMergeCompanionNode by @liujiayi771 in #4883
- [CORE] Fix Spark-3.5 CI by @ulysses-you in #4886
- [GLUTEN-4424][CORE] Follow up upgrading spark version to 3.5.1 by @JkSelf in #4845
- Add .asf.yml by @yaooqinn in #4892
- Update Vulnerability Handling Process by @yaooqinn in #4894
- [VL] Daily Update Velox Version (2024_03_07) by @GlutenPerfBot in #4877
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240308) by @lwz9103 in #4890
- [CORE] ColumnarBroadcastExchangeExec should set/cancel with job tag for Spark3.5 by @ulysses-you in #4882
- [VL] Daily Update Velox Version (2024_03_08) by @GlutenPerfBot in #4895
- [VL] Pass partition id to velox functions by @zhli1142015 in #4344
- Add Incubation Standard Disclaimer by @yaooqinn in #4911
- [GLUTEN-4835][CORE] Match metric names with Spark by @clee704 in #4834
- [Gluten-4732][CH] delta-mergetree support update/delete/upsert/insert in a more native delta way by @binmahone in #4733
- [GLUTEN-4898][CH]Bug fix to date diff by @KevinyhZou in #4900
- [VL] Daily Update Velox Version(2024_03_11) by @GlutenPerfBot in #4908
- [DOC] Update release & configuration doc by @PHILO-HE in #4910
- [VL] Support lead window function by @ulysses-you in #4902
- [VL] Fix protobuf configure arguments in get_velox.sh by @liujiayi771 in #4920
- [Gluten-4918][CH]support CTAS for clickhouse table by @binmahone in #4919
- [GLUTEN-4926][CELEBORN] CelebornShuffleManager should remove shuffleId from columnarShuffleIds after unregistering shuffle by @SteNicholas in #4927
- [Gluten-4912][CH]Support Specifying columns in clickhouse tables to b… by @binmahone in #4925
- [Gluten-4706] [CH][CORE] Add a mode to execute count distinct directly instead o… by @binmahone in #4708
- [VL] Daily Update Velox Version (2024_03_12) by @GlutenPerfBot in #4923
- [GLUTEN-4914][CH] Fix exceptions in ASTParser by @taiyang-li in #4916
- [DOC] Minor fix for wrong gluten folder used in doc by @leoluan2009 in #4938
- [VL] Refine log plan/split json into one line by @Yohahaha in #4934
- [VL] Support posexplode function and code refactoring on GenerateExecTransformer by @marin-ma in #4901
- [CORE] Prior to #4893, add vanilla Spark's original scan source code to keep git history by @zhztheplayer in #4931
- [VL] Fix wrong plan equality due to case class inheritance by @zhztheplayer in #4893
- [GLUTEN-3559][VL] enable more sql query tests for Spark34 by @zhouyuan in #4880
- [VL] Daily Update Velox Version (2024_03_13) by @GlutenPerfBot in #4944
- [VL]Bucket join support for Iceberg tables by @SinghAsDev in #4859
- [GLUTEN-4827][UT] Add Golden Files for TPC-H Spark34 + Gluten Execution Plan by @zwangsheng in #4828
- [VL] Verify unhex has been offloaded to native successfully by @Yohahaha in #4937
- [VL] Support skewness aggregate function by @liujiayi771 in #4939
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240314) by @lwz9103 in #4948
- [VL] parquet file metadata columns support in velox by @gaoyangxiaozhu in #3870
- [VL] Daily Update Velox Version (2024_03_14) by @GlutenPerfBot in #4949
- [VL] Untangle code of TransformPreOverrides by @zhztheplayer in ht...
v1.2.0-rc2
What's Changed
- [CORE] Move all columnar rules to post-columnar transitions by @zhztheplayer in #4790
- [GLUTEN-4398][FOLLOW] Mask PullOutPostProject and PullOutPreProject id by @zwangsheng in #4815
- [GLUTEN-2956][VL] Support Spark NullType by @PHILO-HE in #2996
- [CORE] Add logical link to rewritten spark plan by @ulysses-you in #4817
- [GLUTEN-4803][UT] Add Golden Files for TPC-H Spark33 + Gluten Execution Plan by @zwangsheng in #4804
- [VL] Allow replacing installed minio package by @PHILO-HE in #4825
- [VL] Daily Update Velox Version (2024_03_01) by @GlutenPerfBot in #4821
- [VL] Enable more tests of GlutenParquetIOSuite for Spark32/33/34 by @Yohahaha in #4823
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240302) by @lwz9103 in #4837
- [GLUTEN-4039][VL] support map_keys and map_values by @konjac in #4826
- [GLUTEN-4424][CORE] Upgrade spark version to 3.5.1 in Gluten by @JkSelf in #4822
- [VL] Daily Update Velox Version (2024_03_04) by @GlutenPerfBot in #4841
- [GLUTEN-4813] Replace resize/reserve to resize_extact/reserve_exact to reduce memory by @taiyang-li in #4824
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240305) by @lwz9103 in #4849
- [VL] Fix boost installation issue and remove useless QueryCtx by @PHILO-HE in #4850
- [VL] Enable "parquet v2 pages - delta encoding" test for Spark33/Spark34 by @Yohahaha in #4816
- [CORE] Support FileSourceScanExec driver metrics for spark3.4/3.5 by @zhli1142015 in #4848
- [GLUTEN-4772][VL] Support empty map/array literal by @WangGuangxin in #4771
- [GLUTEN-4860][CELEBORN] Replace celeborn link by @kerwin-zk in #4861
- [VL][CI] Fix CI failure related to Celeborn by @PHILO-HE in #4862
- [CORE] Support In list option contains non-foldable expression by @ulysses-you in #4843
- [VL] Daily Update Velox Version (2024_03_05) by @GlutenPerfBot in #4852
- [VL] Enable more tests in GlutenParquetQuerySuite for Spark32/33/34 by @Yohahaha in #4854
- [CORE] ColumnarShuffleExchangeExec should respect advisoryPartitionSize for Spark3.5 by @ulysses-you in #4865
- [GLUTEN-4853][CORE] Only trim Alias when its child is semantically equal to resAttr by @liujiayi771 in #4857
- [VL] minor change for delta ut by @zhli1142015 in #4869
- [VL] Add libsodium.so to thirdparty lib for CentOS8 by @kerwin-zk in #4870
- [VL] Updated documentation, refactoring and added more testcases for BNLJ by @Surbhi-Vijay in #4782
- [VL] Daily Update Velox Version (2024_03_06) by @GlutenPerfBot in #4868
- [MINOR] Remove ExtendedAnalysisException by @PHILO-HE in #4864
- [GLUTEN-4831][VL] Support StructType in HashAggregate by @WangGuangxin in #4832
- [VL] Support inline function by @marin-ma in #4847
- [VL] Add flushable decimal sum test case by @liujiayi771 in #4871
- [CORE] Add synchronized for ExplainUtils processPlan by @ulysses-you in #4876
- [VL] Rewrite collect_set and collect_list aggregate function by @ulysses-you in #4805
- [VL] Fix and use flattenVector by @marin-ma in #4783
- [VL] Enable tests of ParquetPartitionDisconverySuite for Spark33/34 by @Yohahaha in #4881
- [CORE] Minor adjustment to columnar rule list, and move all columnar sub-rules to one source folder by @zhztheplayer in #4863
- [VL] Merge Partial and PartialMerge logic in generateMergeCompanionNode by @liujiayi771 in #4883
- [CORE] Fix Spark-3.5 CI by @ulysses-you in #4886
- [GLUTEN-4424][CORE] Follow up upgrading spark version to 3.5.1 by @JkSelf in #4845
- Add .asf.yml by @yaooqinn in #4892
- Update Vulnerability Handling Process by @yaooqinn in #4894
- [VL] Daily Update Velox Version (2024_03_07) by @GlutenPerfBot in #4877
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240308) by @lwz9103 in #4890
- [CORE] ColumnarBroadcastExchangeExec should set/cancel with job tag for Spark3.5 by @ulysses-you in #4882
- [VL] Daily Update Velox Version (2024_03_08) by @GlutenPerfBot in #4895
- [VL] Pass partition id to velox functions by @zhli1142015 in #4344
- Add Incubation Standard Disclaimer by @yaooqinn in #4911
- [GLUTEN-4835][CORE] Match metric names with Spark by @clee704 in #4834
- [Gluten-4732][CH] delta-mergetree support update/delete/upsert/insert in a more native delta way by @binmahone in #4733
- [GLUTEN-4898][CH]Bug fix to date diff by @KevinyhZou in #4900
- [VL] Daily Update Velox Version(2024_03_11) by @GlutenPerfBot in #4908
- [DOC] Update release & configuration doc by @PHILO-HE in #4910
- [VL] Support lead window function by @ulysses-you in #4902
- [VL] Fix protobuf configure arguments in get_velox.sh by @liujiayi771 in #4920
- [Gluten-4918][CH]support CTAS for clickhouse table by @binmahone in #4919
- [GLUTEN-4926][CELEBORN] CelebornShuffleManager should remove shuffleId from columnarShuffleIds after unregistering shuffle by @SteNicholas in #4927
- [Gluten-4912][CH]Support Specifying columns in clickhouse tables to b… by @binmahone in #4925
- [Gluten-4706] [CH][CORE] Add a mode to execute count distinct directly instead o… by @binmahone in #4708
- [VL] Daily Update Velox Version (2024_03_12) by @GlutenPerfBot in #4923
- [GLUTEN-4914][CH] Fix exceptions in ASTParser by @taiyang-li in #4916
- [DOC] Minor fix for wrong gluten folder used in doc by @leoluan2009 in #4938
- [VL] Refine log plan/split json into one line by @Yohahaha in #4934
- [VL] Support posexplode function and code refactoring on GenerateExecTransformer by @marin-ma in #4901
- [CORE] Prior to #4893, add vanilla Spark's original scan source code to keep git history by @zhztheplayer in #4931
- [VL] Fix wrong plan equality due to case class inheritance by @zhztheplayer in #4893
- [GLUTEN-3559][VL] enable more sql query tests for Spark34 by @zhouyuan in #4880
- [VL] Daily Update Velox Version (2024_03_13) by @GlutenPerfBot in #4944
- [VL]Bucket join support for Iceberg tables by @SinghAsDev in #4859
- [GLUTEN-4827][UT] Add Golden Files for TPC-H Spark34 + Gluten Execution Plan by @zwangsheng in #4828
- [VL] Verify unhex has been offloaded to native successfully by @Yohahaha in #4937
- [VL] Support skewness aggregate function by @liujiayi771 in #4939
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240314) by @lwz9103 in #4948
- [VL] parquet file metadata columns support in velox by @gaoyangxiaozhu in #3870
- [VL] Daily Update Velox Version (2024_03_14) by @GlutenPerfBot in #4949
- [VL] Untangle code of TransformPreOverrides by @zhztheplayer in ht...
v1.2.0-rc1
What's Changed
- [CORE] Move all columnar rules to post-columnar transitions by @zhztheplayer in #4790
- [GLUTEN-4398][FOLLOW] Mask PullOutPostProject and PullOutPreProject id by @zwangsheng in #4815
- [GLUTEN-2956][VL] Support Spark NullType by @PHILO-HE in #2996
- [CORE] Add logical link to rewritten spark plan by @ulysses-you in #4817
- [GLUTEN-4803][UT] Add Golden Files for TPC-H Spark33 + Gluten Execution Plan by @zwangsheng in #4804
- [VL] Allow replacing installed minio package by @PHILO-HE in #4825
- [VL] Daily Update Velox Version (2024_03_01) by @GlutenPerfBot in #4821
- [VL] Enable more tests of GlutenParquetIOSuite for Spark32/33/34 by @Yohahaha in #4823
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240302) by @lwz9103 in #4837
- [GLUTEN-4039][VL] support map_keys and map_values by @konjac in #4826
- [GLUTEN-4424][CORE] Upgrade spark version to 3.5.1 in Gluten by @JkSelf in #4822
- [VL] Daily Update Velox Version (2024_03_04) by @GlutenPerfBot in #4841
- [GLUTEN-4813] Replace resize/reserve to resize_extact/reserve_exact to reduce memory by @taiyang-li in #4824
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240305) by @lwz9103 in #4849
- [VL] Fix boost installation issue and remove useless QueryCtx by @PHILO-HE in #4850
- [VL] Enable "parquet v2 pages - delta encoding" test for Spark33/Spark34 by @Yohahaha in #4816
- [CORE] Support FileSourceScanExec driver metrics for spark3.4/3.5 by @zhli1142015 in #4848
- [GLUTEN-4772][VL] Support empty map/array literal by @WangGuangxin in #4771
- [GLUTEN-4860][CELEBORN] Replace celeborn link by @kerwin-zk in #4861
- [VL][CI] Fix CI failure related to Celeborn by @PHILO-HE in #4862
- [CORE] Support In list option contains non-foldable expression by @ulysses-you in #4843
- [VL] Daily Update Velox Version (2024_03_05) by @GlutenPerfBot in #4852
- [VL] Enable more tests in GlutenParquetQuerySuite for Spark32/33/34 by @Yohahaha in #4854
- [CORE] ColumnarShuffleExchangeExec should respect advisoryPartitionSize for Spark3.5 by @ulysses-you in #4865
- [GLUTEN-4853][CORE] Only trim Alias when its child is semantically equal to resAttr by @liujiayi771 in #4857
- [VL] minor change for delta ut by @zhli1142015 in #4869
- [VL] Add libsodium.so to thirdparty lib for CentOS8 by @kerwin-zk in #4870
- [VL] Updated documentation, refactoring and added more testcases for BNLJ by @Surbhi-Vijay in #4782
- [VL] Daily Update Velox Version (2024_03_06) by @GlutenPerfBot in #4868
- [MINOR] Remove ExtendedAnalysisException by @PHILO-HE in #4864
- [GLUTEN-4831][VL] Support StructType in HashAggregate by @WangGuangxin in #4832
- [VL] Support inline function by @marin-ma in #4847
- [VL] Add flushable decimal sum test case by @liujiayi771 in #4871
- [CORE] Add synchronized for ExplainUtils processPlan by @ulysses-you in #4876
- [VL] Rewrite collect_set and collect_list aggregate function by @ulysses-you in #4805
- [VL] Fix and use flattenVector by @marin-ma in #4783
- [VL] Enable tests of ParquetPartitionDisconverySuite for Spark33/34 by @Yohahaha in #4881
- [CORE] Minor adjustment to columnar rule list, and move all columnar sub-rules to one source folder by @zhztheplayer in #4863
- [VL] Merge Partial and PartialMerge logic in generateMergeCompanionNode by @liujiayi771 in #4883
- [CORE] Fix Spark-3.5 CI by @ulysses-you in #4886
- [GLUTEN-4424][CORE] Follow up upgrading spark version to 3.5.1 by @JkSelf in #4845
- Add .asf.yml by @yaooqinn in #4892
- Update Vulnerability Handling Process by @yaooqinn in #4894
- [VL] Daily Update Velox Version (2024_03_07) by @GlutenPerfBot in #4877
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240308) by @lwz9103 in #4890
- [CORE] ColumnarBroadcastExchangeExec should set/cancel with job tag for Spark3.5 by @ulysses-you in #4882
- [VL] Daily Update Velox Version (2024_03_08) by @GlutenPerfBot in #4895
- [VL] Pass partition id to velox functions by @zhli1142015 in #4344
- Add Incubation Standard Disclaimer by @yaooqinn in #4911
- [GLUTEN-4835][CORE] Match metric names with Spark by @clee704 in #4834
- [Gluten-4732][CH] delta-mergetree support update/delete/upsert/insert in a more native delta way by @binmahone in #4733
- [GLUTEN-4898][CH]Bug fix to date diff by @KevinyhZou in #4900
- [VL] Daily Update Velox Version(2024_03_11) by @GlutenPerfBot in #4908
- [DOC] Update release & configuration doc by @PHILO-HE in #4910
- [VL] Support lead window function by @ulysses-you in #4902
- [VL] Fix protobuf configure arguments in get_velox.sh by @liujiayi771 in #4920
- [Gluten-4918][CH]support CTAS for clickhouse table by @binmahone in #4919
- [GLUTEN-4926][CELEBORN] CelebornShuffleManager should remove shuffleId from columnarShuffleIds after unregistering shuffle by @SteNicholas in #4927
- [Gluten-4912][CH]Support Specifying columns in clickhouse tables to b… by @binmahone in #4925
- [Gluten-4706] [CH][CORE] Add a mode to execute count distinct directly instead o… by @binmahone in #4708
- [VL] Daily Update Velox Version (2024_03_12) by @GlutenPerfBot in #4923
- [GLUTEN-4914][CH] Fix exceptions in ASTParser by @taiyang-li in #4916
- [DOC] Minor fix for wrong gluten folder used in doc by @leoluan2009 in #4938
- [VL] Refine log plan/split json into one line by @Yohahaha in #4934
- [VL] Support posexplode function and code refactoring on GenerateExecTransformer by @marin-ma in #4901
- [CORE] Prior to #4893, add vanilla Spark's original scan source code to keep git history by @zhztheplayer in #4931
- [VL] Fix wrong plan equality due to case class inheritance by @zhztheplayer in #4893
- [GLUTEN-3559][VL] enable more sql query tests for Spark34 by @zhouyuan in #4880
- [VL] Daily Update Velox Version (2024_03_13) by @GlutenPerfBot in #4944
- [VL]Bucket join support for Iceberg tables by @SinghAsDev in #4859
- [GLUTEN-4827][UT] Add Golden Files for TPC-H Spark34 + Gluten Execution Plan by @zwangsheng in #4828
- [VL] Verify unhex has been offloaded to native successfully by @Yohahaha in #4937
- [VL] Support skewness aggregate function by @liujiayi771 in #4939
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240314) by @lwz9103 in #4948
- [VL] parquet file metadata columns support in velox by @gaoyangxiaozhu in #3870
- [VL] Daily Update Velox Version (2024_03_14) by @GlutenPerfBot in #4949
- [VL] Untangle code of TransformPreOverrides by @zhztheplayer in ht...
v1.2.0-rc0
What's Changed
- [CORE] Move all columnar rules to post-columnar transitions by @zhztheplayer in #4790
- [GLUTEN-4398][FOLLOW] Mask PullOutPostProject and PullOutPreProject id by @zwangsheng in #4815
- [GLUTEN-2956][VL] Support Spark NullType by @PHILO-HE in #2996
- [CORE] Add logical link to rewritten spark plan by @ulysses-you in #4817
- [GLUTEN-4803][UT] Add Golden Files for TPC-H Spark33 + Gluten Execution Plan by @zwangsheng in #4804
- [VL] Allow replacing installed minio package by @PHILO-HE in #4825
- [VL] Daily Update Velox Version (2024_03_01) by @GlutenPerfBot in #4821
- [VL] Enable more tests of GlutenParquetIOSuite for Spark32/33/34 by @Yohahaha in #4823
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240302) by @lwz9103 in #4837
- [GLUTEN-4039][VL] support map_keys and map_values by @konjac in #4826
- [GLUTEN-4424][CORE] Upgrade spark version to 3.5.1 in Gluten by @JkSelf in #4822
- [VL] Daily Update Velox Version (2024_03_04) by @GlutenPerfBot in #4841
- [GLUTEN-4813] Replace resize/reserve to resize_extact/reserve_exact to reduce memory by @taiyang-li in #4824
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240305) by @lwz9103 in #4849
- [VL] Fix boost installation issue and remove useless QueryCtx by @PHILO-HE in #4850
- [VL] Enable "parquet v2 pages - delta encoding" test for Spark33/Spark34 by @Yohahaha in #4816
- [CORE] Support FileSourceScanExec driver metrics for spark3.4/3.5 by @zhli1142015 in #4848
- [GLUTEN-4772][VL] Support empty map/array literal by @WangGuangxin in #4771
- [GLUTEN-4860][CELEBORN] Replace celeborn link by @kerwin-zk in #4861
- [VL][CI] Fix CI failure related to Celeborn by @PHILO-HE in #4862
- [CORE] Support In list option contains non-foldable expression by @ulysses-you in #4843
- [VL] Daily Update Velox Version (2024_03_05) by @GlutenPerfBot in #4852
- [VL] Enable more tests in GlutenParquetQuerySuite for Spark32/33/34 by @Yohahaha in #4854
- [CORE] ColumnarShuffleExchangeExec should respect advisoryPartitionSize for Spark3.5 by @ulysses-you in #4865
- [GLUTEN-4853][CORE] Only trim Alias when its child is semantically equal to resAttr by @liujiayi771 in #4857
- [VL] minor change for delta ut by @zhli1142015 in #4869
- [VL] Add libsodium.so to thirdparty lib for CentOS8 by @kerwin-zk in #4870
- [VL] Updated documentation, refactoring and added more testcases for BNLJ by @Surbhi-Vijay in #4782
- [VL] Daily Update Velox Version (2024_03_06) by @GlutenPerfBot in #4868
- [MINOR] Remove ExtendedAnalysisException by @PHILO-HE in #4864
- [GLUTEN-4831][VL] Support StructType in HashAggregate by @WangGuangxin in #4832
- [VL] Support inline function by @marin-ma in #4847
- [VL] Add flushable decimal sum test case by @liujiayi771 in #4871
- [CORE] Add synchronized for ExplainUtils processPlan by @ulysses-you in #4876
- [VL] Rewrite collect_set and collect_list aggregate function by @ulysses-you in #4805
- [VL] Fix and use flattenVector by @marin-ma in #4783
- [VL] Enable tests of ParquetPartitionDisconverySuite for Spark33/34 by @Yohahaha in #4881
- [CORE] Minor adjustment to columnar rule list, and move all columnar sub-rules to one source folder by @zhztheplayer in #4863
- [VL] Merge Partial and PartialMerge logic in generateMergeCompanionNode by @liujiayi771 in #4883
- [CORE] Fix Spark-3.5 CI by @ulysses-you in #4886
- [GLUTEN-4424][CORE] Follow up upgrading spark version to 3.5.1 by @JkSelf in #4845
- Add .asf.yml by @yaooqinn in #4892
- Update Vulnerability Handling Process by @yaooqinn in #4894
- [VL] Daily Update Velox Version (2024_03_07) by @GlutenPerfBot in #4877
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240308) by @lwz9103 in #4890
- [CORE] ColumnarBroadcastExchangeExec should set/cancel with job tag for Spark3.5 by @ulysses-you in #4882
- [VL] Daily Update Velox Version (2024_03_08) by @GlutenPerfBot in #4895
- [VL] Pass partition id to velox functions by @zhli1142015 in #4344
- Add Incubation Standard Disclaimer by @yaooqinn in #4911
- [GLUTEN-4835][CORE] Match metric names with Spark by @clee704 in #4834
- [Gluten-4732][CH] delta-mergetree support update/delete/upsert/insert in a more native delta way by @binmahone in #4733
- [GLUTEN-4898][CH]Bug fix to date diff by @KevinyhZou in #4900
- [VL] Daily Update Velox Version(2024_03_11) by @GlutenPerfBot in #4908
- [DOC] Update release & configuration doc by @PHILO-HE in #4910
- [VL] Support lead window function by @ulysses-you in #4902
- [VL] Fix protobuf configure arguments in get_velox.sh by @liujiayi771 in #4920
- [Gluten-4918][CH]support CTAS for clickhouse table by @binmahone in #4919
- [GLUTEN-4926][CELEBORN] CelebornShuffleManager should remove shuffleId from columnarShuffleIds after unregistering shuffle by @SteNicholas in #4927
- [Gluten-4912][CH]Support Specifying columns in clickhouse tables to b… by @binmahone in #4925
- [Gluten-4706] [CH][CORE] Add a mode to execute count distinct directly instead o… by @binmahone in #4708
- [VL] Daily Update Velox Version (2024_03_12) by @GlutenPerfBot in #4923
- [GLUTEN-4914][CH] Fix exceptions in ASTParser by @taiyang-li in #4916
- [DOC] Minor fix for wrong gluten folder used in doc by @leoluan2009 in #4938
- [VL] Refine log plan/split json into one line by @Yohahaha in #4934
- [VL] Support posexplode function and code refactoring on GenerateExecTransformer by @marin-ma in #4901
- [CORE] Prior to #4893, add vanilla Spark's original scan source code to keep git history by @zhztheplayer in #4931
- [VL] Fix wrong plan equality due to case class inheritance by @zhztheplayer in #4893
- [GLUTEN-3559][VL] enable more sql query tests for Spark34 by @zhouyuan in #4880
- [VL] Daily Update Velox Version (2024_03_13) by @GlutenPerfBot in #4944
- [VL]Bucket join support for Iceberg tables by @SinghAsDev in #4859
- [GLUTEN-4827][UT] Add Golden Files for TPC-H Spark34 + Gluten Execution Plan by @zwangsheng in #4828
- [VL] Verify unhex has been offloaded to native successfully by @Yohahaha in #4937
- [VL] Support skewness aggregate function by @liujiayi771 in #4939
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20240314) by @lwz9103 in #4948
- [VL] parquet file metadata columns support in velox by @gaoyangxiaozhu in #3870
- [VL] Daily Update Velox Version (2024_03_14) by @GlutenPerfBot in #4949
- [VL] Untangle code of TransformPreOverrides by @zhztheplayer in ht...
v1.1.1
Release Notes - Gluten - Version 1.1.1
We are pleased to announce that Gluten has been accepted as an Apache Incubating project. Additionally, we are excited to unveil the release of Gluten-1.1.1. This version marks the final release before our transition to Apache.
Highlights (Velox backend only)
- Support Spark 3.2, 3.3, and 3.4 (API only)
- Support 30 common Spark Operators
- Support 220 common Spark Functions
- Velox codebase updated to 2024/02/29
- Refactor Data Lake API to support Delta Lake Scan and Iceberg read COW table
- Better S3, GCS support
- More stability in Spill support
- Enhance metric support for spill and shuffle, and add further metrics
- Enhance fallback case support by expanding coverage for missing cases and updating messages accordingly
- Enhance shuffle, including merge before compression, push-based shuffle, and more
- More bug fixes
What's Changed
- [GLUTEN-3855][VL] Fix ORC related failed UT by @chenxu14 in #3805
- [VL] Support IsNull filter pushdown by @rui-mo in #3791
- [VL] Update velox-backend-limitations.md by @FelixYBW in #3639
- [GLUTEN-2169][VL] Enable GlutenEnsureRequirementsSuite in unit tests by @JkSelf in #3860
- [CH] Fix exception of pb MessageToJsonString by @exmy in #3823
- [GLUTTEN-3851][VL] Add remaining filter time metric by @zhli1142015 in #3852
- [VL] Support ignoreNulls for NthValue window function by @PHILO-HE in #3857
- [VL] Enable using static link for QAT by @marin-ma in #3863
- [VL] Fix assertion failures when mixing use of partial aggregation spilling and flushing by @zhztheplayer in #3872
- [GLUTEN-3796][VL][FOLLOW_UP] Correct test name match and move black list to exclude in VeloxTestSettings by @zwangsheng in #3874
- [GLUTEN-3528][VL] Construct unique & non-overlapping partition/sort keys for window operator by @PHILO-HE in #3883
- [GLUTEN-3879][CH] salt 1% of TPCH-1 data to NULL instead of 10% by @binmahone in #3880
- [VL] Doc refresh by @zhouyuan in #3882
- [GLUTEN-3865][CH] Refactor aggregating without keys by @lgbo-ustc in #3866
- [GLUTEN-3722][CH] Improve shuffle writer by @taiyang-li in #3728
- [VL] Map date_format to a Velox function name by @PHILO-HE in #3878
- [VL]Daily Update Velox Version (20231129) by @yma11 in #3877
- [CORE] Add InputIteratorTransformer to decouple ReadRel and iterator index by @ulysses-you in #3854
- [GLUTEN-3732][VL] Use arrow result-returning variants FileWriter::Open API by @yangzhg in #3733
- [CORE] Move validate methods from TransformerApi to ValidatorApi by @exmy in #3881
- [GLUTEN-3824][CH]Bug fix hdfs path contains space by @KevinyhZou in #3825
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20231201) by @lwz9103 in #3898
- [VL] Break up spilling operation to two phases: shrink phase and spill phase by @zhztheplayer in #3895
- [GLUTEN-1699][VL] Support loadLibFromJar on RedHat 7/8 by @ychris78 in #3893
- [GLUTEN-3906] [VL] fix: fix package.sh failed for x86 by @lzjqsdd in #3907
- [GLUTEN-3750][CH]Bug fix json parse error by @KevinyhZou in #3751
- [GLUTEN-3902][VL] Add documentation to configure the Velox+GCS connector by @tigrux in #3902
- [DOC] Revise Gluten document by @PHILO-HE in #3892
- [VL]Daily Update Velox Version (20231203) by @yma11 in #3913
- [VL] Minor improvements for CI stale bot by @zhztheplayer in #3888
- [VL] Avoid reapplying code patches for external projects when ENABLE_EP_CACHE=ON by @zhztheplayer in #3916
- [VL] minor change for fallback log by @zhli1142015 in #3919
- [VL] Add sort merge join metrics by @ulysses-you in #3920
- [GLUTEN-3378][CORE] Datasource V2 data lake read support by @liujiayi771 in #3843
- [VL] ENABLE_EP_CACHE=ON still uses cached Velox build although the build arguments were changed by @zhztheplayer in #3926
- [VL] Make bloom_filter_agg fall back when might_contain is not transformable by @zhli1142015 in #3917
- [VL][CI] update docker build script by @zhouyuan in #3904
- [GLUTEN-3917][FOLLOWUP] Add back SparkShimLoader import by @ulysses-you in #3940
- [VL] Fix VeloxTPCHV1BhjSuite and VeloxTPCHV2Suite useV1SourceList by @liujiayi771 in #3930
- [VL] Fix syntax error in stale.yml by @zhztheplayer in #3945
- [GLUTEN-3854][CORE][FOLLOWUP] Add ColumnarInputAdapter back to recover UI graph by @ulysses-you in #3933
- [GLUTEN-1632][CH]Daily Update Clickhouse Version (20231206) by @lwz9103 in #3938
- [VL] Add output row metric for InputIteratorTransformer by @Yohahaha in #3939
- [GLUTEN-3927][CH] Improve the performance of element_at by @taiyang-li in #3928
- [GLUTEN-3908][CH] Improve shuffle split for clickhouse backend by remove ColumnNullable's memcmp by @KevinyhZou in #3909
- [GLUTEN-3924][CORE] Match hive UDF name in case-insensitive mode during expression transformation by @taiyang-li in #3925
- [GLUTEN-3958] Use getDeclaredConstructor().newInstance() in ScanTransformerFactory by @liujiayi771 in #3961
- [GLUTEN-3944][CH]Fix gluten.jar with delta20 when use spark 3.3 by @lwz9103 in #3947
- [VL] gluten-te: In dockerfiles, use symbolic link for /opt/velox by @zhztheplayer in #3946
- [VL]Daily Update Velox Version (20231206) by @yma11 in #3954
- Revert "[GLUTEN-3908][CH] Improve shuffle split for clickhouse backend by remove ColumnNullable's memcmp" by @baibaichen in #3965
- [GLUTEN-3890][CH] Respect spill_threshold for all buffers in shuffle writer by @taiyang-li in #3891
- [CORE] Fix wrong fallback cost by @ulysses-you in #3967
- [GLUTEN-3922][CH] Fix incorrect shuffle hash id value when executing modulo by @zzcclp in #3923
- [VL] quick fix for static build git conflict by @zhouyuan in #3971
- [GLUTEN-3486][CH] Fix AQE cannot coalesce shuffle partitions by @exmy in #3941
- [GLUTEN-3949][CH] Merge small blocks from upstream phase into a large one by @lgbo-ustc in #3952
- [GLUTEN-3948][CH] Fix exception and diff of trunc function by @exmy in #3968
- [GLUTEN-3979][CORE] Use exists() instead of map().exists() to improve code readability by @dcoliversun in #3980
- [VL]Daily Update Velox Version (20231208) by @yma11 in #3973
- Revert "[VL] Make bloom_filter_agg fall back when might_contain is not transformable (#3917)" by @loneylee in #3977
- [GLUTEN-3580][VL] support read data from abfs with account key by @gaoyangxiaozhu in #3897
- [GLUTEN-3991][CH] Fix the incorrect display name for the mergetree file format by @zzcclp in #3992
- [VL] gluten-te: Enable BuildKit to support --cache-from by @zhztheplayer in #3964
- [GLUTEN-3841][CH] Support spill in 2nd aggregate stage by @lgbo-ustc in #3772
- [VL] Daily Update Velox Version (20231211) by @zhztheplayer in #3999
- [VL] Fix StringToMap test failure by @PHILO-HE in #3995
- [VL] Make bloom_filter_agg fall back when might_contain is not transformable by @zhli1142015 in #3994
- [VL] Following #3996, fix CI error "Runtime factory already registered" by @zhztheplayer in #4001
- [VL] Fix linking simdjson error when building benchmark by @PHILO-HE in #3960
- [GLUTEN-4002][CH] Update InputIteratorTransformer metrics by @zzcclp in https://github.com/...