Detect multiple jars on the classpath when init plugin [databricks] #9654

thirtiseven · 2023-11-07T08:29:17Z

This PR detects if there are multiple plugin/cudf/jni jars on the classpath when initializing RapidsDriverPlugin and RapidsExecutorPlugin and throws an exception if so. This PR also added a config to allow this behavior.

Note that this PR did not accommodate for the fact that we might get a duplication spread over user and system classloaders. So one in a system classloader via extraClassPath or because it's in $SPARK_HOME/jars, and another one via spark.jars. I will try to check this or file a follow-on issue on this.

Note that the exception will only be thrown if the first plugin jar on the classpath is after this PR, so the same issue may occur again in the short term.

Tests

$ spark-shell --master local[*] --jars $(find ~/spark-rapids/dist/ -name '*spark*.jar' | xargs printf "%s,") --conf spark.plugins=com.nvidia.spark.SQLPlugin --conf spark.rapids.sql.allowMultipleJars=true

23/11/15 15:41:00 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/11/15 15:41:05 WARN RapidsPluginUtils: RAPIDS Accelerator 23.12.0-SNAPSHOT using cudf 23.12.0-SNAPSHOT.
23/11/15 15:41:05 WARN RapidsPluginUtils: Multiple rapids-4-spark jars with different revision found in the classpath:
revison: ded091e8777a137b41964c6aad4953421f8ca7ad
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda113.jar
version=23.12.0-SNAPSHOT
cudf_version=23.12.0-SNAPSHOT
user=haoyangl
revision=ded091e8777a137b41964c6aad4953421f8ca7ad
branch=detect_multiple_jars
date=2023-11-09T14:00:37Z
url=https://github.com/thirtiseven/spark-rapids.git
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda112.jar
version=23.12.0-SNAPSHOT
cudf_version=23.12.0-SNAPSHOT
user=haoyangl
revision=ded091e8777a137b41964c6aad4953421f8ca7ad
branch=detect_multiple_jars
date=2023-11-09T14:00:37Z
url=https://github.com/thirtiseven/spark-rapids.git
revison: 3dc12e49ffb194eff7dcc0d00cde3d81f17d01e1
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar
version=23.12.0-SNAPSHOT
cudf_version=23.12.0-SNAPSHOT
user=haoyangl
revision=3dc12e49ffb194eff7dcc0d00cde3d81f17d01e1
branch=detect_multiple_jars
date=2023-11-15T07:31:16Z
url=https://github.com/thirtiseven/spark-rapids.git
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/deps/rapids-4-spark-aggregator_2.12-23.12.0-SNAPSHOT-spark341.jar
version=23.12.0-SNAPSHOT
cudf_version=23.12.0-SNAPSHOT
user=haoyangl
revision=3dc12e49ffb194eff7dcc0d00cde3d81f17d01e1
branch=detect_multiple_jars
date=2023-11-15T07:31:16Z
url=https://github.com/thirtiseven/spark-rapids.git

Please make sure there is only one rapids-4-spark jar in the classpath. If it is impossible to fix the classpath you can suppress the error by setting spark.rapids.sql.allowMultipleJars to true.
23/11/15 15:41:05 WARN RapidsPluginUtils: Multiple spark-rapids-jni jars with different revision found in the classpath:
revison: a87a4039372b0b1e0bca596866c5cedbe1e0a845
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda113.jar
version=23.12.0-SNAPSHOT
user=
revision=a87a4039372b0b1e0bca596866c5cedbe1e0a845
branch=float_to_string
date=2023-11-09T01:13:08Z
url=https://github.com/thirtiseven/spark-rapids-jni.git
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda112.jar
version=23.12.0-SNAPSHOT
user=
revision=a87a4039372b0b1e0bca596866c5cedbe1e0a845
branch=float_to_string
date=2023-11-09T01:13:08Z
url=https://github.com/thirtiseven/spark-rapids-jni.git
revison: 9685ff68547b98f81c9b245478bcec2cdfb237e1
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar
version=23.12.0-SNAPSHOT
user=
revision=9685ff68547b98f81c9b245478bcec2cdfb237e1
branch=HEAD
date=2023-11-14T05:13:43Z
url=https://github.com/NVIDIA/spark-rapids-jni.git

Please make sure there is only one spark-rapids-jni jar in the classpath. If it is impossible to fix the classpath you can suppress the error by setting spark.rapids.sql.allowMultipleJars to true.
23/11/15 15:41:05 WARN RapidsPluginUtils: Multiple cudf jars with different revision found in the classpath:
revison: b3e32795bbcb575c526364e77edee3a44aa1ff3f
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda113.jar
version=23.12.0-SNAPSHOT
user=
revision=b3e32795bbcb575c526364e77edee3a44aa1ff3f
branch=HEAD
date=2023-11-09T01:13:08Z
url=https://github.com/rapidsai/cudf.git
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda112.jar
version=23.12.0-SNAPSHOT
user=
revision=b3e32795bbcb575c526364e77edee3a44aa1ff3f
branch=HEAD
date=2023-11-09T01:13:08Z
url=https://github.com/rapidsai/cudf.git
revison: 4313cfa9b3fcff41f67b48ac8797dc015d441ecc
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar
version=23.12.0-SNAPSHOT
user=
revision=4313cfa9b3fcff41f67b48ac8797dc015d441ecc
branch=HEAD
date=2023-11-14T05:13:43Z
url=https://github.com/rapidsai/cudf.git

Please make sure there is only one cudf jar in the classpath. If it is impossible to fix the classpath you can suppress the error by setting spark.rapids.sql.allowMultipleJars to true.
23/11/15 15:41:05 WARN RapidsPluginUtils: RAPIDS Accelerator is enabled, to disable GPU support set `spark.rapids.sql.enabled` to false.
23/11/15 15:41:05 WARN RapidsPluginUtils: spark.rapids.sql.explain is set to `NOT_ON_GPU`. Set it to 'NONE' to suppress the diagnostics logging about the query placement on the GPU.
23/11/15 15:41:11 WARN RapidsPluginUtils: Multiple rapids-4-spark jars with different revision found in the classpath:
revison: ded091e8777a137b41964c6aad4953421f8ca7ad
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda113.jar
version=23.12.0-SNAPSHOT
cudf_version=23.12.0-SNAPSHOT
user=haoyangl
revision=ded091e8777a137b41964c6aad4953421f8ca7ad
branch=detect_multiple_jars
date=2023-11-09T14:00:37Z
url=https://github.com/thirtiseven/spark-rapids.git
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda112.jar
version=23.12.0-SNAPSHOT
cudf_version=23.12.0-SNAPSHOT
user=haoyangl
revision=ded091e8777a137b41964c6aad4953421f8ca7ad
branch=detect_multiple_jars
date=2023-11-09T14:00:37Z
url=https://github.com/thirtiseven/spark-rapids.git
revison: 3dc12e49ffb194eff7dcc0d00cde3d81f17d01e1
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar
version=23.12.0-SNAPSHOT
cudf_version=23.12.0-SNAPSHOT
user=haoyangl
revision=3dc12e49ffb194eff7dcc0d00cde3d81f17d01e1
branch=detect_multiple_jars
date=2023-11-15T07:31:16Z
url=https://github.com/thirtiseven/spark-rapids.git
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/deps/rapids-4-spark-aggregator_2.12-23.12.0-SNAPSHOT-spark341.jar
version=23.12.0-SNAPSHOT
cudf_version=23.12.0-SNAPSHOT
user=haoyangl
revision=3dc12e49ffb194eff7dcc0d00cde3d81f17d01e1
branch=detect_multiple_jars
date=2023-11-15T07:31:16Z
url=https://github.com/thirtiseven/spark-rapids.git

Please make sure there is only one rapids-4-spark jar in the classpath. If it is impossible to fix the classpath you can suppress the error by setting spark.rapids.sql.allowMultipleJars to true.
23/11/15 15:41:11 WARN RapidsPluginUtils: Multiple spark-rapids-jni jars with different revision found in the classpath:
revison: a87a4039372b0b1e0bca596866c5cedbe1e0a845
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda113.jar
version=23.12.0-SNAPSHOT
user=
revision=a87a4039372b0b1e0bca596866c5cedbe1e0a845
branch=float_to_string
date=2023-11-09T01:13:08Z
url=https://github.com/thirtiseven/spark-rapids-jni.git
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda112.jar
version=23.12.0-SNAPSHOT
user=
revision=a87a4039372b0b1e0bca596866c5cedbe1e0a845
branch=float_to_string
date=2023-11-09T01:13:08Z
url=https://github.com/thirtiseven/spark-rapids-jni.git
revison: 9685ff68547b98f81c9b245478bcec2cdfb237e1
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar
version=23.12.0-SNAPSHOT
user=
revision=9685ff68547b98f81c9b245478bcec2cdfb237e1
branch=HEAD
date=2023-11-14T05:13:43Z
url=https://github.com/NVIDIA/spark-rapids-jni.git

Please make sure there is only one spark-rapids-jni jar in the classpath. If it is impossible to fix the classpath you can suppress the error by setting spark.rapids.sql.allowMultipleJars to true.
23/11/15 15:41:11 WARN RapidsPluginUtils: Multiple cudf jars with different revision found in the classpath:
revison: b3e32795bbcb575c526364e77edee3a44aa1ff3f
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda113.jar
version=23.12.0-SNAPSHOT
user=
revision=b3e32795bbcb575c526364e77edee3a44aa1ff3f
branch=HEAD
date=2023-11-09T01:13:08Z
url=https://github.com/rapidsai/cudf.git
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda112.jar
version=23.12.0-SNAPSHOT
user=
revision=b3e32795bbcb575c526364e77edee3a44aa1ff3f
branch=HEAD
date=2023-11-09T01:13:08Z
url=https://github.com/rapidsai/cudf.git
revison: 4313cfa9b3fcff41f67b48ac8797dc015d441ecc
jar URL: jar:file:/home/haoyangl/spark-rapids/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar
version=23.12.0-SNAPSHOT
user=
revision=4313cfa9b3fcff41f67b48ac8797dc015d441ecc
branch=HEAD
date=2023-11-14T05:13:43Z
url=https://github.com/rapidsai/cudf.git

Please make sure there is only one cudf jar in the classpath. If it is impossible to fix the classpath you can suppress the error by setting spark.rapids.sql.allowMultipleJars to true.
Spark context Web UI available at http://dev-machine:4040
Spark context available as 'sc' (master = local[*], app id = local-1700034065117).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.4.1
      /_/

Using Scala version 2.12.17 (OpenJDK 64-Bit Server VM, Java 1.8.0_382)
Type in expressions to have them evaluated.
Type :help for more information.

scala>

Signed-off-by: Haoyang Li <[email protected]>

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

Co-authored-by: Gera Shegalov <[email protected]>

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

Signed-off-by: Haoyang Li <[email protected]>

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

Signed-off-by: Haoyang Li <[email protected]>

gerashegalov

LGTM

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

gerashegalov · 2023-11-09T13:09:10Z

build

Co-authored-by: Gera Shegalov <[email protected]>

thirtiseven · 2023-11-09T13:10:52Z

build

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

Co-authored-by: Gera Shegalov <[email protected]>

thirtiseven · 2023-11-09T14:13:14Z

build

gerashegalov · 2023-11-09T15:12:06Z

The integration tests fail because we add rapids4spark-version-info.properties to all intermediate jars and the final jars including rapids-4-spark-integration-tests jar which we have not published since 21.12

We can deal with whether we should continue adding version info to the intermediate jars separately.

For a fix:

we could check for public classes instead such as "com/nvidia/spark/SQLPlugin.class"
or filter out URLs containing "test"

I lean to the second option

jlowe · 2023-11-09T15:46:21Z

We publish the integration tests jar internally to aid in nightly test runs. Another approach is to make the properties file in the dist artifact unique relative to intermediate jars. Arguably all of the property files should be unique, even across intermediates, since we sometimes decide to publish those intermediates, even if only internally.

gerashegalov · 2023-11-09T17:33:21Z

Another approach is to make the properties file in the dist artifact unique relative to intermediate jars. Arguably all of the property files should be unique, even across intermediates, since we sometimes decide to publish those intermediates, even if only internally.

This sounds good to me. As for the version file generation it is easy to interpolate ${project.artifactId} into the file name. This will slightly complicate the logic, and require code adjustments in the detection logic and otherwise.

Signed-off-by: Haoyang Li <[email protected]>

thirtiseven · 2023-11-22T12:43:33Z

build

gerashegalov · 2023-11-22T18:24:58Z

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

+    conf.allowMultipleJars match {
+      case "ALWAYS" =>
+        logWarning(msg)
+      case "SAME_REVISION" =>
+        require(revisionMap.size == 1, msg)
+      case "NEVER" =>
+        require(revisionMap.size == 1 && revisionMap.values.forall(_.size == 1), msg)
+      case _ => 
+        throw new IllegalArgumentException(s"Invalid value for " +
+          s"${RapidsConf.ALLOW_MULTIPLE_JARS.key}: ${conf.allowMultipleJars}. " +
+          s"Valid values are ALWAYS, SAME_REVISION, NEVER.")
+    }


let us follow the pattern of other configs with an enum. Take a look PARQUET_READER_FOOTER_TYPE and make an auxiliary function translating strings <-> enum.

gerashegalov · 2023-11-22T18:28:25Z

sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsConf.scala

+      "values are ALWAYS: allow all jars, SAME_REVISION: only allow jars with the same " +
+      "revision, NEVER: do not allow multiple jars at all.")
+    .stringConf
+    .createWithDefault("SAME_REVISION")


I still think NEVER should be the default, and SAME_REVISION should only be set in the tests where it is hard to fix right away with an issue filed to get it fixed later.

Ok, will file a follow-on issue on it.

What are the possible issues if there are multiple jars of the same revision on the classpath?

Here are some problems going undetected:

it does not detect when the dist jar is in multiple extraClassPath, $SPARK_HOME/jars, and on the user classpath via packlages and jars. We have seen bugs in such deployments [BUG] RapidsShuffleManager and external packages for Spark Standalone #5796

when you have sql-plugin and aggregator on the classpath at the same time the user might be using an unintended flatbffers dependency, the aggregator relocates flatbuffers to an internal package through the shade plugin to make sure it uses just the one from the aggregator.

It is fine to have a separate issue to sort the right default value out.

Signed-off-by: Haoyang Li <[email protected]>

gerashegalov

LGTM

gerashegalov · 2023-11-23T20:51:59Z

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

+    }
+    val revisionMap: Map[String, Seq[URL]] = possibleRapidsJarURLs.map { url =>
+      val versionInfo = scala.io.Source.fromURL(url).getLines().toSeq
+      lazy val revision = versionInfo


nit: no point in lazy

Suggested change

lazy val revision = versionInfo

val revision = versionInfo

gerashegalov · 2023-11-23T20:55:42Z

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

+        .filter(_.startsWith("revision="))
+        .map(_.split("=").last)


if you define a regex

val revisionRegex = "revision=(.*)".r

you could do it in one shot

Suggested change

.filter(_.startsWith("revision="))

.map(_.split("=").last)

.collect {

case revisionRegex(revision) => revision

}

Signed-off-by: Haoyang Li <[email protected]>

thirtiseven · 2023-11-24T01:36:31Z

build

gerashegalov

At the minimum, retest the functionality manually. Bonus points: add unit tests

NEVER fails for me with a single jar

SPARK_HOME=~/dist/spark-3.3.0-bin-hadoop3 \
  ~/dist/spark-3.3.0-bin-hadoop3/bin/spark-shell \
  --jars dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.allowMultipleJars=NEVER

java.lang.IllegalArgumentException: requirement failed: Multiple rapids-4-spark jars found in the classpath:
revison: f5bd0c958c1afc86a3e7906684809fc62e21d8ab
        jar URL: jar:file:/home/gshegalov/gits/NVIDIA/spark-rapids.worktrees/reviews/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar
        version=23.12.0-SNAPSHOT
        cudf_version=23.12.0-SNAPSHOT
        user=gshegalov
        revision=f5bd0c958c1afc86a3e7906684809fc62e21d8ab
        branch=pr/thirtiseven/9654
        date=2023-11-24T03:03:31Z
        url=https://github.com/NVIDIA/spark-rapids
        jar URL: jar:file:/home/gshegalov/gits/NVIDIA/spark-rapids.worktrees/reviews/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar
        version=23.12.0-SNAPSHOT
        cudf_version=23.12.0-SNAPSHOT
        user=gshegalov
        revision=f5bd0c958c1afc86a3e7906684809fc62e21d8ab
        branch=pr/thirtiseven/9654
        date=2023-11-24T03:03:31Z
        url=https://github.com/NVIDIA/spark-rapids

Signed-off-by: Haoyang Li <[email protected]>

thirtiseven · 2023-11-24T05:56:16Z

At the minimum, retest the functionality manually. Bonus points: add unit tests

NEVER fails for me with a single jar

SPARK_HOME=~/dist/spark-3.3.0-bin-hadoop3 \
  ~/dist/spark-3.3.0-bin-hadoop3/bin/spark-shell \
  --jars dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.allowMultipleJars=NEVER

java.lang.IllegalArgumentException: requirement failed: Multiple rapids-4-spark jars found in the classpath:
revison: f5bd0c958c1afc86a3e7906684809fc62e21d8ab
        jar URL: jar:file:/home/gshegalov/gits/NVIDIA/spark-rapids.worktrees/reviews/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar
        version=23.12.0-SNAPSHOT
        cudf_version=23.12.0-SNAPSHOT
        user=gshegalov
        revision=f5bd0c958c1afc86a3e7906684809fc62e21d8ab
        branch=pr/thirtiseven/9654
        date=2023-11-24T03:03:31Z
        url=https://github.com/NVIDIA/spark-rapids
        jar URL: jar:file:/home/gshegalov/gits/NVIDIA/spark-rapids.worktrees/reviews/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar
        version=23.12.0-SNAPSHOT
        cudf_version=23.12.0-SNAPSHOT
        user=gshegalov
        revision=f5bd0c958c1afc86a3e7906684809fc62e21d8ab
        branch=pr/thirtiseven/9654
        date=2023-11-24T03:03:31Z
        url=https://github.com/NVIDIA/spark-rapids

It's weird, I can't reproduce this with 330/341 and with or without upmerge.

gerashegalov · 2023-11-24T19:35:20Z

It's weird, I can't reproduce this with 330/341 and with or without upmerge.

@thirtiseven it looks like a single shim jar does not reproduce the issue indeed.

my reproducer involves a two-shim jar built with:

./build/buildall --profile=330,350

on your PR branch

the problem is related to the fact that we are currently sloppy with the inclusion of the properties files

7d11a4ff004e53dac97819a387007b2497e2819e -> Vector(
  jar:file:/home/gshegalov/gits/NVIDIA/spark-rapids.worktrees/reviews/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar!/rapids4spark-version-info.properties, 
  jar:file:/home/gshegalov/gits/NVIDIA/spark-rapids.worktrees/reviews/dist/target/rapids-4-spark_2.12-23.12.0-SNAPSHOT-cuda11.jar!/spark330/rapids4spark-version-info.properties
)

Let us consider only the files stored under !/, and ignore subdirs for simplicity

Signed-off-by: Haoyang Li <[email protected]>

thirtiseven · 2023-11-27T07:21:09Z

@gerashegalov Thank you! I can reproduce now.
Just updated the code to filter out subdirs' files and submodules and tested again. Trying to add UT

gerashegalov · 2023-11-27T20:45:16Z

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

+    val possibleRapidsJarURLs = classloader.getResources(propName).asScala.toSet.toSeq.filter {
+      url => {
+        val urlPath = url.toString
+        // filter out submodule jars and files stored under subdirs of '!/'
+        !urlPath.contains("rapids-4-spark-") && urlPath.contains("!/" + propName)
+      }
+    }


Not too easy to follow but I get it now. In the comment can you add an example of the submodule vs main dist artifact you are looking for.

However, now it looks like we NEVER will never find out where we put submodule jars for tests by accident?

However, now it looks like we NEVER will never find out where we put submodule jars for tests by accident?

How about adding another config mode like RECURISIVE to detect all jars on classpath and print a warning log? Just need to skip this filter under that mode.

What do you mean by RECURSIVE?

if RECURSIVE means going down a directory level inside a jar file, I think we already established that we should only look at the root entries
ending with "!/" + propName

if RECURSIVE means including the check to all sub-module artitfact id prefixed with rapids-4-spark- I realize that we do not have a good story now that change of the version info file name to contain the artifact Id is not part of the PR. Let us defer this to the followup issue you were going to file.

if RECURSIVE means including the check to all sub-module artitfact id prefixed with rapids-4-spark- I realize that we do not have a good story now that change of the version info file name to contain the artifact Id is not part of the PR. Let us defer this to the followup issue you were going to file.

Yes, I mean this way. I will add it to the follow-up issue too.

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

Co-authored-by: Gera Shegalov <[email protected]>

Signed-off-by: Haoyang Li <[email protected]>

gerashegalov

LGTM, thanks for working through the review issues.

I believe we have a very useful MVP already, and it can be improved in the follow-up PRs.

thirtiseven · 2023-11-28T03:26:10Z

Thanks for your patient review and useful suggestions @gerashegalov

thirtiseven · 2023-11-28T03:26:14Z

build

thirtiseven · 2023-11-28T09:16:35Z

Follow on issue: #9870

Detect multiple jars on the classpath when init plugin

2f1fae2

Signed-off-by: Haoyang Li <[email protected]>

thirtiseven self-assigned this Nov 7, 2023

thirtiseven requested review from gerashegalov and revans2 November 7, 2023 08:29

clean up

0c5df62

Signed-off-by: Haoyang Li <[email protected]>

sameerz added the reliability Features to improve reliability or bugs that severly impact the reliability of the plugin label Nov 7, 2023

gerashegalov reviewed Nov 7, 2023

View reviewed changes

Apply suggestions from code review

742b508

Co-authored-by: Gera Shegalov <[email protected]>

gerashegalov reviewed Nov 8, 2023

View reviewed changes

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala Outdated Show resolved Hide resolved

thirtiseven added 4 commits November 8, 2023 16:00

print version info and also check jni/cudf

47ef387

Signed-off-by: Haoyang Li <[email protected]>

print version info and also check jni/cudf

171c016

Signed-off-by: Haoyang Li <[email protected]>

add config for allowing multiple jars

e29e37d

Signed-off-by: Haoyang Li <[email protected]>

keep jar path in error messages

0441f53

Signed-off-by: Haoyang Li <[email protected]>

gerashegalov reviewed Nov 8, 2023

View reviewed changes

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala Outdated Show resolved Hide resolved

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala Outdated Show resolved Hide resolved

address comments

ded091e

Signed-off-by: Haoyang Li <[email protected]>

gerashegalov previously approved these changes Nov 9, 2023

View reviewed changes

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala Outdated Show resolved Hide resolved

gerashegalov changed the title ~~Detect multiple jars on the classpath when init plugin~~ Detect multiple jars on the classpath when init plugin [databricks] Nov 9, 2023

Update sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

42fc474

Co-authored-by: Gera Shegalov <[email protected]>

thirtiseven dismissed gerashegalov’s stale review via 42fc474 November 9, 2023 13:10

gerashegalov requested changes Nov 9, 2023

View reviewed changes

sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala Outdated Show resolved Hide resolved

Update sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

6bbedfb

Co-authored-by: Gera Shegalov <[email protected]>

gerashegalov previously approved these changes Nov 9, 2023

View reviewed changes

gerashegalov self-requested a review November 9, 2023 18:40

address comments

d9264d8

Signed-off-by: Haoyang Li <[email protected]>

thirtiseven requested a review from gerashegalov November 22, 2023 13:01

gerashegalov requested changes Nov 22, 2023

View reviewed changes

thirtiseven added 2 commits November 23, 2023 11:09

check values for config

c7ca0ad

Signed-off-by: Haoyang Li <[email protected]>

use enum

a833c73

Signed-off-by: Haoyang Li <[email protected]>

thirtiseven requested a review from gerashegalov November 23, 2023 03:29

gerashegalov previously approved these changes Nov 23, 2023

View reviewed changes

fix two nits

f5bd0c9

Signed-off-by: Haoyang Li <[email protected]>

thirtiseven dismissed gerashegalov’s stale review via f5bd0c9 November 24, 2023 01:36

gerashegalov requested changes Nov 24, 2023

View reviewed changes

thirtiseven added 2 commits November 24, 2023 13:18

Merge branch 'branch-23.12' into detect_multiple_jars

405bf05

Do not print log if no multiple jar

7d11a4f

Signed-off-by: Haoyang Li <[email protected]>

thirtiseven changed the base branch from branch-23.12 to branch-24.02 November 27, 2023 04:26

ignore subdir when checking multiple jars

c4eef25

Signed-off-by: Haoyang Li <[email protected]>

gerashegalov reviewed Nov 27, 2023

View reviewed changes

thirtiseven and others added 4 commits November 28, 2023 09:51

Update sql-plugin/src/main/scala/com/nvidia/spark/rapids/Plugin.scala

bf4477a

Co-authored-by: Gera Shegalov <[email protected]>

wip ut

39e0b56

address comment

c3279f4

Signed-off-by: Haoyang Li <[email protected]>

Merge branch 'branch-24.02' into detect_multiple_jars

70b1247

gerashegalov approved these changes Nov 28, 2023

View reviewed changes

thirtiseven merged commit 6f96048 into NVIDIA:branch-24.02 Nov 28, 2023
36 checks passed

thirtiseven mentioned this pull request Nov 28, 2023

[FEA] Set spark.rapids.sql.allowMultipleJars to NEVER by default #9870

Open

gerashegalov mentioned this pull request Dec 1, 2023

[BUG] Unable to use Spark Rapids with Spark Thrift Server #9867

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Detect multiple jars on the classpath when init plugin [databricks] #9654

Detect multiple jars on the classpath when init plugin [databricks] #9654

thirtiseven commented Nov 7, 2023 •

edited

Loading

gerashegalov left a comment

gerashegalov commented Nov 9, 2023

thirtiseven commented Nov 9, 2023

thirtiseven commented Nov 9, 2023

gerashegalov commented Nov 9, 2023

jlowe commented Nov 9, 2023

gerashegalov commented Nov 9, 2023

thirtiseven commented Nov 22, 2023

gerashegalov Nov 22, 2023

thirtiseven Nov 23, 2023

gerashegalov Nov 22, 2023

thirtiseven Nov 23, 2023

gerashegalov Nov 23, 2023 •

edited

Loading

gerashegalov left a comment

gerashegalov Nov 23, 2023

gerashegalov Nov 23, 2023

thirtiseven commented Nov 24, 2023

gerashegalov left a comment

thirtiseven commented Nov 24, 2023

gerashegalov commented Nov 24, 2023 •

edited

Loading

thirtiseven commented Nov 27, 2023

gerashegalov Nov 27, 2023

thirtiseven Nov 28, 2023 •

edited

Loading

gerashegalov Nov 28, 2023

thirtiseven Nov 28, 2023

gerashegalov left a comment

thirtiseven commented Nov 28, 2023

thirtiseven commented Nov 28, 2023

thirtiseven commented Nov 28, 2023

Detect multiple jars on the classpath when init plugin [databricks] #9654

Detect multiple jars on the classpath when init plugin [databricks] #9654

Conversation

thirtiseven commented Nov 7, 2023 • edited Loading

gerashegalov left a comment

Choose a reason for hiding this comment

gerashegalov commented Nov 9, 2023

thirtiseven commented Nov 9, 2023

thirtiseven commented Nov 9, 2023

gerashegalov commented Nov 9, 2023

jlowe commented Nov 9, 2023

gerashegalov commented Nov 9, 2023

thirtiseven commented Nov 22, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gerashegalov Nov 23, 2023 • edited Loading

Choose a reason for hiding this comment

gerashegalov left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

thirtiseven commented Nov 24, 2023

gerashegalov left a comment

Choose a reason for hiding this comment

thirtiseven commented Nov 24, 2023

gerashegalov commented Nov 24, 2023 • edited Loading

thirtiseven commented Nov 27, 2023

Choose a reason for hiding this comment

thirtiseven Nov 28, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gerashegalov left a comment

Choose a reason for hiding this comment

thirtiseven commented Nov 28, 2023

thirtiseven commented Nov 28, 2023

thirtiseven commented Nov 28, 2023

thirtiseven commented Nov 7, 2023 •

edited

Loading

gerashegalov Nov 23, 2023 •

edited

Loading

gerashegalov commented Nov 24, 2023 •

edited

Loading

thirtiseven Nov 28, 2023 •

edited

Loading