cuML dask fixes to unblock CI (#6170)
This PR includes the commits from #6166 so that CI can run while that PR is being merged.

Dask nightlies break a few things in cuML, tracked mainly in two issues: #6168 and #6169.

Closes #6166.

Authors:
  - Dante Gama Dessavre (https://github.com/dantegd)
  - Bradley Dice (https://github.com/bdice)
  - Divye Gala (https://github.com/divyegala)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #6170
dantegd authored Dec 11, 2024
1 parent 76b213b commit 858bd9f
Showing 7 changed files with 20 additions and 14 deletions.
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-118_arch-x86_64.yaml
@@ -9,7 +9,7 @@ channels:
 dependencies:
 - c-compiler
 - cmake>=3.26.4,!=3.30.0
-- cuda-python>=11.7.1,<12.0a0,<=11.8.3
+- cuda-python>=11.8.5,<12.0a0
 - cuda-version=11.8
 - cudatoolkit
 - cudf==25.2.*,>=0.0.0a0
2 changes: 1 addition & 1 deletion conda/environments/all_cuda-125_arch-x86_64.yaml
@@ -12,7 +12,7 @@ dependencies:
 - cuda-cudart-dev
 - cuda-nvcc
 - cuda-profiler-api
-- cuda-python>=12.0,<13.0a0,<=12.6.0
+- cuda-python>=12.6.2,<13.0a0
 - cuda-version=12.5
 - cudf==25.2.*,>=0.0.0a0
 - cupy>=12.0.0
8 changes: 4 additions & 4 deletions conda/recipes/cuml/meta.yaml
@@ -58,10 +58,10 @@ requirements:
 - cuda-version ={{ cuda_version }}
 {% if cuda_major == "11" %}
 - cudatoolkit
-- cuda-python >=11.7.1,<12.0a0,<=11.8.3
+- cuda-python >=11.8.5,<12.0a0
 {% else %}
 - cuda-cudart-dev
-- cuda-python >=12.0,<13.0a0,<=12.6.0
+- cuda-python >=12.6.2,<13.0a0
 {% endif %}
 - cudf ={{ minor_version }}
 - cython >=3.0.0
@@ -77,10 +77,10 @@ requirements:
 - {{ pin_compatible('cuda-version', max_pin='x', min_pin='x') }}
 {% if cuda_major == "11" %}
 - cudatoolkit
-- cuda-python >=11.7.1,<12.0a0,<=11.8.3
+- cuda-python >=11.8.5,<12.0a0
 {% else %}
 - cuda-cudart
-- cuda-python >=12.0,<13.0a0,<=12.6.0
+- cuda-python >=12.6.2,<13.0a0
 {% endif %}
 - cudf ={{ minor_version }}
 - cupy >=12.0.0
4 changes: 2 additions & 2 deletions dependencies.yaml
@@ -198,11 +198,11 @@ dependencies:
 - matrix:
     cuda: "12.*"
   packages:
-    - cuda-python>=12.0,<13.0a0,<=12.6.0
+    - cuda-python>=12.6.2,<13.0a0
 - matrix:
     cuda: "11.*"
   packages:
-    - cuda-python>=11.7.1,<12.0a0,<=11.8.3
+    - cuda-python>=11.8.5,<12.0a0
 - matrix:
   packages:
     - cuda-python
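The pin changes above replace an upper cap (`<=11.8.3`, `<=12.6.0`) with a higher lower bound (`>=11.8.5`, `>=12.6.2`), so the old and new ranges do not overlap. A minimal sketch of that comparison follows. The helper names `release` and `satisfies` are illustrative and not part of cuML or any packaging tool, and pre-release suffixes such as `a0` are simply dropped, which is a simplification of what conda or pip actually do.

```python
import operator
import re

# Comparison operators that appear in the specs from this commit.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
}


def release(version):
    """Return the numeric release part as a 3-tuple, e.g. '12.0a0' -> (12, 0, 0).

    Simplification: the pre-release suffix is ignored.
    """
    nums = re.match(r"\d+(?:\.\d+)*", version).group(0).split(".")
    parts = [int(n) for n in nums]
    return tuple(parts + [0] * (3 - len(parts)))


def satisfies(version, spec):
    """Check a version against a comma-separated spec such as '>=11.8.5,<12.0a0'."""
    for clause in spec.split(","):
        op, bound = re.match(r"(>=|<=|==|<|>)(.+)", clause.strip()).groups()
        if not OPS[op](release(version), release(bound)):
            return False
    return True


# Specs copied from the diff above.
OLD_CUDA11 = ">=11.7.1,<12.0a0,<=11.8.3"
NEW_CUDA11 = ">=11.8.5,<12.0a0"
OLD_CUDA12 = ">=12.0,<13.0a0,<=12.6.0"
NEW_CUDA12 = ">=12.6.2,<13.0a0"

if __name__ == "__main__":
    for v in ["11.8.3", "11.8.5", "12.6.0", "12.6.1", "12.6.2"]:
        print(
            v,
            "old11:", satisfies(v, OLD_CUDA11),
            "new11:", satisfies(v, NEW_CUDA11),
            "old12:", satisfies(v, OLD_CUDA12),
            "new12:", satisfies(v, NEW_CUDA12),
        )
```

Under these assumptions, a version accepted by the old pin is rejected by the new one and vice versa, so environments built before and after this commit resolve to different cuda-python releases.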
2 changes: 0 additions & 2 deletions python/cuml/cuml/dask/neighbors/kneighbors_classifier.py
@@ -114,8 +114,6 @@ def fit(self, X, y):
 # Dask-expr does not support numerical column names
 # See: https://github.com/dask/dask-expr/issues/1015
 _y = y
-if hasattr(y, "to_legacy_dataframe"):
-    _y = y.to_legacy_dataframe()
 n_targets = len(_y.columns)
 for i in range(n_targets):
     uniq_labels.append(_y.iloc[:, i].unique())
6 changes: 5 additions & 1 deletion python/cuml/cuml/tests/dask/test_dask_arr_utils.py
@@ -1,4 +1,4 @@
-# Copyright (c) 2020-2023, NVIDIA CORPORATION.
+# Copyright (c) 2020-2024, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -50,6 +50,10 @@ def test_to_sparse_dask_array(input_type, nrows, ncols, client):

 a = cupyx.scipy.sparse.random(nrows, ncols, format="csr", dtype=cp.float32)
 if input_type == "dask_dataframe":
+    pytest.xfail(
+        reason="Dask nightlies break task fusing for this, "
+        "issue https://github.com/rapidsai/cuml/issues/6169"
+    )
     df = cudf.DataFrame(a.todense())
     inp = dask_cudf.from_cudf(df, npartitions=2)
 elif input_type == "dask_array":
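The change above calls `pytest.xfail` imperatively inside one branch of a parametrized test, so only the `dask_dataframe` case is marked as an expected failure while the other input types still run. A minimal sketch of that pattern, assuming pytest is installed, is shown here. `build_input` and its return strings are hypothetical stand-ins and do not come from the cuML test suite.

```python
import pytest


def build_input(input_type):
    """Hypothetical stand-in for the input-construction step of a test."""
    if input_type == "dask_dataframe":
        # Imperative xfail: raises immediately, so the rest of this
        # branch is skipped and the test is reported as xfailed.
        pytest.xfail(reason="known upstream breakage for this input type")
    return f"input built for {input_type}"


@pytest.mark.parametrize("input_type", ["dask_dataframe", "dask_array"])
def test_build_input(input_type):
    # Under pytest, the "dask_dataframe" case is reported as xfailed
    # and the "dask_array" case runs normally.
    assert build_input(input_type).endswith(input_type)
```

Because the call raises as soon as it is reached, this form never reports an unexpected pass once the upstream issue is fixed; the `xfail` call has to be removed by hand, which is presumably why the reason string links the tracking issue.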
10 changes: 7 additions & 3 deletions python/cuml/cuml/tests/dask/test_dask_logistic_regression.py
@@ -80,9 +80,13 @@ def cal_chunks(dataset, n_partitions):
 X_da = da.from_array(X_train, chunks=(target_chunk_sizes, -1))
 y_da = da.from_array(y_train, chunks=target_chunk_sizes)

-X_da, y_da = dask_utils.persist_across_workers(
-    c, [X_da, y_da], workers=workers
-)
+# todo (dgd): Dask nightly packages break persisting
+# sparse arrays before using them.
+# https://github.com/rapidsai/cuml/issues/6168
+
+# X_da, y_da = dask_utils.persist_across_workers(
+#     c, [X_da, y_da], workers=workers
+# )
 return X_da, y_da