Decouple from shared-workflows #2

Merged: 33 commits, May 1, 2024
Changes from 23 commits
Commits
86fe486  Copy in relevant workflows (vyasr, Apr 29, 2024)
e64d339  First typo (vyasr, Apr 29, 2024)
e9739d6  Pass through jq (vyasr, Apr 29, 2024)
19d4466  Fix outputs (vyasr, Apr 29, 2024)
d76f438  Specify a node type (vyasr, Apr 29, 2024)
bf6932d  Remove AWS and sccache since builds are very fast anyway (vyasr, Apr 29, 2024)
a732600  Set build type (vyasr, Apr 29, 2024)
4b0ef0f  Set build type (vyasr, Apr 29, 2024)
2cec740  Revert "Remove AWS and sccache since builds are very fast anyway" (vyasr, Apr 29, 2024)
f7c04ff  Add permissions (vyasr, Apr 30, 2024)
928737c  Fix matrix computation for tests (vyasr, Apr 30, 2024)
e0a59f5  Remove invalid options (vyasr, Apr 30, 2024)
a1dea41  Clean up unnecessary bits (vyasr, Apr 30, 2024)
33483ac  Also stop specifying the repository (vyasr, Apr 30, 2024)
a330f46  Add back token (vyasr, Apr 30, 2024)
1ed65f9  Handle everything up to the build type (vyasr, Apr 30, 2024)
9ca047d  Move to a workflow dispatch pattern (vyasr, Apr 30, 2024)
9448d1b  Add print to test for validation (vyasr, Apr 30, 2024)
1faf4a8  Add publish step, but disable (vyasr, Apr 30, 2024)
5eae321  Fix the dependency list (vyasr, Apr 30, 2024)
9751735  Set package name (vyasr, Apr 30, 2024)
aadc161  Add build/test prefixes to job names (vyasr, Apr 30, 2024)
bc712f4  Temporarily enable publication and try out James's new tool (vyasr, Apr 30, 2024)
3d0a9a8  Address PR feedback (vyasr, May 1, 2024)
8805326  Increase duration (vyasr, May 1, 2024)
c58a10f  Fix matrix job name (vyasr, May 1, 2024)
66ca84c  Try introducing intermediates (vyasr, May 1, 2024)
34eb8fd  Revert "Try introducing intermediates" (vyasr, May 1, 2024)
12533b0  Try simplifying to match the build logic (vyasr, May 1, 2024)
af59154  Explicitly add dependency (vyasr, May 1, 2024)
b84b8ee  Try fixing publish logic and temporarily remove tests (vyasr, May 1, 2024)
012f25a  Revert testing changes (vyasr, May 1, 2024)
90fac24  Fix typo (vyasr, May 1, 2024)
202 changes: 202 additions & 0 deletions .github/workflows/build_and_test.yaml
@@ -0,0 +1,202 @@
name: build_and_test

on:
  workflow_call:
    inputs:
      build_type:
        required: true
        type: string

permissions:
  actions: read
  checks: none
  contents: read
  deployments: none
  discussions: none
  id-token: write
  issues: none
  packages: read
  pages: none
  pull-requests: read
  repository-projects: none
  security-events: none
  statuses: none

jobs:
  compute-build-matrix:
    runs-on: ubuntu-latest
    outputs:
      MATRIX: ${{ steps.compute-matrix.outputs.MATRIX }}
    steps:
      - name: Compute Build Matrix
        id: compute-matrix
        run: |
          set -eo pipefail

          export MATRIX="
          # amd64
          - { ARCH: 'amd64', PY_VER: '3.11', CUDA_VER: '11.8.0', LINUX_VER: 'rockylinux8' }
          - { ARCH: 'amd64', PY_VER: '3.11', CUDA_VER: '12.2.2', LINUX_VER: 'rockylinux8' }
          # arm64
          - { ARCH: 'arm64', PY_VER: '3.11', CUDA_VER: '11.8.0', LINUX_VER: 'rockylinux8' }
          - { ARCH: 'arm64', PY_VER: '3.11', CUDA_VER: '12.2.2', LINUX_VER: 'rockylinux8' }
          "

          MATRIX="$(
            yq -n -o json 'env(MATRIX)' | \
            jq -c '{include: .}'
          )"

          echo "MATRIX=${MATRIX}" | tee --append "${GITHUB_OUTPUT}"
  build:
    name: build-${{ matrix.CUDA_VER }}, ${{ matrix.ARCH }}, ${{ matrix.LINUX_VER }}
    needs: compute-build-matrix
    strategy:
      matrix: ${{ fromJSON(needs.compute-build-matrix.outputs.MATRIX) }}
    runs-on: "linux-${{ matrix.ARCH }}-cpu16"
    container:
      image: "rapidsai/ci-wheel:cuda${{ matrix.CUDA_VER }}-${{ matrix.LINUX_VER }}-py${{ matrix.PY_VER }}"
      env:
        RAPIDS_BUILD_TYPE: ${{ inputs.build_type }}
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_ROLE_ARN }}
          aws-region: ${{ vars.AWS_REGION }}
          role-duration-seconds: 43200 # 12h
      - name: checkout code repo
        uses: actions/checkout@v4
        with:
          persist-credentials: false
      - name: Get current date
        id: date
        run: |
          echo "CURRENT_DATE=$(date --rfc-3339=date)" >> ${GITHUB_ENV}
      - name: Standardize repository information
        uses: rapidsai/shared-actions/rapids-github-info@main
        with:
          repo: ${{ github.repository }}
          branch: ${{ github.ref_name }}
          date: ${{ env.CURRENT_DATE }}
          sha: ${{ github.sha }}
      - name: Build and repair the wheel
        run: ci/build_wheel.sh
        env:
          GH_TOKEN: ${{ github.token }}
          RAPIDS_BUILD_TYPE: ${{ inputs.build_type }}
        # Use a shell that loads the rc file so that we get the compiler settings
        shell: bash -leo pipefail {0}
  compute-test-matrix:
    needs: build
    runs-on: ubuntu-latest
    outputs:
      MATRIX: ${{ steps.compute-matrix.outputs.MATRIX }}
    steps:
      - name: Compute test matrix
        id: compute-matrix
        run: |
          set -eo pipefail

          # please keep the matrices sorted in ascending order by the following:
          #
          #     [ARCH, PY_VER, CUDA_VER, LINUX_VER, GPU, DRIVER]
          #
          export MATRICES="
          # amd64
          - { ARCH: 'amd64', PY_VER: '3.11', CUDA_VER: '11.8.0', LINUX_VER: 'ubuntu20.04', gpu: 'a100', driver: 'latest' }
          - { ARCH: 'amd64', PY_VER: '3.11', CUDA_VER: '12.0.1', LINUX_VER: 'ubuntu22.04', gpu: 'a100', driver: 'latest' }
          - { ARCH: 'amd64', PY_VER: '3.11', CUDA_VER: '12.2.2', LINUX_VER: 'ubuntu20.04', gpu: 'a100', driver: 'latest' }
          # arm64
          - { ARCH: 'arm64', PY_VER: '3.11', CUDA_VER: '11.8.0', LINUX_VER: 'ubuntu22.04', gpu: 'a100', driver: 'latest' }
          - { ARCH: 'arm64', PY_VER: '3.11', CUDA_VER: '12.0.1', LINUX_VER: 'ubuntu20.04', gpu: 'a100', driver: 'latest' }
          - { ARCH: 'arm64', PY_VER: '3.11', CUDA_VER: '12.2.2', LINUX_VER: 'ubuntu22.04', gpu: 'a100', driver: 'latest' }
          "

          TEST_MATRIX=$(yq -n 'env(MATRICES)')
          export TEST_MATRIX

          MATRIX="$(
            yq -n -o json 'env(TEST_MATRIX)' | \
            jq -c '{include: .}'
          )"

          echo "MATRIX=${MATRIX}" | tee --append "${GITHUB_OUTPUT}"
  test:
    name: test-${{ matrix.CUDA_VER }}, ${{ matrix.ARCH }}, ${{ matrix.LINUX_VER }}, ${{ matrix.gpu }}
    needs: compute-test-matrix
    strategy:
      matrix: ${{ fromJSON(needs.compute-test-matrix.outputs.MATRIX) }}
    runs-on: "linux-${{ matrix.ARCH }}-gpu-${{ matrix.gpu }}-${{ matrix.driver }}-1"
    container:
      image: "rapidsai/citestwheel:cuda${{ matrix.CUDA_VER }}-${{ matrix.LINUX_VER }}-py${{ matrix.PY_VER }}"
      env:
        NVIDIA_VISIBLE_DEVICES: ${{ env.NVIDIA_VISIBLE_DEVICES }} # GPU jobs must set this container env variable
        RAPIDS_BUILD_TYPE: ${{ inputs.build_type }}
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_ROLE_ARN }}
          aws-region: ${{ vars.AWS_REGION }}
          role-duration-seconds: 43200 # 12h
      - name: Run nvidia-smi to make sure GPU is working
        run: nvidia-smi
      - name: checkout code repo
        uses: actions/checkout@v4
        with:
          persist-credentials: false
      - name: Get current date
        id: date
        run: |
          echo "CURRENT_DATE=$(date --rfc-3339=date)" >> ${GITHUB_ENV}
      - name: Standardize repository information
        uses: rapidsai/shared-actions/rapids-github-info@main
        with:
          repo: ${{ github.repository }}
          branch: ${{ github.ref_name }}
          date: ${{ env.CURRENT_DATE }}
          sha: ${{ github.sha }}
      - name: Run tests
        run: ci/test_wheel.sh
        env:
          GH_TOKEN: ${{ github.token }}
          RAPIDS_BUILD_TYPE: ${{ inputs.build_type }}
  publish:
    #if: ${{ inputs.build_type == 'branch' }}
    needs: test
    runs-on: linux-amd64-cpu4
    container:
      image: "rapidsai/ci-wheel:latest"
      env:
        RAPIDS_BUILD_TYPE: ${{ inputs.build_type }}
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ vars.AWS_ROLE_ARN }}
          aws-region: ${{ vars.AWS_REGION }}
          role-duration-seconds: 43200 # 12h
      - name: checkout code repo
        uses: actions/checkout@v4
        with:
          persist-credentials: false
      - name: Get current date
        id: date
        run: |
          echo "CURRENT_DATE=$(date --rfc-3339=date)" >> ${GITHUB_ENV}
      - name: Standardize repository information
        uses: rapidsai/shared-actions/rapids-github-info@main
        with:
          repo: ${{ github.repository }}
          branch: ${{ github.ref_name }}
          date: ${{ env.CURRENT_DATE }}
          sha: ${{ github.sha }}
      - name: Download wheels from downloads.rapids.ai and publish to anaconda repository
        env:
          RAPIDS_CONDA_TOKEN: ${{ secrets.CONDA_RAPIDSAI_WHEELS_NIGHTLY_TOKEN }}
        # TODO: This won't currently work because the tool only supports Python
        # wheels. That will require an update.
        run: |
          if [[ ! -d "/tmp/gha-tools" ]]; then
            git clone https://github.com/jameslamb/gha-tools.git -b cpp-wheels /tmp/gha-tools
            export PATH="/tmp/gha-tools/tools:${PATH}"
          fi
          rapids-wheels-anaconda ucx cpp
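
Note on the `Get current date` steps: a value appended to `${GITHUB_ENV}` becomes an environment variable for later steps, while `steps.<id>.outputs.<name>` is filled only from lines appended to `${GITHUB_OUTPUT}`. The sketch below imitates both files with temporary files so that it can run outside a runner. The `lookup` helper and the temp-file stand-ins are assumptions made for illustration and are not part of this PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-ins for the files a GitHub Actions runner provides to each step
# (assumption: outside a runner we substitute temporary files).
GITHUB_ENV="$(mktemp)"
GITHUB_OUTPUT="$(mktemp)"

# Same YYYY-MM-DD format as `date --rfc-3339=date`, written portably.
today="$(date +%Y-%m-%d)"

# Later steps would see this as the environment variable CURRENT_DATE.
echo "CURRENT_DATE=${today}" >> "${GITHUB_ENV}"

# Later steps would see this as steps.<id>.outputs.date.
echo "date=${today}" >> "${GITHUB_OUTPUT}"

# Hypothetical helper: print the value stored for a key in a key=value file.
lookup() {
  local key="$1" file="$2"
  sed -n "s/^${key}=//p" "${file}"
}

env_value="$(lookup CURRENT_DATE "${GITHUB_ENV}")"
output_value="$(lookup date "${GITHUB_OUTPUT}")"
# Nothing wrote a `date=` line to the env file, so this lookup is empty.
missing_value="$(lookup date "${GITHUB_ENV}")"

echo "env=${env_value} output=${output_value} missing='${missing_value}'"
```

A step that writes only to `${GITHUB_ENV}` therefore leaves its `outputs` empty, so a consumer has to read the value through the `env` context instead.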
24 changes: 3 additions & 21 deletions .github/workflows/pr.yaml
@@ -9,27 +9,9 @@ concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
   cancel-in-progress: true
 
-# TODO: I would love to not need RAPIDS shared workflows for these builds, but
-# for getting things stood up quickly that's the fastest route.
 
 jobs:
-  pr-builder:
-    needs:
-      - wheel-build-ucx
-      - wheel-tests-ucx
-    secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/[email protected]
-  wheel-build-ucx:
-    secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/[email protected]
-    with:
-      matrix_filter: group_by([.ARCH, (.CUDA_VER|split(".")|map(tonumber)|.[0])]) | map(max_by(.PY_VER|split(".")|map(tonumber)))
-      build_type: pull-request
-      script: ci/build_wheel.sh
-  wheel-tests-ucx:
-    needs: wheel-build-ucx
-    secrets: inherit
-    uses: rapidsai/shared-workflows/.github/workflows/[email protected]
+  build_and_test:
+    uses: ./.github/workflows/build_and_test.yaml
     with:
-      matrix_filter: group_by([.ARCH, (.CUDA_VER|split(".")|map(tonumber)|.[0])]) | map(max_by(.PY_VER|split(".")|map(tonumber)))
       build_type: pull-request
-      script: ci/test_wheel.sh
2 changes: 1 addition & 1 deletion ci/build_wheel.sh
@@ -17,4 +17,4 @@ sed -i -E "s/^name = \"${package_name}(.*)?\"$/name = \"${package_name}${PACKAGE
 python -m pip wheel "${package_dir}"/ -w "${package_dir}"/dist -vvv --no-deps --disable-pip-version-check
 
 python -m auditwheel repair -w ${package_dir}/final_dist --exclude "libcuda.so.1" --exclude "libnvidia-ml.so.1" --exclude "libucm.so.0" --exclude "libuct.so.0" --exclude "libucs.so.0" --exclude "libucp.so.0" ${package_dir}/dist/*
-RAPIDS_PY_WHEEL_NAME="${RAPIDS_PY_CUDA_SUFFIX}" rapids-upload-wheels-to-s3 cpp ${package_dir}/final_dist
+RAPIDS_PY_WHEEL_NAME="ucx_${RAPIDS_PY_CUDA_SUFFIX}" rapids-upload-wheels-to-s3 cpp ${package_dir}/final_dist
4 changes: 2 additions & 2 deletions ci/test_wheel.sh
@@ -7,6 +7,6 @@ package_name="libucx"
 
 WHEELHOUSE="${PWD}/dist/"
 RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
-RAPIDS_PY_WHEEL_NAME="${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 cpp "${WHEELHOUSE}"
+RAPIDS_PY_WHEEL_NAME="ucx_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 cpp "${WHEELHOUSE}"
 python -m pip install "${package_name}-${RAPIDS_PY_CUDA_SUFFIX}" --find-links "${WHEELHOUSE}"
-python -c "import libucx; libucx.load_library()"
+python -c "import libucx; libucx.load_library(); print('Loaded libucx libraries successfully!')"
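
The `ucx_` prefix is applied in both `ci/build_wheel.sh` and `ci/test_wheel.sh`, presumably because the test job can only find the wheel if it asks for the same `RAPIDS_PY_WHEEL_NAME` that the build job uploaded under. A minimal sketch of that pairing follows. It assumes `rapids-wheel-ctk-name-gen` yields a suffix such as `cu12` from the CUDA major version; `cuda_suffix` and `wheel_name` are hypothetical stand-ins, not the real gha-tools commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for rapids-wheel-ctk-name-gen (assumed behavior):
# keep only the CUDA major version, e.g. 12.2.2 -> cu12.
cuda_suffix() {
  local cuda_version="$1"
  echo "cu${cuda_version%%.*}"
}

# Hypothetical helper: the value both scripts assign to RAPIDS_PY_WHEEL_NAME.
wheel_name() {
  echo "ucx_$(cuda_suffix "$1")"
}

upload_name="$(wheel_name 12.2.2)"    # name the build side would upload under
download_name="$(wheel_name 12.2.2)"  # name the test side would download with
old_name="$(cuda_suffix 12.2.2)"      # name before this change (suffix only)

echo "upload=${upload_name} download=${download_name} old=${old_name}"
```

Deriving both names from one function makes it harder for the upload and download sides to drift apart again.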