
Use GCC 13 in CUDA 12+ builds #129

Open · bdice opened this issue Dec 23, 2024 · 4 comments

Comments

@bdice (Contributor) commented Dec 23, 2024

CUDA 12.5 added support for GCC 13. Recently, conda-forge began using GCC 13 for CUDA 12 builds (conda-forge/conda-forge-pinning-feedstock#6736, conda-forge/conda-forge-pinning-feedstock#6849).

This issue proposes using GCC 13 for CUDA 12 builds of RAPIDS, to align with conda-forge.

One proposal for implementation is here: rapidsai/rmm#1773

I propose that we target this update for 25.02, to stay aligned with conda-forge.

@jameslamb (Member)

@bdice @robertmaynard and I talked in an offline conversation and decided to only pursue this for conda builds, leaving wheel builds on GCC 11 (which is set here in ci-imgs).

Summarizing...

On the one hand... it'd probably be safe to update to GCC 13. We're producing manylinux_2_28 wheels, and the official PyPA manylinux images for building those just switched to GCC 14 (!) (pypa/manylinux#1730), which suggests that moving to a newer compiler for RAPIDS wheel builds might be ok. Improved diagnostics in GCC 13 would be helpful for catching issues, and wheel builds do sometimes go down different codepaths than conda builds (for example, because more dependencies are built from source instead of linked to).

On the other hand... switching the host compiler based on CUDA version (as we'd have to stay on GCC 11 for CUDA 11) would be more painful and error-prone for wheel builds than it is for conda. And I can't think of specific reasons that continuing to use GCC 11 for wheel builds while using GCC 13 for conda builds would be problematic.
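For context on why this is easier on the conda side: conda-build lets a recipe pair the host compiler version with the CUDA version through its variant config. A minimal, hypothetical conda_build_config.yaml sketch (illustrative only, not the actual RAPIDS recipe setup) could look like this:

# Hypothetical variant config: zip_keys makes the compiler and CUDA
# versions vary together, so CUDA 11.8 builds keep GCC 11 while
# CUDA 12 builds pick up GCC 13.
c_compiler_version:
  - "11"
  - "13"
cxx_compiler_version:
  - "11"
  - "13"
cuda_compiler_version:
  - "11.8"
  - "12.5"
zip_keys:
  - [c_compiler_version, cxx_compiler_version, cuda_compiler_version]

For wheels there is no equivalent per-variant knob; the compiler version comes from the ci-imgs base images, which is part of why a per-CUDA split is more painful there.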

@jameslamb (Member) commented Jan 2, 2025

One other issue will need to be figured out... unified devcontainers.

The conda devcontainers expect every project to provide compiler pins, like this:

specific:
  - output_types: conda
    matrices:
      - matrix:
          arch: x86_64
        packages:
          - gcc_linux-64=11.*
          - sysroot_linux-64==2.17
      - matrix:
          arch: aarch64
        packages:
          - gcc_linux-aarch64=11.*
          - sysroot_linux-aarch64==2.17
  - output_types: conda
    matrices:
      - matrix:
          arch: x86_64
          cuda: "11.8"
        packages:
          - nvcc_linux-64=11.8
      - matrix:
          arch: aarch64
          cuda: "11.8"
        packages:
          - nvcc_linux-aarch64=11.8
      - matrix:
          cuda: "12.*"
        packages:
          - cuda-nvcc

(rmm/dependencies.yaml)

These all get merged into a single conda environment by rapids-make-conda-env: https://github.com/rapidsai/devcontainers/blob/e1168d73bcbe5d5c96010471ac2f9accef943592/features/src/rapids-build-utils/opt/rapids-build-utils/bin/post-start-command.sh#L9

As soon as any one of the RAPIDS repos switches to GCC 13, I think the unified conda devcontainers will be broken until all of them are updated (because gcc_linux-64=11.* and gcc_linux-64=13.* are incompatible pins).
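To make the conflict concrete, here is a hypothetical merged spec (illustrative only, not a real generated environment file) showing why the unified environment stops solving once only some repos have moved:

# Hypothetical result of merging two repos' conda pins into one env:
dependencies:
  - gcc_linux-64=11.*   # contributed by a repo still pinned to GCC 11
  - gcc_linux-64=13.*   # contributed by a repo already moved to GCC 13
# No gcc_linux-64 version satisfies both specs, so the solve fails.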

Since those matrices in dependencies.yaml only affect local development (not packages built in CI), the best way I can think of to handle this is to make them ranges in each "use GCC 13" PR, so that a range like gcc_linux-64>=11,<14 still solves alongside a repo that pins =11.*:

specific:
  - output_types: conda
    matrices:
      - matrix:
          arch: x86_64
          cuda: "11.*"
        packages:
          - gcc_linux-64=11.*
          - &sysroot_x86_64 sysroot_linux-64==2.17
      - matrix:
          arch: x86_64
          cuda: "12.*"
        packages:
          - gcc_linux-64>=11,<14
          - *sysroot_x86_64
      - matrix:
          arch: aarch64
          cuda: "11.*"
        packages:
          - gcc_linux-aarch64=11.*
          - &sysroot_aarch64 sysroot_linux-aarch64==2.17
      - matrix:
          arch: aarch64
          cuda: "12.*"
        packages:
          - gcc_linux-aarch64>=11,<14
          - *sysroot_aarch64

And then, once every project is updated, do one more round of PRs to tighten them:

specific:
  - output_types: conda
    matrices:
      - matrix:
          arch: x86_64
          cuda: "11.*"
        packages:
          - gcc_linux-64=11.*
          - &sysroot_x86_64 sysroot_linux-64==2.17
      - matrix:
          arch: x86_64
          cuda: "12.*"
        packages:
          - gcc_linux-64=13.*
          - *sysroot_x86_64
      - matrix:
          arch: aarch64
          cuda: "11.*"
        packages:
          - gcc_linux-aarch64=11.*
          - &sysroot_aarch64 sysroot_linux-aarch64==2.17
      - matrix:
          arch: aarch64
          cuda: "12.*"
        packages:
          - gcc_linux-aarch64=13.*
          - *sysroot_aarch64

@jakirkham (Member)

This argues in favor of having centralized pinnings that we can apply across RAPIDS projects. That could be useful even outside this context.

@bdice (Contributor, Author) commented Jan 13, 2025

I opened a series of PRs to implement this. I used automation but also did some manual review and updates for each PR, mostly to fix up dependencies.yaml and meta.yaml files (which are specified a little differently in every repository).

These PRs also contain changes from #131 to use glibc 2.28 / sysroot 2.28 in all builds.

Here is the merge order I propose (reverse topological order of the RAPIDS dependency tree).
