Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise testing infrastructure to decrease spurious failures #759

Open
wants to merge 18 commits into
base: master
Choose a base branch
from
17 changes: 17 additions & 0 deletions .github/workflows/cuda-test.yml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason why we should keep this in a separate yml file?
Is splitting another job not enough?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, not entirely sure I follow. I think having a separate script is necessary to run on another runner? I find this architectural design easier to understand and maintain -- it's fewer pieces and distinct parts are isolated in different files. By splitting another job you mean doing this through the matrix.json file? I did look at the design of a matrix.json for this (in a separate yaml file, though perhaps could be merged): https://github.com/rocker-org/rocker-versioned2/pull/759/files#diff-454bb856f3821991ff2015ec5ba81c69df2ae3b5a476e30104d697c420b35093, and it hits the same issue with space.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion is simply that it is impossible to move the job to scripts-test.yml.

I don't know if they're sharing capacity on a per-job basis or on a per-workflow basis, but I think it's unlikely that they're sharing it across workflows.
So you probably don't need to split the workflow for a single job here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh sorry to be dense, I think I understand. I've always tended to write actions with basically one job per yaml, unless there were two dependent jobs (e.g. like generate_matrix and build scripts-test.yml). I think I do the same thing with, e.g. writing R functions in many different files in R/. I see the cuda tests as a bit of a work in progress, while for the moment scripts-test.yml has been pretty stable, so conceptually to me anyway this provides a bit more separation between the component I'm still intending to fiddle with and the component that works. Why have all the jobs in the same yaml file?

Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
name: Test cuda builds
on:
workflow_dispatch: null
pull_request:
branches:
- master
paths:
- tests/ml-test.Dockerfile
- .github/workflows/cuda-test.yml
jobs:
build:
runs-on: ubuntu-latest
permissions: read-all
steps:
- uses: actions/checkout@v3
- name: Build the Docker image
run: docker build -f tests/ml-test.Dockerfile .
17 changes: 17 additions & 0 deletions tests/ml-test.Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
FROM rocker/cuda:4.3.2

ENV S6_VERSION=v2.1.0.2
ENV RSTUDIO_VERSION=2023.12.0+369
ENV DEFAULT_USER=rstudio
ENV PANDOC_VERSION=default
ENV QUARTO_VERSION=default
ENV PATH=${VIRTUAL_ENV}/bin:${PATH}

RUN /rocker_scripts/install_rstudio.sh
RUN /rocker_scripts/install_pandoc.sh
RUN /rocker_scripts/install_quarto.sh
RUN /rocker_scripts/install_tidyverse.sh

EXPOSE 8787

CMD ["/init"]
23 changes: 2 additions & 21 deletions tests/rocker_scripts/matrix.json
Original file line number Diff line number Diff line change
Expand Up @@ -16,8 +16,7 @@
"install_geospatial.sh",
"install_python.sh",
"install_jupyter.sh",
"install_julia.sh",
"install_cuda-10.1.sh"
"install_julia.sh"
],
"script_arg": [
"none"
Expand All @@ -33,7 +32,7 @@
"base_image": "rocker/r-ver",
"tag": "devel",
"script_name": "install_rstudio.sh",
"script_arg": "daily"
"script_arg": "latest"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems unrelated change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you know, daily has been failing for a week due the RStudio server throwing a 500 error on that download. Because we try to build that image daily in the cron tasks anyway, I think it's less important to also test it here, especially when it means that the current situation ends up blocking the ability to fix anything. Maybe you can provide your preferred solution?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I'm sorry. I didn't know that the daily build was failing. Indeed, as you say, this is used every day, so we don't have to dare to test it here.

I'm sure that "latest" was tested elsewhere (because latest is the default), so I think it's okay to delete the test itself.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok thanks. yeah sorry I get the emails every day when the daily fails (https://github.com/rocker-org/rocker-versioned2/actions/workflows/devel.yml) which happens all the time. It looked like this test was building rstudio on r-ver:devel while other tests were building it on the normal r-ver so I didn't delete it, but if you're fine dropping this test then so am I!

},
{
"base_image": "rocker/r-ver",
Expand All @@ -53,30 +52,12 @@
"script_name": "install_quarto.sh",
"script_arg": "prerelease"
},
{
"base_image": "rocker/r-ver",
"tag": "4.0.5",
"script_name": "install_cuda-11.1.sh",
"script_arg": "none"
},
eitsupi marked this conversation as resolved.
Show resolved Hide resolved
{
"base_image": "rocker/r-ver",
"tag": "4.0.4",
"script_name": "install_tensorflow.sh",
"script_arg": "none"
},
{
"base_image": "rocker/cuda",
"tag": "latest",
"script_name": "install_verse.sh",
"script_arg": "none"
},
{
"base_image": "rocker/cuda",
"tag": "cuda11.1",
"script_name": "config_R_cuda.sh",
"script_arg": "none"
},
{
"base_image": "debian",
"tag": "stable-slim",
Expand Down
Loading