-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revise testing infrastructure to decrease spurious failures #759
base: master
Are you sure you want to change the base?
Changes from all commits
2f15536
09a2dad
e7491a6
5422f0d
a1afa20
b9bd40e
57854b9
586739a
6b8ac78
af17389
c7a4f85
9174ee4
26fb6ae
1e5c56d
f19aebe
1c82708
af118f2
48d7500
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
name: Test cuda builds | ||
on: | ||
workflow_dispatch: null | ||
pull_request: | ||
branches: | ||
- master | ||
paths: | ||
- tests/ml-test.Dockerfile | ||
- .github/workflows/cuda-test.yml | ||
jobs: | ||
build: | ||
runs-on: ubuntu-latest | ||
permissions: read-all | ||
steps: | ||
- uses: actions/checkout@v3 | ||
- name: Build the Docker image | ||
run: docker build -f tests/ml-test.Dockerfile . |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
FROM rocker/cuda:4.3.2 | ||
|
||
ENV S6_VERSION=v2.1.0.2 | ||
ENV RSTUDIO_VERSION=2023.12.0+369 | ||
ENV DEFAULT_USER=rstudio | ||
ENV PANDOC_VERSION=default | ||
ENV QUARTO_VERSION=default | ||
ENV PATH=${VIRTUAL_ENV}/bin:${PATH} | ||
|
||
RUN /rocker_scripts/install_rstudio.sh | ||
RUN /rocker_scripts/install_pandoc.sh | ||
RUN /rocker_scripts/install_quarto.sh | ||
RUN /rocker_scripts/install_tidyverse.sh | ||
|
||
EXPOSE 8787 | ||
|
||
CMD ["/init"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -16,8 +16,7 @@ | |
"install_geospatial.sh", | ||
"install_python.sh", | ||
"install_jupyter.sh", | ||
"install_julia.sh", | ||
"install_cuda-10.1.sh" | ||
"install_julia.sh" | ||
], | ||
"script_arg": [ | ||
"none" | ||
|
@@ -33,7 +32,7 @@ | |
"base_image": "rocker/r-ver", | ||
"tag": "devel", | ||
"script_name": "install_rstudio.sh", | ||
"script_arg": "daily" | ||
"script_arg": "latest" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Seems unrelated change. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As you know, daily has been failing for a week due the RStudio server throwing a 500 error on that download. Because we try to build that image daily in the cron tasks anyway, I think it's less important to also test it here, especially when it means that the current situation ends up blocking the ability to fix anything. Maybe you can provide your preferred solution? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, I'm sorry. I didn't know that the daily build was failing. Indeed, as you say, this is used every day, so we don't have to dare to test it here. I'm sure that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ok thanks. yeah sorry I get the emails every day when the daily fails (https://github.com/rocker-org/rocker-versioned2/actions/workflows/devel.yml) which happens all the time. It looked like this test was building rstudio on r-ver:devel while other tests were building it on the normal r-ver so I didn't delete it, but if you're fine dropping this test then so am I! |
||
}, | ||
{ | ||
"base_image": "rocker/r-ver", | ||
|
@@ -53,30 +52,12 @@ | |
"script_name": "install_quarto.sh", | ||
"script_arg": "prerelease" | ||
}, | ||
{ | ||
"base_image": "rocker/r-ver", | ||
"tag": "4.0.5", | ||
"script_name": "install_cuda-11.1.sh", | ||
"script_arg": "none" | ||
}, | ||
eitsupi marked this conversation as resolved.
Show resolved
Hide resolved
|
||
{ | ||
"base_image": "rocker/r-ver", | ||
"tag": "4.0.4", | ||
"script_name": "install_tensorflow.sh", | ||
"script_arg": "none" | ||
}, | ||
{ | ||
"base_image": "rocker/cuda", | ||
"tag": "latest", | ||
"script_name": "install_verse.sh", | ||
"script_arg": "none" | ||
}, | ||
{ | ||
"base_image": "rocker/cuda", | ||
"tag": "cuda11.1", | ||
"script_name": "config_R_cuda.sh", | ||
"script_arg": "none" | ||
}, | ||
{ | ||
"base_image": "debian", | ||
"tag": "stable-slim", | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason why we should keep this in a separate yml file?
Is splitting another job not enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, not entirely sure I follow. I think having a separate script is necessary to run on another runner? I find this architectural design easier to understand and maintain -- it's fewer pieces and distinct parts are isolated in different files. By splitting another job you mean doing this through the matrix.json file? I did look at the design of a matrix.json for this (in a separate yaml file, though perhaps could be merged): https://github.com/rocker-org/rocker-versioned2/pull/759/files#diff-454bb856f3821991ff2015ec5ba81c69df2ae3b5a476e30104d697c420b35093, and it hits the same issue with space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion is simply that it is impossible to move the job to scripts-test.yml.
I don't know if they're sharing capacity on a per-job basis or on a per-workflow basis, but I think it's unlikely that they're sharing it across workflows.
So you probably don't need to split the workflow for a single job here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh sorry to be dense, I think I understand. I've always tended to write actions with basically one job per yaml, unless there were two dependent jobs (e.g. like
generate_matrix
andbuild
scripts-test.yml). I think I do the same thing with, e.g. writing R functions in many different files inR/
. I see the cuda tests as a bit of a work in progress, while for the momentscripts-test.yml
has been pretty stable, so conceptually to me anyway this provides a bit more separation between the component I'm still intending to fiddle with and the component that works. Why have all the jobs in the same yaml file?