Regenerate wilms notebooks: Code updates #913
…00 cells for certain steps
I left a few comments about the approach here. Based on #906 (comment), I was thinking we would completely remove generating these notebooks from the workflow. Since we already have a script that runs two scripts and generates the notebooks, I don't think it needs to be present in the main workflow.
The other comment is about how we organize the exploratory analysis. Since the `supplemental-notebooks` folder exists, we should put everything that's exploratory/not run in the workflow in that folder.
# These steps are only run if RUN_CNV_EXPLORATORY is true
if [[ $RUN_CNV_EXPLORATORY -eq 1 ]]; then
  # Create the pooled normal reference for inferCNV
  Rscript scripts/06b_build-normal_reference.R

  # Run infercnv and copykat for a selection of samples
  # This script calls scripts/05_copyKAT.R and scripts/06_infercnv.R and associated exploratory CNV notebooks in `cnv-exploratory-notebooks/`
  # By default, copyKAT as called by this script uses 32 cores
  THREADS=${THREADS} TESTING=${TESTING} ./scripts/explore-cnv-methods.sh
fi
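For readers following the thread, here is a minimal sketch of how a flag like `RUN_CNV_EXPLORATORY` gates the two steps above. The helper name `maybe_run_cnv_exploratory`, the default of `0`, and the choice to print the commands rather than execute them are assumptions made for illustration; they are not taken from `00_run_workflow.sh`.

```shell
# Hypothetical helper: prints the exploratory CNV steps instead of running them
maybe_run_cnv_exploratory() {
  run_flag="${1:-0}"   # assumed default: skip the exploratory steps
  threads="${2:-32}"   # 32 cores, per the comment in the diff above
  if [ "$run_flag" -eq 1 ]; then
    echo "Rscript scripts/06b_build-normal_reference.R"
    echo "THREADS=${threads} ./scripts/explore-cnv-methods.sh"
  else
    echo "skipping exploratory CNV steps"
  fi
}

# Read the flag from the environment, falling back to the assumed defaults
maybe_run_cnv_exploratory "${RUN_CNV_EXPLORATORY:-0}" "${THREADS:-32}"
```

With the flag unset, only the skip message is printed, which is the behavior the review below argues could be achieved by leaving these steps out of the main workflow entirely.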
I almost think we could just remove this from the main workflow rather than do this. It looks like you are just running a single script that already runs two other scripts to actually generate the notebooks so I don't see why this is needed at all.
No complaints from me.
- `scripts` contain analysis scripts used in the module workflow.
  See `scripts/README.md` for more information
- `notebook_template` contains R Markdown notebooks meant to be run as a template across module samples
Just a small annoyance with the fact that this is `_` and not `-` when all the other folder names have `-`, but not worth the large diffs to fix 😢
It should now all be `_` :)
- `cnv-exploratory-notebooks` contains R Markdown notebooks and their HTML outputs specifically from exploratory steps during CNV analysis in the module workflow
- `supplemental-notebooks` contains exploratory notebooks comparing Azimuth label transfer to an Azimuth-adapted approach which is used in this module.
So since we have this supplemental folder that contains notebooks that are not used as part of the workflow and are also exploratory, why don't we move them into this folder instead? So you would have `exploratory-notebooks` as a folder and then maybe two folders inside that, one for `cnv` and one for `azimuth`?
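A rough sketch of the layout suggested here, assuming the two existing folders are simply renamed into a new parent. The function name `reorganize_exploratory` is hypothetical, and plain `mv` is used only to keep the example self-contained; in the actual repository `git mv` would be the natural choice.

```shell
# Hypothetical helper: nest both exploratory folders under exploratory-notebooks/
# Argument: path to the module root (defaults to the current directory)
reorganize_exploratory() {
  module_dir="${1:-.}"
  mkdir -p "$module_dir/exploratory-notebooks"
  # CNV exploratory notebooks and their HTML outputs
  mv "$module_dir/cnv-exploratory-notebooks" "$module_dir/exploratory-notebooks/cnv"
  # Azimuth comparison notebooks
  mv "$module_dir/supplemental-notebooks" "$module_dir/exploratory-notebooks/azimuth"
}

# Usage, from the module root:
#   reorganize_exploratory .
```

Any paths hard-coded in scripts or READMEs would need to be updated to match after a move like this.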
I agree, but... for context, I originally did something sort of like this in #912, and the diff resulting from moving notebooks was simply not viewable on any platform. So, I decided to let it go.....
Why is this file being changed? GitHub won't let me preview the changes, but I don't think it should be changed?
…y, and underscorify some folders. READMEs and related scripts updated to match these changes
Fresh changes:
I recommend looking directly at the two commits I added since your last review, since that will make identifying the specific changes easier. Again, please let me know where I can clarify!
New changes look good 👍
This is the first of 2 PRs to address #906 (but might be more if GitHub UI gets mad) to update notebooks in `cell-type-wilms-tumor-06`. This PR covers a smidge of reorg and code changes that cropped up while regenerating notebooks with the 2024-11-25 data release.
- […] `cnv-exploratory-notebooks` for them and their output to live in. I also had to update the path accordingly in `scripts/explore-cnv-methods.sh`.
- […] `00_run_workflow.sh` script to specifically handle logic for regenerating these notebooks. Even though we will want to run other exploratory steps in the future, we almost certainly won't want to run these!
- […] `00b_characterize_fetal_kidney_reference_Stewart.Rmd` out of `notebook_template` and into `notebook`, since it is not a template.
- […] `00_run_workflow.sh` with the right new path, and I also renamed that script variable `notebook_output_dir` --> `notebook_dir`, since the directory is named "notebook".
- `scripts/06b_build-normal_reference.R` was missing from the workflow when running the workflow through! The script also needed a Seurat library load, which snuck past during review because this module was mostly not being run in CI. The script is now run, loads Seurat, and was styled by pre-commit.

Please let me know what I can clarify!!