Script to run AUCell on all marker gene sets in Ewings module #985

Open
allyhawkins opened this issue Jan 14, 2025 · 0 comments · May be fixed by #998

If you are filing this issue based on a specific GitHub Discussion, please link to the relevant Discussion.

#696

Describe the goals of the changes to the analysis module.

We would like to run AUCell on all samples in SCPCP000015 using all of the marker gene sets that we are interested in. To do that, we should write a script that takes three inputs: a single SCE file, the max rank (chosen in #984), and a table of marker genes with their associated cell type labels. The script should output a data frame with the AUC value, the AUC threshold determined by AUCell, and the classification based on that threshold.
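To make the intended output concrete, here is a minimal sketch of the table shape in plain Python. It is an illustration under several assumptions, not the planned script: `recovery_auc` is a simplified rank-based recovery score written for this example rather than AUCell's exact calculation, ties are broken by gene name, the thresholds are hand-picked placeholders for the values AUCell would determine, and all function names, column names, genes, and barcodes are hypothetical.

```python
def recovery_auc(expression, gene_set, max_rank):
    """Simplified recovery-curve AUC for one cell (not AUCell's exact formula)."""
    # Rank genes from highest to lowest expression; ties are broken by gene name.
    ranked = sorted(expression, key=lambda gene: (-expression[gene], gene))
    # 1-based positions of gene set members within the top `max_rank` genes.
    hits = [
        pos
        for pos, gene in enumerate(ranked[:max_rank], start=1)
        if gene in gene_set
    ]
    # Area under the step curve "number of set genes recovered by rank p".
    area = sum(max_rank - pos + 1 for pos in hits)
    # Best case: every gene in the set sits at the very top of the ranking.
    n_best = min(len(gene_set), max_rank)
    max_area = sum(max_rank - i + 1 for i in range(1, n_best + 1))
    return area / max_area if max_area else 0.0


def score_cells(cells, gene_sets, max_rank, thresholds):
    """Return one row per (gene set, cell) with AUC, threshold, and classification."""
    rows = []
    for gene_set_name, genes in gene_sets.items():
        threshold = thresholds[gene_set_name]  # placeholder for AUCell's threshold
        for barcode, expression in cells.items():
            auc = recovery_auc(expression, genes, max_rank)
            rows.append(
                {
                    "gene_set": gene_set_name,
                    "barcode": barcode,
                    "auc": auc,
                    "auc_threshold": threshold,
                    "classified": auc > threshold,
                }
            )
    return rows


# Toy data: two cells, five genes, two gene sets (all names are made up).
cells = {
    "cell_1": {"GENE_A": 9, "GENE_B": 7, "GENE_C": 0, "GENE_D": 0, "GENE_E": 5},
    "cell_2": {"GENE_A": 0, "GENE_B": 0, "GENE_C": 8, "GENE_D": 6, "GENE_E": 5},
}
gene_sets = {"set_1": {"GENE_A", "GENE_B"}, "set_2": {"GENE_C", "GENE_D"}}
thresholds = {"set_1": 0.5, "set_2": 0.5}

rows = score_cells(cells, gene_sets, max_rank=3, thresholds=thresholds)
for row in rows:
    print(row)
```

Keeping the threshold as its own column next to the AUC and the classification means the table can be re-classified later with a different cutoff without recomputing the scores.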

For some gene sets, I'm not sure AUCell will do a good job of identifying a threshold. We should still record the threshold it picks so we can decide whether to use it as is or to choose a single cutoff for that gene set across all samples.

What will your pull request contain?

This PR should contain a script that runs AUCell on all marker gene sets for a given SCE object and returns the AUC values and the assigned AUC threshold for each gene set.

I think we might want to include all the marker genes in visser-all-marker-genes.tsv, tumor-cell-state-markers.tsv, and our custom gene signatures in references/gene_signatures. So this PR should also contain a single table that combines all of the marker genes described above. I'm not sure yet that every gene set will give us useful information, but I think it can't hurt to start with all of them.
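One way to picture the combined table is sketched below with Python's standard `csv` module. This is only a sketch under stated assumptions: the column names `cell_type` and `gene_symbol`, the `source` labels, and the toy rows are hypothetical and do not reflect the real contents of the files named above, which would need to be checked before writing the actual script.

```python
import csv
import io


def combine_marker_tables(tables):
    """Stack several marker gene TSVs into one list of rows with a `source` column.

    `tables` maps a source label to an open text handle for a TSV that is
    assumed (hypothetically) to have `cell_type` and `gene_symbol` columns.
    """
    combined = []
    seen = set()
    for source, handle in tables.items():
        for row in csv.DictReader(handle, delimiter="\t"):
            key = (source, row["cell_type"], row["gene_symbol"])
            if key in seen:  # drop exact duplicates within a source
                continue
            seen.add(key)
            combined.append(
                {
                    "source": source,
                    "cell_type": row["cell_type"],
                    "gene_symbol": row["gene_symbol"],
                }
            )
    return combined


def to_tsv(rows):
    """Serialize the combined rows back to a single TSV string."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=["source", "cell_type", "gene_symbol"],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Toy in-memory stand-ins for the real files (contents are made up).
table_one = io.StringIO("cell_type\tgene_symbol\ntype_1\tGENE_C\ntype_1\tGENE_D\n")
table_two = io.StringIO("cell_type\tgene_symbol\nstate_1\tGENE_E\nstate_1\tGENE_E\n")

marker_table = combine_marker_tables({"table_one": table_one, "table_two": table_two})
marker_tsv = to_tsv(marker_table)
print(marker_tsv)
```

Carrying a `source` column keeps it possible to drop or inspect one gene set at a time later, which fits the plan of starting with everything and deciding afterwards which sets are useful.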

Will you require additional software beyond what is already in the analysis module?

No

Will you require different computational resources beyond what the analysis module already uses?

No

If known, when do you expect to file the pull request?

This week
