Failed to complete command task: 'rm_individual_seg_files' launched from master workflow, #6
Interesting to hear that someone is trying Accucopy on a dog tumor!
Finally! That's why we spent effort to make it possible for users to make
their own reference genomes.
This is an error we have not seen before. It looks like Accucopy failed
while deleting the intermediate segmentation files (one file per
chromosome) because the "rm" command was given no arguments, hence
"rm: missing operand". These files were being deleted because a prior step
should have already combined them into one file.
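
As a rough illustration of how a cleanup step can avoid the "missing operand" failure, the sketch below collects the files first and skips deletion when nothing matches. The function name and the "*.seg.tmp" pattern are hypothetical placeholders, not Accucopy's actual code or file naming.

```python
import glob
import os


def remove_segment_files(directory, pattern="*.seg.tmp"):
    """Delete per-chromosome segment files and return how many were removed.

    The pattern is a made-up placeholder, not Accucopy's real file naming.
    """
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    if not paths:
        # An empty list is what makes a bare "rm" exit with
        # "rm: missing operand", so stop here instead of deleting.
        return 0
    for path in paths:
        os.remove(path)
    return len(paths)
```

A return value of 0 is itself a useful signal: it suggests the earlier segmentation step produced no per-chromosome files at all.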
I am not sure what will happen if snp_sites.gz is missing. Is there a
high-quality SNP-sites file for dogs? As I recall, information from
low-quality SNPs causes Accucopy to misbehave, which is why we use this
file to exclude them.
Did you modify other parts of our workflow/code? Can you provide us with
the log folder (containing pyflow logs, and stdout /stderr of other
programs)?
--
Yu Huang
Professor, State Key Laboratory of Drug Research, Shanghai Institute of
Materia Medica, CAS
http://www.yfish.org/
https://sites.google.com/site/polyactis/
On Wed, Nov 18, 2020 at 3:58 PM yu052 ***@***.***> wrote:
Dear author,
I built the Singularity environment from the Docker image recently. I got
the error mentioned in the title when I ran it. The detailed error
information is as follows:
[2020-11-17T17:07:15.734560] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] Worklow terminated due to the following task errors:
[2020-11-17T17:07:15.735768] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] Failed to complete command task: 'rm_individual_seg_files' launched from master workflow, error code: 1, command: 'rm'
[2020-11-17T17:07:15.736237] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [rm_individual_seg_files] Error Message:
[2020-11-17T17:07:15.736848] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [rm_individual_seg_files] Last 2 stderr lines from task (of 2 total lines):
[2020-11-17T17:07:15.736848] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [2020-11-17T15:50:23.975587] [node127.cm.cluster] [68840_1] [rm_individual_seg_files] rm: missing operand
[2020-11-17T17:07:15.736848] [node127.cm.cluster] [68840_1] [WorkflowRunner] [ERROR] [2020-11-17T15:50:23.975877] [node127.cm.cluster] [68840_1] [rm_individual_seg_files] Try 'rm --help' for more information.
Do you have any clue why this error happened? Can you please help me to
solve it?
Additional information: I was running the program on paired WGS data from
canine tumour and normal tissue. The required reference files were properly
made, except for the snp_sites.gz file. However, I removed the
--callRegions option from the Strelka SNP-calling step in main.py so that
the program could still run without the snp_sites file.
Regards,
Yun
|
Thanks for your response! I have a SNP-sites file for dogs, but I decided to disable --callRegions because that file triggered a strange error: chr33 in snp_sites.gz was not found in the reference genome. Maybe disabling that option was a bad idea. Do you have any clue why chr33 cannot be found in the reference genome? I checked the reference genome file, and I think chr33 is there. Thanks in advance for your help! Regards, |
Did you check whether chr33 is in your genome index files (e.g. genome.dict and genome.fa.fai)? |
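
One way to run the index check suggested above is to compare contig names across the index files directly. This is a minimal sketch: it assumes the standard layouts (contig name in the first tab-separated column of a .fai file, and in the SN: field of each @SQ line of a .dict file), and the helper names are invented for illustration.

```python
def contigs_from_fai(fai_text):
    """Contig names are the first tab-separated column of a .fai index."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def contigs_from_dict(dict_text):
    """Contig names are the SN: field of each @SQ line in a .dict file."""
    names = []
    for line in dict_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "@SQ":
            continue
        for field in fields[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def missing_contigs(wanted, fai_text, dict_text):
    """Report which of the wanted contigs are absent from each index."""
    fai = set(contigs_from_fai(fai_text))
    dic = set(contigs_from_dict(dict_text))
    return {
        "fai": [c for c in wanted if c not in fai],
        "dict": [c for c in wanted if c not in dic],
    }
```

Passing the contents of genome.fa.fai and genome.dict (read with open(...).read()) together with ["chr33"] shows at a glance whether either index lacks the contig or spells its name differently.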
We saw this in the pyflow_tasks_stdout_log.txt, which suggests all reads in your bam fail to pass our filters:
Our Rust program contains the filters below. Can you check your BAM file to see why all reads fail to pass them?
if record.mapq()<30 {
    continue
}
if record.is_paired() && ( record.insert_size()<0 || record.insert_size()>self.max_fragment_len as i64 ||
    !record.is_proper_pair() || record.is_mate_unmapped() || !record.is_first_in_template() ||
    record.is_secondary() || record.is_duplicate() || record.is_supplementary() ) {
    continue;
} |
I checked that chromosome 33 is indeed in the genome.fa, genome.dict, and genome.fa.fai. Kind regards |
Then the reads probably failed these filters. You can check
https://samtools.github.io/hts-specs/SAMv1.pdf to learn how to tell whether
a read is properly paired, whether its mate is unmapped, what the insert
size (fragment length) is and whether it exceeds 1000 (max length set in
your program), and whether the read is a duplicate, supplementary, or
secondary.
Most of this information is in the FLAG column, encoded as bit flags, so
reading it requires converting between decimal and binary numbers. You
may want to ask a computer scientist for help decoding it.
if record.is_paired() && ( record.insert_size()<0 ||
record.insert_size()>self.max_fragment_len as i64 ||
!record.is_proper_pair() || record.is_mate_unmapped()
|| !record.is_first_in_template() ||
record.is_secondary() || record.is_duplicate() ||
record.is_supplementary() ) {
continue;
}
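
To make the FLAG decoding concrete, here is a small sketch that unpacks a FLAG value into named bits and then applies the same conditions as the Rust filter quoted above. The bit values come from the SAM specification; the function names are invented for illustration, and treating insert_size as the TLEN column is an assumption about how the Rust library reports it.

```python
# Bit values from the FLAG table in the SAM specification.
FLAG_BITS = {
    "paired": 0x1,
    "proper_pair": 0x2,
    "unmapped": 0x4,
    "mate_unmapped": 0x8,
    "reverse": 0x10,
    "mate_reverse": 0x20,
    "first_in_template": 0x40,
    "last_in_template": 0x80,
    "secondary": 0x100,
    "qc_fail": 0x200,
    "duplicate": 0x400,
    "supplementary": 0x800,
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for name, bit in FLAG_BITS.items() if flag & bit]


def passes_filters(flag, mapq, insert_size, max_fragment_len=1000):
    """Mirror the two quoted Rust checks for a single read."""
    if mapq < 30:
        return False
    bits = set(decode_flag(flag))
    # As in the Rust code, the remaining checks apply only to paired reads.
    if "paired" in bits and (
        insert_size < 0
        or insert_size > max_fragment_len
        or "proper_pair" not in bits
        or "mate_unmapped" in bits
        or "first_in_template" not in bits
        or "secondary" in bits
        or "duplicate" in bits
        or "supplementary" in bits
    ):
        return False
    return True
```

In the text output of samtools view, FLAG, MAPQ, and TLEN are columns 2, 5, and 9, so a few reads can be checked by hand. For example, decode_flag(99) lists paired, proper_pair, mate_reverse, and first_in_template. Note that any read that is not first in its pair is rejected by these conditions, so at most half of the paired reads can pass.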
On Tue, Nov 24, 2020 at 11:34 PM yu052 ***@***.***> wrote:
I checked that chromosome 33 is indeed in the genome.fa, genome.dict, and
genome.fa.fai.
It is weird that all the reads in the bam failed to pass the filters,
isn't it? I confirmed that most reads have mapq 60.
Do you have any other clue why I got these errors?
Kind regards
|
Thanks for your response! |
You can copy a standalone Strelka into the Docker container, overwrite the bundled version, and run it inside the container to see if anything strange happens. Your BAM files identify chromosomes as "chr1", not "1", right? Our program assumes "chr1", not "1". |
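
The naming question above can be answered from the BAM header alone. The sketch below assumes the header has been dumped to text (for example with samtools view -H) and classifies the @SQ sequence names; both function names are hypothetical.

```python
def sequence_names(header_text):
    """Collect SN: values from the @SQ lines of a SAM/BAM header."""
    names = []
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] == "@SQ":
            names.extend(f[3:] for f in fields[1:] if f.startswith("SN:"))
    return names


def naming_style(names):
    """Classify contig names as 'chr', 'bare', 'mixed', or 'empty'."""
    if not names:
        return "empty"
    prefixed = sum(name.startswith("chr") for name in names)
    if prefixed == len(names):
        return "chr"
    if prefixed == 0:
        return "bare"
    return "mixed"
```

A result other than "chr" would mean the BAM does not match the "chr1"-style names the program expects.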
Is it possible to make your program compatible with both formats? |
Both formats? Bam and ?
|
Sorry that I didn't make it clear. I mean the format of the chromosome names: "chr1" versus "1". |