Running into segfault error 11 #12

Open
Akazhiel opened this issue Oct 9, 2020 · 19 comments

Comments

@Akazhiel

Akazhiel commented Oct 9, 2020

/Users/jonatan/Micropeptide_project/psiclass/junc /Users/jonatan/Micropeptide_project/EGAF00001593233.sorted.bam -a > ./splice/psiclass_bam_0.raw_splice
/Users/jonatan/Micropeptide_project/psiclass/junc /Users/jonatan/Micropeptide_project/EGAF00001593241.sorted.bam -a > ./splice/psiclass_bam_1.raw_splice
/Users/jonatan/Micropeptide_project/psiclass/junc /Users/jonatan/Micropeptide_project/EGAF00001593275.sorted.bam -a > ./splice/psiclass_bam_2.raw_splice
/Users/jonatan/Micropeptide_project/psiclass/trust-splice ./splice/psiclass_splice.list /Users/jonatan/Micropeptide_project/EGAF00001593233.sorted.bam > ./splice/psiclass_bam.trusted_splice
perl /Users/jonatan/Micropeptide_project/psiclass/FilterSplice.pl ./splice/psiclass_bam_0.raw_splice ./splice/psiclass_bam.trusted_splice > ./splice/psiclass_bam_0.splice
perl /Users/jonatan/Micropeptide_project/psiclass/FilterSplice.pl ./splice/psiclass_bam_1.raw_splice ./splice/psiclass_bam.trusted_splice > ./splice/psiclass_bam_1.splice
perl /Users/jonatan/Micropeptide_project/psiclass/FilterSplice.pl ./splice/psiclass_bam_2.raw_splice ./splice/psiclass_bam.trusted_splice > ./splice/psiclass_bam_2.splice
/Users/jonatan/Micropeptide_project/psiclass/subexon-info /Users/jonatan/Micropeptide_project/EGAF00001593233.sorted.bam ./splice/psiclass_bam_0.splice > ./subexon/psiclass_subexon_0.out
sh: line 1: 41865 Segmentation fault: 11 /Users/jonatan/Micropeptide_project/psiclass/subexon-info /Users/jonatan/Micropeptide_project/EGAF00001593233.sorted.bam ./splice/psiclass_bam_0.splice > ./subexon/psiclass_subexon_0.out
Terminated

I always get this output when I run the command ./psiclass --lb bamlist

@mourisl
Collaborator

mourisl commented Oct 9, 2020

Can you show me the first few lines of the ./splice/psiclass_bam_0.splice file? Also, can you check whether the genome has very long chromosome names? Thank you.
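One way to check the chromosome names is to read the @SQ lines of the BAM header (the text printed by "samtools view -H"). The sketch below is only an illustration: the function name and the sample header are hypothetical, and it does not assume any particular length limit in PsiCLASS, since none is stated in this thread. It simply lists the longest reference names so they can be inspected.

```python
def longest_reference_names(header_lines, top=5):
    """Return up to `top` reference names from @SQ header lines, longest first."""
    names = []
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue  # only @SQ lines describe reference sequences
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):  # SN holds the reference sequence name
                names.append(field[3:])
    return sorted(names, key=len, reverse=True)[:top]


if __name__ == "__main__":
    # Hypothetical header lines, made up for this example.
    sample_header = [
        "@HD\tVN:1.4\tSO:coordinate\n",
        "@SQ\tSN:chr1\tLN:248956422\n",
        "@SQ\tSN:scaffold_0001_unplaced_contig_with_long_name\tLN:5000\n",
    ]
    for name in longest_reference_names(sample_header):
        print(len(name), name)
```

The header lines can be obtained by saving the output of "samtools view -H" to a text file and passing the open file to the function.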

@Akazhiel
Author

Akazhiel commented Oct 9, 2020

For some reason, all of the files that are produced are empty. The only file with any content is psiclass_splice.list. I am running it on an OSX system.

@mourisl
Collaborator

mourisl commented Oct 9, 2020

Can you share the first few alignments from the bam file? It's strange that there is no ./splice/psiclass_bam_1.raw_splice either. Thanks.

@Akazhiel
Author

Akazhiel commented Oct 9, 2020

HWI-ST1243:176:D2B86ACXX:5:1310:20374:92711	117	chr1	9995	0	*	=	9995	0	TTCCGATCTGGTTAGGGTTAGGGTTAGGGTAAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGG	BFFBFFFFFFFFFFFFFFFFFFFBBBFBFFFFFFFFFFFFFIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFBBB	ZC:i:6	PG:Z:MarkDuplicates	RG:Z:20130806055857685
HWI-ST1243:176:D2B86ACXX:5:1310:20374:92711	153	chr1	9995	37	101M	=	9995	0	TTCCGATCTCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTTACCCTAACCCTAACCCTAACC	BFFFFFFFFFFFBB<FFFFBBFFFFBBFFFBB<FFFFFFFFIFFFIIIFFIIIFFFFIIIFFBIIIFFFIIIFFFIIFIFBIIIIFFIFFFFFFFFFFBBB	X0:i:1	X1:i:0	ZC:i:6	MD:Z:0G6A0A70A21	PG:Z:MarkDuplicates	RG:Z:20130806055857685	XG:i:0	AM:i:0	NM:i:4	SM:i:37	XM:i:4	XN:i:6	XO:i:0	XT:A:U
HWI-ST1243:176:D2B86ACXX:6:2310:5351:56225	145	chr1	10000	0	101M	chr5	11685	0	ATCTCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC	FBBBFFFBB<BFFFBBFFBBB<FBBBB<FBBBBBFFFFBBFFIFFFIFFB<<IFFFFFFIIFFFIIFFFFFFFFFFFFFFFBIFFFFIFFFFBBFFFFBBB	X0:i:3	X1:i:360	ZC:i:8	MD:Z:2A0A97	PG:Z:MarkDuplicates	RG:Z:20130806033457992	XG:i:0	AM:i:0	NM:i:2	SM:i:0	XM:i:2	XN:i:1	XO:i:0	XT:A:R
HWI-ST1243:176:D2B86ACXX:6:2204:19014:33360	99	chr1	10024	19	101M	=	10125	202	CTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTGACCCTAACCCTAACCCTAACCCAACCCTAACCCTAACC	BBBFFFFFFFFFFIIIIIIIIIIIIIIIIIIIIIIIFFIIIIIIIIIIIFIIIIIIIIIIFB'<BFBBFFFBFBBBBFFBFFFFFBB7<B<<B'077B7BB	X0:i:1	X1:i:0	ZC:i:8	MD:Z:62A38	PG:Z:MarkDuplicates	RG:Z:20130806033457992	XG:i:0	AM:i:19	NM:i:1	SM:i:19	XM:i:1	XO:i:0	XT:A:U

Here are the first four lines of one of the BAM files.

@mourisl
Collaborator

mourisl commented Oct 10, 2020

Because ">" redirects the output, the file ./splice/psiclass_bam_0.raw_splice should be created even if the program "junc" fails. Can you check whether the subdirectory "splice" was created on your system, and whether you have write permission on that path? Thank you.
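The redirection behaviour described here can be reproduced in isolation. The sketch below is a small illustration, not part of PsiCLASS: the helper name is hypothetical, the standard "false" command stands in for a failing program, and it assumes a POSIX "sh" is available. The shell opens the target file before it starts the command, so an empty file is left behind even when the command exits with an error.

```python
import os
import subprocess
import tempfile


def redirect_creates_file(command):
    """Run `command > out` through sh; return (exit code, file exists, file size)."""
    with tempfile.TemporaryDirectory() as tmp:
        out_path = os.path.join(tmp, "out.raw_splice")
        # sh opens (and truncates) the redirect target before running the command,
        # so the file exists even if the command itself fails.
        result = subprocess.run(["sh", "-c", command + ' > "$1"', "sh", out_path])
        exists = os.path.exists(out_path)
        size = os.path.getsize(out_path) if exists else -1
        return result.returncode, exists, size


if __name__ == "__main__":
    # "false" always exits with status 1 and writes nothing.
    print(redirect_creates_file("false"))
```

An existing but empty output file therefore says nothing about whether the program succeeded; it only shows that the directory was writable.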

@Akazhiel
Author

The .raw_splice files are indeed created, but they are empty. Both the subexon and splice subdirectories are created, and I do have permission to write to that path since I'm running this on my own machine.

@mourisl
Collaborator

mourisl commented Oct 18, 2020

Can you run this command " samtools view /Users/jonatan/Micropeptide_project/EGAF00001593233.sorted.bam | awk '$6~"N"' | head " to make sure there are spliced reads for the introns? What was the aligner used for the alignment? Thank you.
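The same check can be written as a short script that also reports how many alignments were examined. This is a minimal sketch with a hypothetical function name and made-up sample records; like the awk filter, it treats an alignment as spliced when the CIGAR string (column 6 of a SAM record) contains an N operation.

```python
def count_spliced(sam_lines):
    """Return (total alignments, alignments whose CIGAR contains an N operation)."""
    total = 0
    spliced = 0
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:  # not a complete SAM record
            continue
        total += 1
        if "N" in fields[5]:  # column 6 is the CIGAR string
            spliced += 1
    return total, spliced


if __name__ == "__main__":
    # Made-up SAM records for illustration only.
    sample = [
        "r1\t0\tchr1\t100\t60\t50M\t*\t0\t0\tACGT\tFFFF\n",
        "r2\t0\tchr1\t100\t60\t20M500N30M\t*\t0\t0\tACGT\tFFFF\n",
    ]
    total, spliced = count_spliced(sample)
    print(f"{spliced} of {total} alignments are spliced")
```

To use it on real data, the text output of "samtools view" can be passed in line by line, for example via count_spliced(sys.stdin) after importing sys.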

@Akazhiel
Author

That command returns nothing. As for the aligner, I'm not sure which one was used: it wasn't specified, and I got the BAM files from a repository. I think it was bowtie.

@mourisl
Collaborator

mourisl commented Oct 18, 2020

For RNA-seq data, you need a splice-aware aligner such as HISAT, STAR, or TopHat, which can align reads that span introns as spliced reads. Otherwise, the downstream assembler cannot know the coordinates of the introns.
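To illustrate how intron coordinates follow from a spliced alignment, the sketch below walks a CIGAR string and reports the reference interval covered by each N operation. It is only an illustration under the SAM conventions (1-based POS; the M, D, N, = and X operations consume reference bases). The function name is hypothetical, and this is not the code or the output format of PsiCLASS's junc program.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that advance the position on the reference.
REF_CONSUMING = set("MDN=X")


def introns_from_alignment(pos, cigar):
    """Return 1-based inclusive (start, end) reference intervals of each N operation."""
    introns = []
    ref = pos  # reference coordinate of the next base to be consumed
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            introns.append((ref, ref + length - 1))
        if op in REF_CONSUMING:
            ref += length
    return introns


if __name__ == "__main__":
    print(introns_from_alignment(100, "20M500N30M"))  # one skipped region
    print(introns_from_alignment(9995, "101M"))       # unspliced read, no introns
```

An unspliced CIGAR such as 101M, as in the records posted above, yields no intervals at all, which is why the splice files come out empty for such data.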

To check which aligner was used, you can run the command "samtools view -H XXX.bam"; the last few lines of the header usually contain information about the aligner.
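The program records in a BAM header are the lines that start with @PG. A minimal sketch for pulling them out of the header text is shown below; the function name and the sample header are hypothetical, and a header may contain no @PG lines at all, in which case the list is empty.

```python
def programs_in_header(header_lines):
    """Return one dict per @PG header line, keyed by tag (ID, PN, VN, CL, ...)."""
    programs = []
    for line in header_lines:
        if not line.startswith("@PG"):
            continue
        record = {}
        for field in line.rstrip("\n").split("\t")[1:]:
            # Split on the first colon only; command lines (CL) may contain colons.
            tag, _, value = field.partition(":")
            record[tag] = value
        programs.append(record)
    return programs


if __name__ == "__main__":
    # Hypothetical header, made up for this example.
    sample_header = [
        "@HD\tVN:1.4\tSO:coordinate\n",
        "@SQ\tSN:chr1\tLN:248956422\n",
        "@PG\tID:STAR\tPN:STAR\tVN:2.7.3a\tCL:STAR --runThreadN 4\n",
    ]
    for program in programs_in_header(sample_header):
        print(program.get("ID"), program.get("PN"), program.get("CL"))
```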

@Akazhiel
Author

There's no info about the aligner in the header. The BAM files were produced by running RSEM on the FASTQ files, and RSEM uses bowtie as its default aligner. That's what led me to think bowtie was used, since I have no other information.

@mourisl
Collaborator

mourisl commented Oct 18, 2020

Thank you. Bowtie can't generate intron information in the BAM files. My guess is that RSEM uses the local alignment option in bowtie, so for a read spanning an intron it would report only one of the anchor exons. Could you please run RSEM with STAR, and then sort the bam files?

RSEM also has an option for HISAT, though I think it will align the reads to the transcriptome sequence instead of the genome sequence.

@Akazhiel
Author

I would love to do that, but unfortunately I don't have access to the FASTQ files or to the computational resources needed to align so many samples.

@sagnikbanerjee15

Hello,

I am experiencing a very similar problem. I am using only 15 bam files. Out of those 15 bam files, 6 have very few reads (~100K). All the files have some spliced reads (checked using samtools). psiclass fails to generate an assembly and fails on the junc step. This is the error that I am getting:

sh: line 1: 226272 Segmentation fault      (core dumped) /lustre/project/maizegdb/sagnik/FINDER/lib/psiclass_terminal_exon_length_modified/psiclass/trust-splice FINDER_test_ARATH/assemblies_psiclass_modified/combined/splice/psiclass_output_splice_0.list FINDER_test_ARATH/alignments/SRR8422200_for_psiclass.bam > FINDER_test_ARATH/assemblies_psiclass_modified/combined/splice/psiclass_output_bam_0.trusted_splice

I have checked the contents of the file FINDER_test_ARATH/alignments/SRR8422200_for_psiclass.bam and nothing out of the ordinary stood out. I am not sure why the computation fails for this file. I have rerun psiclass several times, and each time the same error occurs for the same file (in fact, it was the first file in the list of bam files). I reran psiclass without that sample and the same error came up for another file. Could the problem be caused by the small number of samples?

I moved all my alignment files to another machine and reran the entire process. This time I got a different error.

sh: line 1: 208261 Segmentation fault      (core dumped) /work/LAS/rpwise-lab/sagnik/finder/lib/psiclass_terminal_exon_length_modified/psiclass/classes -p 10 --primaryParalog --lb FINDER_test_ARATH/assemblies_psiclass_modified/fofn -s FINDER_test_ARATH/assemblies_psiclass_modified/combined/subexon/psiclass_output_subexon_combined.out -o FINDER_test_ARATH/assemblies_psiclass_modified/combined/psiclass_output > FINDER_test_ARATH/assemblies_psiclass_modified/combined/psiclass_output_classes.log

Could you please look into this?

Thank you.

@sagnikbanerjee15

Quick update. I used some other samples and this time it worked without a glitch. I have found that PsiCLASS behaves rather erratically with some samples. It might be a good idea to explore this in depth.

Thanks.

@mourisl
Collaborator

mourisl commented Dec 30, 2020

I suspect that some samples have no intron information and that one of the PsiCLASS modules fails because of this. I'm currently testing it.

@sagnikbanerjee15

I actually checked all the files and all of them had some alignments with intron info.

@mourisl
Collaborator

mourisl commented Dec 30, 2020

I think I've found and fixed the bug. It seems it can cause a crash even when there are some introns, as you mentioned. Could you please give it a try? Thank you.

@sagnikbanerjee15

Thank you. I tried the updated files and this time it worked without a glitch. Thanks.

@mourisl mourisl closed this as completed Jan 3, 2021
@sagnikbanerjee15

Hello,

I tried it again, this time with 3 RNA-Seq samples, and I got the same segmentation fault again. Could you please look into it?

Thank you.


3 participants