Is XS tag indispensable for strand-specific RNA-seq library? #25

Open
Pentayouth opened this issue May 2, 2022 · 17 comments

@Pentayouth

Dear author,

I would like to compare the assemblies produced by PsiCLASS and StringTie.

In StringTie, the user can specify --rf or --fr for a strand-specific RNA-seq library instead of outputting the XS tag in the STAR alignment step. So I didn't use --outSAMstrandField intronMotif, and thus my BAM files do not have the XS tag.

Would such BAM files affect the PsiCLASS assembly? Is adding the XS tag to the BAM output indispensable regardless of the strandedness of the experiment? Is there any workaround that avoids rerunning the time-consuming STAR alignment step (I have hundreds of samples)?

Best regards,
Wang

@mourisl
Collaborator

mourisl commented May 2, 2022

We have provided the program "addXS" in the package. It adds the "XS" field by checking the donor/acceptor motifs. The command is
"samtools view -h in.bam | ./addXS reference_genome.fa | samtools view -bS - > out.bam"

But this still takes a while to generate, because it needs to decompress and compress the BAM file. I can add the strand-specific feature, and it should not take long.
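For hundreds of samples, the addXS pipeline above could be generated once per BAM file. The following is a minimal sketch, not part of PsiCLASS: it assumes a list file with one BAM path per line (such as the bam.list passed to --lb), paths without spaces, and a hypothetical ".xs.bam" output suffix. It only prints the commands so they can be inspected first.

```shell
# Print one addXS pipeline per BAM listed in a file (one path per line).
# Nothing is executed here; the ".xs.bam" output name is an assumption.
print_addxs_commands() {
    list=$1      # file with one BAM path per line
    genome=$2    # reference genome FASTA
    while IFS= read -r bam || [ -n "$bam" ]; do
        [ -n "$bam" ] || continue          # skip blank lines
        out="${bam%.bam}.xs.bam"           # hypothetical output name
        printf 'samtools view -h %s | ./addXS %s | samtools view -bS - > %s\n' \
            "$bam" "$genome" "$out"
    done < "$list"
}
```

Running `print_addxs_commands bam.list reference_genome.fa` prints the commands; once they look right, the output can be piped to `sh` or handed to a job scheduler.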

@mourisl
Collaborator

mourisl commented May 2, 2022

Just to confirm: do all the samples use the same stranded library type, or is it a mixture? Thank you.

@Pentayouth
Author

All the samples use the same stranded library type.

@mourisl
Collaborator

mourisl commented May 3, 2022

Thanks for the information. I've added the option --stranded to psiclass in the git branch "stranded". Could you please check out the branch and test whether PsiCLASS generates reasonable results? If so, I will merge these updates into the master branch. You can specify the library strandedness through the option, e.g. "--stranded rf" or "--stranded fr". Thank you!
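For reference, the rf and fr conventions can be expressed in terms of SAM flag bits. The sketch below is not PsiCLASS's code; it assumes the StringTie convention (rf = fr-firststrand, fr = fr-secondstrand) and treats reads without the second-mate bit as mate 1. The function name is hypothetical.

```shell
# Infer the transcript strand of one alignment from its SAM FLAG and the
# library type ("rf" or "fr"). Illustrative only, not PsiCLASS internals.
transcript_strand() {
    flag=$1
    lib=$2
    # Bit 0x10 (16): read aligned to the reverse strand.
    if [ $(( $flag & 16 )) -ne 0 ]; then read_strand=-; else read_strand=+; fi
    # Bit 0x80 (128): read is the second mate of the pair.
    if [ $(( $flag & 128 )) -ne 0 ]; then mate=2; else mate=1; fi
    case "$lib:$mate:$read_strand" in
        fr:1:+|fr:2:-|rf:1:-|rf:2:+) echo + ;;
        fr:1:-|fr:2:+|rf:1:+|rf:2:-) echo - ;;
        *) echo "unknown library type: $lib" >&2; return 1 ;;
    esac
}
```

For example, `transcript_strand 83 rf` (first mate, reverse strand) prints `+`, while `transcript_strand 99 rf` (first mate, forward strand) prints `-`.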

@Pentayouth
Author

The branch didn't work properly. I ran:
/public/home/lijing/wangzw/resource/bins/psiclass/psiclass --lb bam.list -p 1 --stranded rf
which threw an error:

$ /public/home/wang/resource/bins/psiclass/psiclass --lb bam.list -p 1 --stranded rf
sh: /public/home/wang/resource/bins/psiclass/samtools-0.1.19/samtools: No such file or directory
Found mate read id index suffix(.1 or /1). Calling "--mateIdx 1" option. If this is a false calling, please use "--mateIdx 0".
/public/home/wang/resource/bins/psiclass/junc /public/home/wang/subject/star_new/N1/N1.2pass.Aligned.sortedByCoord.out.bam -a  --stranded rf --hasMateIdSuffix > ./splice/psiclass_bam_0.raw_splice
sh: /public/home/wang/resource/bins/psiclass/junc: No such file or directory
Terminated

@mourisl
Collaborator

mourisl commented May 3, 2022

It seems the programs junc and samtools are not compiled. Could you please run "make" to generate those executables? Thank you.
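A quick check for this situation, as a minimal sketch: it only tests the two executables named in the error log above (junc and samtools-0.1.19/samtools) and assumes they live directly under the PsiCLASS directory, as the log paths suggest. Other executables built by make are not covered, and the function name is hypothetical.

```shell
# Report which of the helper executables from the error log are missing
# or not executable in a PsiCLASS checkout. Returns non-zero if any is.
check_psiclass_build() {
    dir=$1       # path to the PsiCLASS directory
    missing=0
    for exe in junc samtools-0.1.19/samtools; do
        if [ ! -x "$dir/$exe" ]; then
            echo "missing: $dir/$exe (run make in $dir)" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_psiclass_build /path/to/psiclass` before a long run lists anything that still needs to be built.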

@Pentayouth
Author

I'm sorry for forgetting the make step.
Now psiclass is working properly. In my experience the whole process takes 3-4 days using 23 threads; I will give you feedback then.
I really appreciate your continuous support of the program.

@Pentayouth
Author

Pentayouth commented May 5, 2022

I ran

/public/home/lijing/wangzw/resource/bins/psiclass/psiclass \
--lb bam.list \
-p 24 \
--stranded rf

The gffcompare results for psiclass_vote.gtf look odd: specificity is low even at the intron level, which is abnormal.
[image]
Below is the StringTie merge result:
[image]

I checked in IGV and found that the software placed assemblies on the opposite strand.
[image]
[image]
[image]

@Pentayouth
Author

Pentayouth commented May 5, 2022

By the way, I checked the library strandedness using RSeQC:

samtools view -Sbh N1_WTS.bam chr22 > chr22.old.bam
infer_experiment.py \
-r gencode.v24.chr_patch_hapl_scaff.annotation.12.bed \
-i chr22.old.bam

the result is:

# This is PairEnd Data
# Fraction of reads failed to determine: 0.0515
# Fraction of reads explained by "1++,1--,2+-,2-+": 0.0318
# Fraction of reads explained by "1+-,1-+,2++,2--": 0.9167

According to this figure (from RSeQC):
[image]

my library is 1+-,1-+,2++,2--, which means it is fr-firststrand (aka RF):
[image]

So I used --rf in StringTie and --stranded rf in PsiCLASS.
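The reading of the infer_experiment.py output above can be sketched as a small filter. This is illustrative only: the 0.8 cutoff is an arbitrary assumption, the function name is hypothetical, and only the paired-end output format shown above is handled.

```shell
# Read paired-end infer_experiment.py output on stdin and print "fr", "rf",
# or "unstranded-or-mixed". The 0.8 threshold is an assumed cutoff.
infer_library_type() {
    awk -F': ' '
        BEGIN { fr = 0; rf = 0 }
        index($0, "1++,1--,2+-,2-+") { fr = $2 + 0 }   # fr-secondstrand (FR)
        index($0, "1+-,1-+,2++,2--") { rf = $2 + 0 }   # fr-firststrand (RF)
        END {
            if (rf >= 0.8)      print "rf"
            else if (fr >= 0.8) print "fr"
            else                print "unstranded-or-mixed"
        }
    '
}
```

Piping the RSeQC output into it, e.g. `infer_experiment.py -r annotation.bed -i chr22.old.bam | infer_library_type`, prints `rf` for the fractions reported above.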

@mourisl
Collaborator

mourisl commented May 5, 2022

Thank you for showing the details. It seems some of the introns are on the right strand while most of them are not. In my test data (which is not even from a stranded library), PsiCLASS and StringTie agree on the strand. I'll look into this issue by creating a better debugging example. If the chr22.bam file is small, I would appreciate it if you could share the file with me. Thank you!

@Pentayouth
Author

OK, I would like to share 3 normal BAM files covering chr22 with you; all are from stranded libraries. Could you please provide an email address?

@mourisl
Collaborator

mourisl commented May 5, 2022

Yes, you can use the email [email protected] . One bam file would be sufficient. Thank you!

@Pentayouth
Author

Please check the email for the download link.

@mourisl
Collaborator

mourisl commented May 6, 2022

Thank you for providing the test examples! I think I've fixed this issue. Could you pull the new branch, recompile PsiCLASS and give it a try? Thank you for your patience and help.

@Pentayouth
Author

Thank you. I will give you feedback.

@Pentayouth
Author

The gffcompare result looks plausible this time. Thank you very much.
[image]

@mourisl
Collaborator

mourisl commented May 7, 2022

Thank you! I will merge this branch to master and release a new version.
