Un-identified isotypes #1054

Open
4 tasks
ligalaizik opened this issue Mar 23, 2023 · 15 comments

@ligalaizik

Checklist before submitting the issue:

  • The issue is strongly related to the MiXCR software
  • The issue can be reproduced with the most recent version of MiXCR
  • There is no answer to the question in the official documentation and there is no duplicate issue in the bug tracker
  • Inspection of raw alignments with exportAlignmentsPretty shows that data has the expected architecture, and sample preparation artefacts are not the reason of the problem (if this is the matter of the issue)

Expected Result

A description of what you wanted to happen

Actual Result

A description of what actually happened

Exact MiXCR commands

Paste here exact MiXCR commands you used (enclose it with single ` for better formatting)

MiXCR report files

Paste here content of the report files produced by MiXCR (enclose it with triple ``` for better formatting)

@ligalaizik
Author

Hi, I used this command to run MiXCR on my BCR-seq data:
mixcr align \
    -OjParameters.parameters.floatingRightBound=false \
    -OcParameters.parameters.floatingLeftBound=true \
    -s mmu -r report_232.txt \
    M2-D1-14-232_S3_L001_R1_001.fastq.gz M2-D1-14-232_S3_L001_R2_001.fastq.gz \
    output_align_232.vdjca

I added these parameters after I realized that the isotype was not identified for many of the sequences in my data.
Even with those parameters, 26% of the sequences still have an unidentified isotype, and these unidentified sequences are IgGs. (I took the raw sequences from the FASTQ files and translated them with the ExPASy Translate tool; those sequences have a C region that starts with the amino acids KTT, which is the start of the IgG isotypes.)
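
The manual check described above (translate a raw read and look for KTT) could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, only the three forward reading frames are scanned (not the reverse complement), and treating KTT as an IgG marker is the assumption stated in this comment, not something MiXCR itself reports.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, frame=0):
    """Translate a nucleotide sequence starting at the given frame offset.

    Codons containing anything other than A/C/G/T (e.g. N) become 'X'.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def frames_with_motif(seq, motif="KTT"):
    """Return the forward frames (0, 1, 2) whose translation contains motif."""
    return [frame for frame in range(3) if motif in translate(seq, frame)]
```

Running `frames_with_motif` over the reads that lack a C hit gives a quick count of how many carry the motif in any forward frame; a hit is only a hint that the read may contain an IgG constant region, since a three-residue motif can also occur by chance.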

Is there any way we can fix this? This high number of unidentified isotypes is strongly affecting our analysis.

Thanks,
Ligal.

@ligalaizik ligalaizik changed the title hi, i used this code to run mixcr on my BCR-seq data : Un-identified isotypes Mar 23, 2023
@mizraelson
Member

Hi, can you please share an example of such a raw read sequence?

@ligalaizik
Author

ligalaizik commented Mar 26, 2023 via email

@mizraelson
Member

Hi, I just tried it, and MiXCR did align the C gene for this pair of reads and assigned the IGHG1 gene. Can you please share the report, the exact commands you used (including the export), and the MiXCR version you use, so I can reproduce it on our end?

@PoslavskySV
Member

Just in case, you can check it with VDJ.online:

https://vdj.online/align-result/DTRKMUWTANFHRDNXNOHVGTFPDYHLABJFULWHKUZG

@ligalaizik
Author

ligalaizik commented Mar 27, 2023 via email

@mizraelson
Member

mizraelson commented Mar 30, 2023

Hi, unfortunately I can't reproduce the issue. The new MiXCR seems to work fine with this pair of reads. I would recommend updating to the latest version. If you need any help with using the new version in your old pipelines, let me know; we can help with column names, etc.

@ligalaizik
Author

ligalaizik commented Mar 30, 2023 via email

@mizraelson
Member

mizraelson commented Apr 11, 2023

Hi,
So with the latest version of MiXCR (v4.3.2) I use the following command:

mixcr align \
    --preset generic-bcr-amplicon \
    --species mmu \
    --floating-left-alignment-boundary \
    --rigid-right-alignment-boundary C \
    --rna \
    input_R1.fastq.gz input_R2.fastq.gz  \
    result.vdjca

Notice that here I use the generic-bcr-amplicon preset, which uses the kAligner2 aligner dedicated to B cells. Your original command used kAligner1 by default, which was designed for T cells.

Then,

mixcr exportAlignments -f \
    --drop-default-fields \
    -nFeature {FR1Begin:FR4End} \
    -targets \
    -vHit -dHit -jHit -cHit \
    -vAlignment -dAlignment -jAlignment -cAlignments \
    -nFeature FR1 -nFeature CDR1 -nFeature FR2 \
    -nFeature CDR2 -nFeature FR3 -nFeature CDR3 \
    -nFeature FR4 \
    -aaFeature FR1 -aaFeature CDR1 -aaFeature FR2 \
    -aaFeature CDR2 -aaFeature FR3 -aaFeature CDR3 \
    -aaFeature FR4 \
    result.vdjca \
    alignments.tsv

The command above returns the same columns as the one you used. Practically all parameters are the same, except for --drop-default-fields, which overrides the default set of columns so that you only get the ones you specified.
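
To compare runs, the share of alignments without an assigned C gene can be measured directly from the exported table. The sketch below is one way to do that; the function name is hypothetical, and the column name `bestCHit` is an assumption about what `-cHit` produces in your MiXCR version, so check the header of your `alignments.tsv` and pass the actual name if it differs.

```python
import csv
import io


def fraction_without_c_hit(tsv_text, c_column="bestCHit"):
    """Return the fraction of rows whose C-hit column is empty or absent.

    c_column is an assumed header name; adjust it to match your export.
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if not (row.get(c_column) or "").strip())
    return missing / len(rows)


# Typical use (file name taken from the export command above):
#     with open("alignments.tsv", newline="") as handle:
#         print(fraction_without_c_hit(handle.read()))
```

Computing this number for the old and the new alignment of the same sample shows whether switching presets actually reduces the unidentified fraction.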

@ligalaizik
Author

ligalaizik commented Apr 13, 2023 via email

@mizraelson
Member

Hi, for the pair of reads you provided earlier, I used the commands listed above and the isotype was identified correctly.

image

If you can share a sample file (or a part of it, if it's too big) where you see the issue, I can investigate further.

@ligalaizik
Author

ligalaizik commented Apr 16, 2023 via email

@mizraelson
Member

Hi, I don't see any attached files. Maybe you can send them by email: [email protected]

@ligalaizik
Author

ligalaizik commented Jun 29, 2023 via email

@mizraelson
Member

Hi,
To export the C gene sequence, you can use the CRegion gene feature.
E.g.:
-aaFeature CRegion -nFeature CRegion
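
For a scripted pipeline, the two flags above can be added to an export call like this. It is a sketch only: the helper name is hypothetical, it uses just the options already shown in this thread, and this exact combination has not been run here. Whether CRegion is exported non-empty also depends on how much of the constant region your reads cover.

```python
import shlex


def export_args_with_c_region(input_vdjca, output_tsv):
    """Build an exportAlignments argument list that includes the C region.

    Only flags quoted earlier in this thread are used.
    """
    return [
        "mixcr", "exportAlignments", "-f",
        "--drop-default-fields",
        "-cHit",
        "-nFeature", "CRegion",
        "-aaFeature", "CRegion",
        input_vdjca, output_tsv,
    ]


# Render the command for inspection; pass the list to subprocess.run to execute it.
command_line = shlex.join(export_args_with_c_region("result.vdjca", "alignments.tsv"))
```

Keeping the arguments as a list avoids shell quoting issues with file names that contain spaces.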
