Postanalysis summary statistics not fully reproducible manually #1850

Open
vincentwalter opened this issue Nov 12, 2024 · 0 comments

Expected Result

I expect the CDR3 metrics exported by postanalysis to be reproducible manually.

Actual Result

I can reproduce the length summary statistics, but not the biochemical properties (for example, charge and N2Volume).

Exact MiXCR commands

Using MiXCR v4.7.0 (built Wed Aug 07 21:19:48 CEST 2024; rev=976ba14139; branch=no_branch; host=fv-az1019-185), I exported the clones manually and ran postanalysis like this:

mixcr exportClones -f \
  --chains IGH \
  --filter-stops \
  --filter-out-of-frames \
  --drop-default-fields \
  -uniqueTagCount Molecule \
  -aaFeature CDR3 -aaLength CDR3 \
  -nFeature CDR3 -nLength CDR3 \
  -baseBiochemicalProperties CDR3 \
  example.clns \
  example_igh_manual.tsv

mixcr postanalysis individual \
  --default-downsampling none \
  --default-weight-function UMI \
  --only-productive \
  --chains IGH \
  example.clns \
  postanalysis/example.json.gz

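I then compared the two outputs in R version 4.4.2.
The UMI-weighted mean CDR3 lengths (nt and aa) match the postanalysis values to the displayed precision, but the biochemical properties do not: charge is -0.0068 manually versus -0.0386 from postanalysis, and N2Volume is 0.643 versus 2.978.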

manual <- read.delim("example_igh_manual.tsv")

postan <- read.delim("postanalysis/example.cdr3metrics.IGH.tsv")
#> Warning in read.table(file = file, header = header, sep = sep, quote = quote, :
#> incomplete final line found by readTableHeader on
#> 'postanalysis/example.cdr3metrics.IGH.tsv'

#----CDR3 Length----
weighted.mean(manual$nLengthCDR3, manual$uniqueMoleculeCount)
#> [1] 49.68081
postan$Length.of.CDR3..nt
#> [1] 49.68081

weighted.mean(manual$aaLengthCDR3, manual$uniqueMoleculeCount)
#> [1] 16.56027
postan$Length.of.CDR3..aa
#> [1] 16.56027

#----Biochemical Properties----
weighted.mean(manual$CDR3Charge, manual$uniqueMoleculeCount)
#> [1] -0.006800394
postan$Charge.of.CDR3
#> [1] -0.03862589

weighted.mean(manual$CDR3N2Volume, manual$uniqueMoleculeCount)
#> [1] 0.6432407
postan$N2Volume.of.CDR3
#> [1] 2.978124
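
Since the lengths reproduce, the clone set and the UMI weights appear to be the same in both outputs, so the difference presumably lies in how the per-clone property values are computed or aggregated. One way to narrow this down is to compare a few candidate aggregations against the reported value. The sketch below is only an illustration: the data frame is made-up toy data that mimics the exported columns, `candidate_means` and `compare_to_postanalysis` are hypothetical helper names, and the candidate aggregations are guesses rather than anything MiXCR is known to do.

```r
# Toy data only: these values are made up and merely mimic the columns
# produced by the exportClones command above.
manual_toy <- data.frame(
  uniqueMoleculeCount = c(10, 5, 1),
  aaLengthCDR3        = c(15, 17, 20),
  CDR3Charge          = c(-1, 0, 2)
)

# Candidate aggregations of one per-clone column. Which of these (if any)
# postanalysis uses is NOT known; they are hypotheses to test.
candidate_means <- function(df, column, weight = "uniqueMoleculeCount") {
  x   <- df[[column]]
  w   <- df[[weight]]
  len <- df[["aaLengthCDR3"]]
  c(
    umi_weighted        = weighted.mean(x, w),        # what the manual check does
    unweighted          = mean(x),                    # every clone counts once
    umi_weighted_per_aa = weighted.mean(x / len, w)   # value scaled by CDR3 aa length
  )
}

# Difference between each candidate and the value reported by postanalysis.
compare_to_postanalysis <- function(df, column, reported) {
  cand <- candidate_means(df, column)
  data.frame(
    aggregation = names(cand),
    value       = unname(cand),
    diff        = unname(cand) - reported
  )
}

# `reported` is an arbitrary toy value here.
res <- compare_to_postanalysis(manual_toy, "CDR3Charge", reported = -0.5)
print(res)
```

With the real data, `manual_toy` would be replaced by `manual` and `reported` by, for example, `postan$Charge.of.CDR3`. A candidate whose `diff` is close to zero would hint at the aggregation used. If none matches, the per-clone values themselves are probably computed differently.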