added example
Yossi Farjoun authored and jmarshall committed Nov 13, 2023
1 parent 3dca8b8 commit 4e4bae4
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions VCFv4.4.tex
@@ -582,7 +582,7 @@ \subsubsection{Genotype fields}
\end{itemize}
\item HQ (Integer): Haplotype qualities, two comma separated phred qualities.
-\item LAA
+\item LAA is a sorted list of $n$ distinct integers, where $0 \le n \le \left|\mathrm{ALT}\right|$, giving the (1-based) indices within ALT of the alleles that are observed in the sample.
In callsets with many samples, sites may grow to include numerous alternate alleles at the same POS.
Usually, few of these alleles are actually observed in any one sample, but each genotype must supply fields like PL and AD for all of the alleles---a very inefficient representation as PL's size is quadratic in the allele count.
Similarly, in rare sites, which can be the bulk of the sites, the vast majority of the samples are reference.
@@ -611,9 +611,11 @@ \subsubsection{Genotype fields}
4&G&A,T,\textless*\textgreater& LAA:LGT:LAD:LPL& :0/0:30:0\\
4&G&A,T,\textless*\textgreater& GT:AD:PL& 0/0:30,.,..:0,.,.,.,.,.,.,.,.,.,.,.,.,.,.\\
\end{tabular}
-\item LAD: See LAA
-\item LGT: See LAA
-\item LPL: See LAA
+\item LAD: A list of $n+1$ integers giving read depths (as per AD) for the REF allele and each of the local alleles listed in LAA.
+\item LGT: The genotype, encoded as allele indexes separated by either $/$ or $\mid$, as with GT. However, the indexes refer to the list consisting of REF followed by the ALT alleles referenced by LAA.
+For example, if LAA is 2,3, then LGT=0/2 is equivalent to GT=0/3 and LGT=1/2 is equivalent to GT=2/3 (see example above).
+\item LPL: A list of ${n+\mathrm{Ploidy} \choose \mathrm{Ploidy}}$ integers giving phred-scaled genotype likelihoods (rounded to the closest integer, as per PL) for all possible genotypes over the alleles consisting of REF and the local alleles listed in LAA.
+The precise ordering is defined in the GL paragraph.
\item MQ (Integer): RMS mapping quality, similar to the version in the INFO field.
\item PL (Integer): The phred-scaled genotype likelihoods rounded to the closest integer, and otherwise defined in the same way as the GL field.
\item PP (Integer): The phred-scaled genotype posterior probabilities rounded to the closest integer, and otherwise defined in the same way as the GP field.
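
The local-allele encoding in this hunk can be illustrated with a small sketch. It is not part of the commit or of any VCF library. The helper names `local_to_global_gt` and `lpl_length` are hypothetical, and the sketch assumes that local index 0 is REF and that local index i (for i >= 1) refers to the ALT allele numbered by the i-th entry of LAA. The LPL length is computed as a count of unordered genotypes (multisets of size ploidy over the n+1 local alleles), which gives 1 for the empty-LAA row in the example table.

```python
import re
from math import comb


def local_to_global_gt(lgt, laa):
    """Rewrite an LGT string as the equivalent GT string (hypothetical helper)."""
    # Local index 0 is REF; local index i >= 1 is the ALT allele numbered laa[i - 1].
    alleles = [0] + list(laa)
    out = []
    # The capture group keeps the separators, so phasing is preserved.
    for token in re.split(r"([/|])", lgt):
        if token in ("/", "|", ".", ""):
            out.append(token)  # separators and missing alleles pass through
        else:
            out.append(str(alleles[int(token)]))
    return "".join(out)


def lpl_length(n_local_alts, ploidy):
    """Number of unordered genotypes over REF plus n_local_alts local alleles."""
    # Multisets of size `ploidy` drawn from n_local_alts + 1 alleles.
    return comb(n_local_alts + ploidy, ploidy)


print(local_to_global_gt("0/2", [2, 3]))  # 0/3
print(local_to_global_gt("1/2", [2, 3]))  # 2/3
print(lpl_length(2, 2))                   # 6
```

With LAA=2,3 the two calls reproduce the equivalences stated for LGT, and a diploid sample with two local alleles carries six LPL values rather than one per genotype over every ALT allele at the site.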