I was wondering what steps might need to be taken in order to use Locator with GTSeq microhaplotypes? The genotypes would basically be haplotype numbers. There could be multiple haplotypes (e.g., 4 or 5), so I didn't know exactly how that might affect the model's learning.
Thanks!
-Bradley
Unfortunately, Locator's model expects biallelic data only, so it can't handle multiallelic sites. If the haplotypes are generated from sequencing or genotyping data, you could enter the individual variant-level data instead (though this will discard any phase information).
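As a rough illustration of the variant-level route, here is a minimal Python sketch. It assumes you have a lookup from each haplotype number to the bases at the SNPs inside the microhaplotype; the `hap_alleles` table, the function name, and the example haplotypes are all hypothetical and not part of Locator.

```python
def split_microhaplotype(genotype, hap_alleles):
    """Turn one diploid microhaplotype call into unphased per-SNP genotypes.

    genotype    -- pair of haplotype IDs, e.g. (1, 3)
    hap_alleles -- hypothetical lookup {haplotype ID: bases at each SNP}
    """
    first, second = (hap_alleles[h] for h in genotype)
    if len(first) != len(second):
        raise ValueError("haplotypes at one locus must cover the same SNPs")
    # Sorting each pair of bases makes the result independent of phase.
    return [tuple(sorted(pair)) for pair in zip(first, second)]


# Hypothetical locus with three SNPs and four observed haplotypes.
hap_alleles = {1: "ACG", 2: "ACT", 3: "GCT", 4: "GCG"}

snps_13 = split_microhaplotype((1, 3), hap_alleles)
snps_24 = split_microhaplotype((2, 4), hap_alleles)
print(snps_13)
print(snps_24)
```

In this made-up example the two different haplotype pairs reduce to the same per-SNP genotypes, which is the loss of phase information mentioned above.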
For haplotypes with only 2 alleles, you could encode the data as a matrix with samples on rows and haplotypes on columns, with entries giving the count of the minor allele (the less common haplotype), and pass it using the "--matrix" input option. But if most haplotypes are multiallelic, that will throw away much of the information as well.
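A minimal Python sketch of that encoding, assuming complete diploid calls stored as pairs of haplotype numbers. The function names, the nested-dict input, and the tab-delimited layout with a leading `sampleID` column are assumptions for illustration; check the Locator README for the exact file format that "--matrix" expects.

```python
from collections import Counter


def minor_haplotype_counts(calls):
    """Count copies of the minor haplotype per sample at biallelic loci.

    calls -- {sample: {locus: (hap_a, hap_b)}}, no missing calls assumed.
    Returns (loci, rows): the retained loci and {sample: [counts]}.
    """
    samples = sorted(calls)
    all_loci = sorted({locus for s in samples for locus in calls[s]})
    loci, minor = [], {}
    for locus in all_loci:
        counts = Counter(h for s in samples for h in calls[s][locus])
        if len(counts) != 2:
            continue  # drop monomorphic and multiallelic loci
        # Less common haplotype; ties are broken by haplotype label.
        minor[locus] = min(counts, key=lambda h: (counts[h], str(h)))
        loci.append(locus)
    rows = {
        s: [sum(h == minor[locus] for h in calls[s][locus]) for locus in loci]
        for s in samples
    }
    return loci, rows


def to_tsv(loci, rows):
    """Render the matrix as tab-delimited text (layout is an assumption)."""
    lines = ["\t".join(["sampleID"] + loci)]
    for sample, counts in rows.items():
        lines.append("\t".join([sample] + [str(c) for c in counts]))
    return "\n".join(lines) + "\n"


# Hypothetical calls: locA and locB are biallelic, locC has four haplotypes.
calls = {
    "ind1": {"locA": (1, 1), "locB": (1, 2), "locC": (1, 3)},
    "ind2": {"locA": (1, 2), "locB": (2, 2), "locC": (2, 4)},
    "ind3": {"locA": (1, 1), "locB": (2, 2), "locC": (1, 1)},
}
loci, rows = minor_haplotype_counts(calls)
print(to_tsv(loci, rows))
```

Here `locC` is dropped entirely because it has more than two haplotypes, so a panel made up mostly of such loci would lose most of its markers under this encoding.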
It's definitely doable to run a locator-like model on haplotype data though -- this just isn't the implementation for it.
If I were to modify the Locator code to accommodate the microhaplotypes, generally what steps would be involved? I am curious because I have a need for this use case, and am considering modifying the Locator code to do so.
Are the steps generally just modifying the input matrix and the model architecture to accommodate multiple haplotypes? There may of course be a lot of little things to change, but what I am really asking is: given the Locator code, is this feasible, or would it be easier to create a whole new model/code base?