Fatal error, distance matrix is too sparse #16
Comments
This means that the distance matrix does not have enough information to build a guide tree. Try using the -sensitive or -verysensitive option for reseek when making the matrix. If that doesn't work either, the structures are highly divergent and you probably won't get a meaningful MStA.
Thank you, that makes sense! Perhaps off topic then, but I am working with a directory of ~800 models. Do you know of a way to cluster them from the sequences in the .pdb files (rather than going back to the .fasta and manually removing files from the directory)?
Something like this:
usearch -cluster_fast pdb.fasta -id 0.5 -centroids centroids.fasta
grep '^>' centroids.fasta | tr -d '>' > pdbids.txt
for label in `cat pdbids.txt`
do
    mv -v $label.pdb ../some_other_directory
done
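The commands above assume that pdb.fasta already exists and that every FASTA label is exactly the basename of a .pdb file. A rough sketch of how both steps might be scripted follows. The helper names pdb_to_fasta and move_centroids are hypothetical, not part of usearch, reseek or muscle. The sequence extraction assumes standard fixed-column ATOM records, takes one residue per CA atom from the first model only, concatenates all chains, and writes X for any residue name it does not recognise, so check a few records by eye before clustering.

```shell
#!/usr/bin/env bash
# Sketch only: pdb_to_fasta and move_centroids are hypothetical helper names.

# Print one FASTA record for a PDB file, labelled with the file basename.
# Assumptions: fixed-column ATOM records, one residue per CA atom,
# first model only, all chains concatenated, unknown residues become X.
pdb_to_fasta() {
    local pdb="$1"
    local label
    label=$(basename "$pdb" .pdb)
    awk -v label="$label" '
        BEGIN {
            n = split("ALA A ARG R ASN N ASP D CYS C GLN Q GLU E GLY G HIS H ILE I LEU L LYS K MET M PHE F PRO P SER S THR T TRP W TYR Y VAL V", t, " ")
            for (i = 1; i < n; i += 2) one[t[i]] = t[i + 1]
        }
        # Stop at the end of the first model.
        /^ENDMDL/ { exit }
        # Columns 13-16 hold the atom name, 17 the alternate location, 18-20 the residue name.
        substr($0, 1, 6) == "ATOM  " && substr($0, 13, 4) == " CA " && substr($0, 17, 1) ~ /^[ A]$/ {
            res = substr($0, 18, 3)
            seq = seq ((res in one) ? one[res] : "X")
        }
        END { print ">" label; print seq }
    ' "$pdb"
}

# Move the .pdb file of every centroid label into dest_dir.
# The first whitespace-delimited word of each FASTA header is used as the label.
move_centroids() {
    local centroids="$1"
    local src_dir="$2"
    local dest_dir="$3"
    mkdir -p "$dest_dir"
    grep '^>' "$centroids" | sed -e 's/^>//' -e 's/[[:space:]].*$//' |
    while IFS= read -r label; do
        if [ -f "$src_dir/$label.pdb" ]; then
            mv -v -- "$src_dir/$label.pdb" "$dest_dir/"
        else
            echo "warning: no .pdb file for label '$label'" >&2
        fi
    done
}

# Example usage (paths are placeholders):
#   for f in *.pdb; do pdb_to_fasta "$f"; done > pdb.fasta
#   usearch -cluster_fast pdb.fasta -id 0.5 -centroids centroids.fasta
#   move_centroids centroids.fasta . ../some_other_directory
```

Reading the labels line by line and quoting the filenames keeps the loop from breaking on labels that contain spaces or shell metacharacters, and a label without a matching file produces a warning instead of a failed mv.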
Thank you for the suggestion!
Btw, the version of muscle5 I am running was installed with conda.
I did as you suggested above, but still had the following error:
(muscle) ***@***.*** ~]$ muscle -super7 ThyAs.mega -distmxin ThyAs.distmx -reseek -output ThyAs.afa
muscle 5.3.linux64 [-] 1584Gb RAM, 56 cores
Built Nov 11 2024 08:05:12
(C) Copyright 2004-2021 Robert C. Edgar.
https://drive5.com
[super7 ThyAs.mega]
WARNING: Duplicate sequences found
Input: 622 seqs, avg length 288, max 642, min 99
00:00 26Mb Reading dist mx (reseek format)...0 pair-wise distances
…---Fatal error---
Distance matrix too sparse
(muscle) ***@***.*** ~]$
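The log above reports 0 pair-wise distances, so one cheap first check before rerunning reseek is whether ThyAs.distmx contains anything at all. The sketch below only counts non-blank lines and makes no assumption about the reseek distance matrix format. The helper name summarize_file is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: summarize_file is a hypothetical helper, not a muscle or reseek command.

# Report whether a file is missing or empty, otherwise count its non-blank lines.
summarize_file() {
    local f="$1"
    if [ ! -s "$f" ]; then
        echo "$f: missing or empty"
        return 1
    fi
    awk -v name="$f" 'NF > 0 { n++ } END { print name ": " (n + 0) " non-blank lines" }' "$f"
}

# Example usage:
#   summarize_file ThyAs.distmx
#   head -n 5 ThyAs.distmx
```

If the file is empty or very short, the matrix itself is the problem and the -sensitive or -verysensitive suggestion applies. If it holds many lines and muscle still reads 0 distances, comparing its first few lines against the labels in the input file may be a reasonable next step.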
How can I rectify output from command:
muscle -super7 ThyAs.mega -distmxin ThyAs.distmx -reseek -output ThyAs.afa
producing
---Fatal error---
Distance matrix too sparse
?